Starting with version 1.2, cubature
now uses Rcpp
. Also, version 1.3 uses the newer version (1.0.2) of Steven G. Johnson’s hcubature
and pcubature
routines, including the vectorized interface.
Per the documentation, use of pcubature
is advisable only for smooth integrands in dimesions up to three at most. In fact, the pcubature
routines perform significantly worse than the vectorized hcubature
in inappropriate cases. So when in doubt, you are better off using hcubature
.
The main point of this note is to examine the difference vectorization makes. My recommendations are below in the summary section.
Our harness will provide timing results for hcubature
, pcubature
(where appropriate) and R2Cuba
calls. We begin by creating a harness for these calls.
loadedSuggested <- c(benchr = FALSE, R2Cuba = FALSE)
if (requireNamespace("benchr", quietly = TRUE)) {
loadedSuggested["benchr"] <- TRUE
}
if (requireNamespace("R2Cuba", quietly = TRUE)) {
loadedSuggested["R2Cuba"] <- TRUE
}
library(cubature)
harness <- function(which = NULL,
f, fv, lowerLimit, upperLimit, tol = 1e-3, times = 20, ...) {
fns <- c(hc = "Non-vectorized Hcubature",
hc.v = "Vectorized Hcubature",
pc = "Non-vectorized Pcubature",
pc.v = "Vectorized Pcubature")
if (loadedSuggested["R2Cuba"]) {
fns <- c(fns, cc = "R2Cuba::cuhre")
cc <- function() R2Cuba::cuhre(ndim = ndim, ncomp = 1, integrand = f,
lower = lowerLimit, upper = upperLimit,
flags = list(verbose = 0, final = 1),
rel.tol = tol,
max.eval = 10^6,
...)
}
hc <- function() cubature::hcubature(f = f,
lowerLimit = lowerLimit,
upperLimit = upperLimit,
tol = tol,
...)
hc.v <- function() cubature::hcubature(f = fv,
lowerLimit = lowerLimit,
upperLimit = upperLimit,
tol = tol,
vectorInterface = TRUE,
...)
pc <- function() cubature::pcubature(f = f,
lowerLimit = lowerLimit,
upperLimit = upperLimit,
tol = tol,
...)
pc.v <- function() cubature::pcubature(f = fv,
lowerLimit = lowerLimit,
upperLimit = upperLimit,
tol = tol,
vectorInterface = TRUE,
...)
ndim = length(lowerLimit)
if (is.null(which)) {
fnIndices <- seq_along(fns)
} else {
fnIndices <- na.omit(match(which, names(fns)))
}
fnList <- lapply(names(fns)[fnIndices], function(x) call(x))
if (loadedSuggested["benchr"]) {
argList <- c(fnList, times = times, progress = FALSE)
result <- do.call(benchr::benchmark, args = argList)
d <- summary(result)[seq_along(fnIndices), ]
d$expr <- fns[fnIndices]
d
} else {
d <- data.frame(expr = names(fns)[fnIndices], timing = NA)
}
}
We reel off the timing runs.
func <- function(x) sin(x[1]) * cos(x[2]) * exp(x[3])
func.v <- function(x) {
matrix(apply(x, 2, function(z) sin(z[1]) * cos(z[2]) * exp(z[3])), ncol = ncol(x))
}
d <- harness(f = func, fv = func.v,
lowerLimit = rep(0, 3),
upperLimit = rep(1, 3),
tol = 1e-5,
times = 100)
knitr::kable(d, digits = 3, row.names = FALSE)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 100 | 0.002 | 0.003 | 0.003 | 0.003 | 0.003 | 0.005 | 0.283 | 6.17 |
Vectorized Hcubature | 100 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.046 | 1.00 |
Non-vectorized Pcubature | 100 | 0.008 | 0.008 | 0.009 | 0.009 | 0.009 | 0.043 | 0.928 | 20.00 |
Vectorized Pcubature | 100 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.002 | 0.127 | 2.86 |
Using cubature
, we evaluate \[
\int_R\phi(x)dx
\] where \(\phi(x)\) is the three-dimensional multivariate normal density with mean 0, and variance \[
\Sigma = \left(\begin{array}{rrr}
1 &\frac{3}{5} &\frac{1}{3}\\
\frac{3}{5} &1 &\frac{11}{15}\\
\frac{1}{3} &\frac{11}{15} & 1
\end{array}
\right)
\] and \(R\) is \([-\frac{1}{2}, 1] \times [-\frac{1}{2}, 4] \times [-\frac{1}{2}, 2].\)
We construct a scalar function (my_dmvnorm
) and a vector analog (my_dmvnorm_v
). First the functions.
m <- 3
sigma <- diag(3)
sigma[2,1] <- sigma[1, 2] <- 3/5 ; sigma[3,1] <- sigma[1, 3] <- 1/3
sigma[3,2] <- sigma[2, 3] <- 11/15
logdet <- sum(log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values))
my_dmvnorm <- function (x, mean, sigma, logdet) {
x <- matrix(x, ncol = length(x))
distval <- stats::mahalanobis(x, center = mean, cov = sigma)
exp(-(3 * log(2 * pi) + logdet + distval)/2)
}
my_dmvnorm_v <- function (x, mean, sigma, logdet) {
distval <- stats::mahalanobis(t(x), center = mean, cov = sigma)
exp(matrix(-(3 * log(2 * pi) + logdet + distval)/2, ncol = ncol(x)))
}
Now the timing.
d <- harness(f = my_dmvnorm, fv = my_dmvnorm_v,
lowerLimit = rep(-0.5, 3),
upperLimit = c(1, 4, 2),
tol = 1e-5,
times = 10,
mean = rep(0, m), sigma = sigma, logdet = logdet)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 10 | 0.924 | 1.011 | 1.075 | 1.078 | 1.114 | 1.329 | 10.781 | 594.0 |
Vectorized Hcubature | 10 | 0.002 | 0.003 | 0.003 | 0.003 | 0.003 | 0.004 | 0.029 | 1.6 |
Non-vectorized Pcubature | 10 | 0.398 | 0.475 | 0.487 | 0.497 | 0.521 | 0.590 | 4.969 | 270.0 |
Vectorized Pcubature | 10 | 0.001 | 0.001 | 0.002 | 0.002 | 0.002 | 0.003 | 0.018 | 1.0 |
The effect of vectorization is huge. So it makes sense for users to vectorize the integrands as much as possible for efficiency.
Furthermore, for this particular example, we expect mvtnorm::pmvnorm
to do pretty well since it is specialized for the multivariate normal. The good news is that the vectorized versions of hcubature
and pcubature
are quite competitive if you compare the table above to the one below.
library(mvtnorm)
g1 <- function() mvtnorm::pmvnorm(lower = rep(-0.5, m),
upper = c(1, 4, 2), mean = rep(0, m), corr = sigma,
alg = Miwa(), abseps = 1e-5, releps = 1e-5)
g2 <- function() mvtnorm::pmvnorm(lower = rep(-0.5, m),
upper = c(1, 4, 2), mean = rep(0, m), corr = sigma,
alg = GenzBretz(), abseps = 1e-5, releps = 1e-5)
g3 <- function() mvtnorm::pmvnorm(lower = rep(-0.5, m),
upper = c(1, 4, 2), mean = rep(0, m), corr = sigma,
alg = TVPACK(), abseps = 1e-5, releps = 1e-5)
knitr::kable(summary(benchr::benchmark(g1(), g2(), g3(), times = 20, progress = FALSE)),
digits = 3, row.names = FALSE)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
g1() | 20 | 0.001 | 0.003 | 0.003 | 0.003 | 0.003 | 0.006 | 0.057 | 1.03 |
g2() | 20 | 0.001 | 0.002 | 0.003 | 0.003 | 0.003 | 0.005 | 0.052 | 1.01 |
g3() | 20 | 0.001 | 0.002 | 0.003 | 0.003 | 0.003 | 0.004 | 0.052 | 1.00 |
testFn0 <- function(x) prod(cos(x))
testFn0_v <- function(x) matrix(apply(x, 2, function(z) prod(cos(z))), ncol = ncol(x))
d <- harness(f = testFn0, fv = testFn0_v,
lowerLimit = rep(0, 2), upperLimit = rep(1, 2), times = 1000)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 1000 | 0 | 0 | 0 | 0 | 0 | 0.001 | 0.217 | 2.84 |
Vectorized Hcubature | 1000 | 0 | 0 | 0 | 0 | 0 | 0.001 | 0.076 | 1.00 |
Non-vectorized Pcubature | 1000 | 0 | 0 | 0 | 0 | 0 | 0.003 | 0.307 | 4.07 |
Vectorized Pcubature | 1000 | 0 | 0 | 0 | 0 | 0 | 0.001 | 0.146 | 1.93 |
testFn1 <- function(x) {
val <- sum(((1 - x) / x)^2)
scale <- prod((2 / sqrt(pi)) / x^2)
exp(-val) * scale
}
testFn1_v <- function(x) {
val <- matrix(apply(x, 2, function(z) sum(((1 - z) / z)^2)), ncol(x))
scale <- matrix(apply(x, 2, function(z) prod((2 / sqrt(pi)) / z^2)), ncol(x))
exp(-val) * scale
}
d <- harness(f = testFn1, fv = testFn1_v,
lowerLimit = rep(0, 3), upperLimit = rep(1, 3), times = 10)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 10 | 0.015 | 0.016 | 0.018 | 0.020 | 0.019 | 0.047 | 0.205 | 95.70 |
Vectorized Hcubature | 10 | 0.004 | 0.004 | 0.004 | 0.004 | 0.004 | 0.005 | 0.042 | 22.10 |
Non-vectorized Pcubature | 10 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 2.12 |
Vectorized Pcubature | 10 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 1.00 |
testFn2 <- function(x) {
radius <- 0.50124145262344534123412
ifelse(sum(x * x) < radius * radius, 1, 0)
}
testFn2_v <- function(x) {
radius <- 0.50124145262344534123412
matrix(apply(x, 2, function(z) ifelse(sum(z * z) < radius * radius, 1, 0)), ncol = ncol(x))
}
d <- harness(which = c("hc", "hc.v", "cc"),
f = testFn2, fv = testFn2_v,
lowerLimit = rep(0, 2), upperLimit = rep(1, 2), times = 10)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 10 | 0.202 | 0.248 | 0.251 | 0.246 | 0.251 | 0.258 | 2.458 | 4.96 |
Vectorized Hcubature | 10 | 0.043 | 0.050 | 0.051 | 0.050 | 0.051 | 0.057 | 0.503 | 1.00 |
testFn3 <- function(x) prod(2 * x)
testFn3_v <- function(x) matrix(apply(x, 2, function(z) prod(2 * z)), ncol = ncol(x))
d <- harness(f = testFn3, fv = testFn3_v,
lowerLimit = rep(0, 3), upperLimit = rep(1, 3), times = 50)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 50 | 0 | 0 | 0 | 0 | 0 | 0.001 | 0.024 | 4.40 |
Vectorized Hcubature | 50 | 0 | 0 | 0 | 0 | 0 | 0.000 | 0.006 | 1.07 |
Non-vectorized Pcubature | 50 | 0 | 0 | 0 | 0 | 0 | 0.000 | 0.019 | 3.62 |
Vectorized Pcubature | 50 | 0 | 0 | 0 | 0 | 0 | 0.000 | 0.005 | 1.00 |
testFn4 <- function(x) {
a <- 0.1
s <- sum((x - 0.5)^2)
((2 / sqrt(pi)) / (2. * a))^length(x) * exp (-s / (a * a))
}
testFn4_v <- function(x) {
a <- 0.1
r <- apply(x, 2, function(z) {
s <- sum((z - 0.5)^2)
((2 / sqrt(pi)) / (2. * a))^length(z) * exp (-s / (a * a))
})
matrix(r, ncol = ncol(x))
}
d <- harness(f = testFn4, fv = testFn4_v,
lowerLimit = rep(0, 2), upperLimit = rep(1, 2), times = 20)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 20 | 0.007 | 0.008 | 0.008 | 0.008 | 0.008 | 0.013 | 0.167 | 5.92 |
Vectorized Hcubature | 20 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.002 | 0.028 | 1.00 |
Non-vectorized Pcubature | 20 | 0.011 | 0.012 | 0.012 | 0.014 | 0.013 | 0.049 | 0.286 | 8.75 |
Vectorized Pcubature | 20 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.003 | 0.045 | 1.49 |
testFn5 <- function(x) {
a <- 0.1
s1 <- sum((x - 1 / 3)^2)
s2 <- sum((x - 2 / 3)^2)
0.5 * ((2 / sqrt(pi)) / (2. * a))^length(x) * (exp(-s1 / (a * a)) + exp(-s2 / (a * a)))
}
testFn5_v <- function(x) {
a <- 0.1
r <- apply(x, 2, function(z) {
s1 <- sum((z - 1 / 3)^2)
s2 <- sum((z - 2 / 3)^2)
0.5 * ((2 / sqrt(pi)) / (2. * a))^length(z) * (exp(-s1 / (a * a)) + exp(-s2 / (a * a)))
})
matrix(r, ncol = ncol(x))
}
d <- harness(f = testFn5, fv = testFn5_v,
lowerLimit = rep(0, 2), upperLimit = rep(1, 2), times = 20)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 20 | 0.017 | 0.018 | 0.019 | 0.020 | 0.021 | 0.023 | 0.392 | 6.84 |
Vectorized Hcubature | 20 | 0.003 | 0.003 | 0.004 | 0.004 | 0.004 | 0.005 | 0.081 | 1.46 |
Non-vectorized Pcubature | 20 | 0.012 | 0.013 | 0.013 | 0.014 | 0.015 | 0.016 | 0.275 | 4.89 |
Vectorized Pcubature | 20 | 0.002 | 0.003 | 0.003 | 0.003 | 0.003 | 0.004 | 0.057 | 1.00 |
testFn6 <- function(x) {
a <- (1 + sqrt(10.0)) / 9.0
prod( a / (a + 1) * ((a + 1) / (a + x))^2)
}
testFn6_v <- function(x) {
a <- (1 + sqrt(10.0)) / 9.0
r <- apply(x, 2, function(z) prod( a / (a + 1) * ((a + 1) / (a + z))^2))
matrix(r, ncol = ncol(x))
}
d <- harness(f = testFn6, fv = testFn6_v,
lowerLimit = rep(0, 3), upperLimit = rep(1, 3), times = 20)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 20 | 0.010 | 0.011 | 0.011 | 0.011 | 0.012 | 0.013 | 0.225 | 6.21 |
Vectorized Hcubature | 20 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.006 | 0.041 | 1.00 |
Non-vectorized Pcubature | 20 | 0.053 | 0.055 | 0.055 | 0.060 | 0.060 | 0.092 | 1.192 | 31.80 |
Vectorized Pcubature | 20 | 0.007 | 0.008 | 0.008 | 0.008 | 0.009 | 0.011 | 0.167 | 4.52 |
testFn7 <- function(x) {
n <- length(x)
p <- 1/n
(1 + p)^n * prod(x^p)
}
testFn7_v <- function(x) {
matrix(apply(x, 2, function(z) {
n <- length(z)
p <- 1/n
(1 + p)^n * prod(z^p)
}), ncol = ncol(x))
}
d <- harness(f = testFn7, fv = testFn7_v,
lowerLimit = rep(0, 3), upperLimit = rep(1, 3), times = 20)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 20 | 0.022 | 0.022 | 0.024 | 0.025 | 0.026 | 0.030 | 0.497 | 6.89 |
Vectorized Hcubature | 20 | 0.003 | 0.003 | 0.004 | 0.004 | 0.004 | 0.006 | 0.079 | 1.00 |
Non-vectorized Pcubature | 20 | 0.053 | 0.055 | 0.057 | 0.060 | 0.065 | 0.074 | 1.198 | 16.10 |
Vectorized Pcubature | 20 | 0.007 | 0.008 | 0.009 | 0.009 | 0.010 | 0.015 | 0.187 | 2.48 |
I.1d <- function(x) {
sin(4 * x) *
x * ((x * ( x * (x * x - 4) + 1) - 1))
}
I.1d_v <- function(x) {
matrix(apply(x, 2, function(z)
sin(4 * z) *
z * ((z * ( z * (z * z - 4) + 1) - 1))),
ncol = ncol(x))
}
d <- harness(f = I.1d, fv = I.1d_v,
lowerLimit = -2, upperLimit = 2, times = 100)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 100 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.002 | 0.126 | 5.54 |
Vectorized Hcubature | 100 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.027 | 1.13 |
Non-vectorized Pcubature | 100 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.042 | 1.85 |
Vectorized Pcubature | 100 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.023 | 1.00 |
I.2d <- function(x) {
x1 <- x[1]; x2 <- x[2]
sin(4 * x1 + 1) * cos(4 * x2) * x1 * (x1 * (x1 * x1)^2 - x2 * (x2 * x2 - x1) +2)
}
I.2d_v <- function(x) {
matrix(apply(x, 2,
function(z) {
x1 <- z[1]; x2 <- z[2]
sin(4 * x1 + 1) * cos(4 * x2) * x1 * (x1 * (x1 * x1)^2 - x2 * (x2 * x2 - x1) +2)
}),
ncol = ncol(x))
}
d <- harness(f = I.2d, fv = I.2d_v,
lowerLimit = rep(-1, 2), upperLimit = rep(1, 2), times = 100)
knitr::kable(d, digits = 3)
expr | n.eval | min | lw.qu | median | mean | up.qu | max | total | relative |
---|---|---|---|---|---|---|---|---|---|
Non-vectorized Hcubature | 100 | 0.040 | 0.044 | 0.046 | 0.047 | 0.048 | 0.079 | 4.707 | 56.30 |
Vectorized Hcubature | 100 | 0.005 | 0.006 | 0.006 | 0.006 | 0.007 | 0.009 | 0.644 | 7.70 |
Non-vectorized Pcubature | 100 | 0.003 | 0.004 | 0.004 | 0.004 | 0.004 | 0.005 | 0.399 | 4.75 |
Vectorized Pcubature | 100 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.002 | 0.087 | 1.00 |
About the only real modification we have made to the underlying cubature-1.0.2
library is that we use M = 16
rather than the default M = 19
suggested by the original author for pcubature
. This allows us to comply with CRAN package size limits and seems to work reasonably well for the above tests. Future versions will allow for such customization on demand.
Cuba
libraryThe package R2Cuba
provides a suite of cubature and other useful Monte Carlo integration routines linked against version 1.6 of the C library. The authors of R2Cuba
have obviously put a lot of work has into it since it uses C-style R API. This approach also means that it is harder to keep the R package in sync with new versions of the underlying C library. In fact, the Cuba C library has marched on now to version 4.2.
In a matter of a couple of hours, I was able to link the latest version (4.2) of the Cuba libraries with R using Rcpp; you can see it on the Cuba
branch of my Github repo. This branch package builds and installs in R on my Mac and Ubuntu machines and gives correct answers at least for cuhre
. The 4.2 version of the Cuba library also has vectorized versions of the routines that can be gainfully exploited (not implemented in the branch). As of this writing, I have also not yet carefully considered the issue of parallel execution (via fork()
) which might be problematic in the Windows version. In addition, my timing benchmarks showed very disappointing results.
For the above reasons, I decided not to bother with Cuba for now, but if there is enough interest, I might consider rolling Cuba-4.2+
into this cubature
package in the future.
The following is therefore my recommendation.
Vectorize your function. The time spent in so doing pays back enormously. This is easy to do and the examples above show how.
Vectorized hcubature
seems to be a good starting point.
For smooth integrands in low dimensions (\(\leq 3\)), pcubature
might be worth trying out. Experiment before using in a production package.