mvrnorm (from MASS) vs rmvnorm (from mvtnorm)

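I am generating a large amount of data from a multivariate normal distribution for a simulation. Does anyone know which command is more efficient for this: mvrnorm (from the "MASS" package) or rmvnorm (from the "mvtnorm" package)?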

Questions like this are easy to answer by timing the different approaches. Let

library(microbenchmark)
library(MASS)
library(mvtnorm)

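n <- 10000   # number of observations
k <- 50      # number of variables
mu <- rep(0, k)
rho <- 0.2   # common pairwise correlation
Sigma <- diag(k) * (1 - rho) + rho  # unit variances, rho off the diagonal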

so that we have 50 variables with unit variance and pairwise correlation 0.2. Generating 10000 observations, we get

microbenchmark(mvrnorm(n, mu = mu, Sigma = Sigma),
               rmvnorm(n, mean = mu, sigma = Sigma, method = "eigen"),
               rmvnorm(n, mean = mu, sigma = Sigma, method = "svd"),
               rmvnorm(n, mean = mu, sigma = Sigma, method = "chol"),
               times = 100)
# Unit: milliseconds
#                                                    expr      min       lq     mean   median        uq      max neval cld
#                      mvrnorm(n, mu = mu, Sigma = Sigma) 65.04667 73.02912 85.30384 81.70611  92.69137 148.6959   100  a 
#  rmvnorm(n, mean = mu, sigma = Sigma, method = "eigen") 71.14170 81.08311 95.12891 88.84669 100.62174 237.0012   100   b
#    rmvnorm(n, mean = mu, sigma = Sigma, method = "svd") 71.32999 81.30640 93.40939 88.54804 104.00281 208.3690   100   b
#   rmvnorm(n, mean = mu, sigma = Sigma, method = "chol") 71.22712 78.59898 94.13958 89.04653 108.27363 158.7890   100   b

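So mvrnorm appears to perform slightly better here (a median of about 82 ms versus about 89 ms for each of the three rmvnorm methods, which are indistinguishable from one another). When you have a specific application in mind, you should set n, k, and Sigma to values better suited to it.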
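
To make re-running the comparison with other settings easier, the benchmark could be wrapped in a small helper. This is only a sketch: `bench_mvn` and its arguments are hypothetical names, not part of either package, and it assumes the same equicorrelated Sigma as above and only the "chol" method of rmvnorm.

```r
library(microbenchmark)
library(MASS)
library(mvtnorm)

# Hypothetical helper (not from MASS or mvtnorm): time both samplers
# for a given sample size n, dimension k and common correlation rho.
bench_mvn <- function(n, k, rho, times = 20) {
  mu <- rep(0, k)
  Sigma <- diag(k) * (1 - rho) + rho  # unit variances, correlation rho
  microbenchmark(
    mvrnorm = mvrnorm(n, mu = mu, Sigma = Sigma),
    rmvnorm_chol = rmvnorm(n, mean = mu, sigma = Sigma, method = "chol"),
    times = times
  )
}

# Example settings only; replace them with values from your application.
res <- bench_mvn(n = 1000, k = 10, rho = 0.5)
print(res)
```

The timings will depend on your machine and on the linear algebra library R is linked against, so it is worth running this on the system where the simulation will run.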

Since you do not seem to be restricted to these two functions, you could also look into Rcpp alternatives; see, e.g., 1, 2, 3.
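
For orientation, here is a minimal base-R sketch of a Cholesky-based sampler, the kind of computation a compiled version might implement. That such alternatives work this way is an assumption on my part, and `rmvn_chol` is a hypothetical name, not a function from MASS, mvtnorm, or Rcpp.

```r
# Sketch of a Cholesky-based multivariate normal sampler in base R.
# rmvn_chol is a hypothetical name used for illustration only.
rmvn_chol <- function(n, mu, Sigma) {
  k <- length(mu)
  R <- chol(Sigma)                     # upper triangular, t(R) %*% R equals Sigma
  Z <- matrix(rnorm(n * k), nrow = n)  # n x k independent N(0, 1) draws
  sweep(Z %*% R, 2, mu, "+")           # rows of Z %*% R have covariance Sigma
}

set.seed(1)
k <- 5
rho <- 0.2
Sigma <- diag(k) * (1 - rho) + rho
X <- rmvn_chol(10000, mu = rep(0, k), Sigma = Sigma)
round(cov(X), 2)  # sample covariance, should be close to Sigma
```

Note that `chol` requires Sigma to be positive definite, so this sketch does not cover the singular covariance matrices that an eigendecomposition-based sampler can handle.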