尾声 gelman.diag(): "Error in chol.default(W) : the leading minor of order nn is not positive definite"
coda gelman.diag(): "Error in chol.default(W) : the leading minor of order nn is not positive definite"
coda 包中的 gelman.diag()
函数在计算多变量潜在比例缩减因子 (MPSRF) 时抛出错误。
> load("short_mcmc_list.rda")
> niter(short.mcmc.list)
[1] 100
> nvar(short.mcmc.list)
[1] 200
> nchain(short.mcmc.list)
[1] 2
>
> coda::gelman.diag(
short.mcmc.list,
autoburnin = FALSE,
multivariate = TRUE
)
Error in chol.default(W) :
the leading minor of order 199 is not positive definite
这个错误是什么意思?
此问题之前发布在 R coda "The leading minor of order 3 is not positive definite"。主要结论是:"Conclusion: There seems to be something wrong with getting the multivariate estimate of the Gelman-Rubin diagnostic. Setting multivariate = FALSE fixes the problem and outputs a single variable estimate for each variable." 已有 6 年历史,因此答案可能已过时。
在gelman.diag()
中,MPSRF的计算公式为:
> coda::gelman.diag <-
function (x, confidence = 0.95, transform = FALSE, autoburnin = FALSE,
multivariate = TRUE)
{
#A lot of code ...
Niter <- niter(x)
Nchain <- nchain(x)
Nvar <- nvar(x)
#W is the variance matrix of the chain
S2 <- array(sapply(x, var, simplify = TRUE), dim = c(Nvar,
Nvar, Nchain))
W <- apply(S2, c(1, 2), mean)
#A lot of code ...
if (Nvar > 1 && multivariate) {
CW <- chol(W)
#This is max eigenvalue from W^-1*B.
emax <- eigen(
backsolve(CW, t(backsolve(CW, B, transpose = TRUE)), transpose = TRUE),
symmetric = TRUE, only.values = TRUE)$values[1]
}
我编辑了 gelman.diag()
,删除了导致错误的 Cholesky 分解,并将 W 和 B 添加到要返回的列表中。这让我明白了为什么 W
不能进行 Cholesky 分解。
my.gelman.diag <- function(x,
confidence = 0.95,
transform = FALSE,
autoburnin = FALSE,
multivariate = TRUE
){
x <- as.mcmc.list(x)
if (nchain(x) < 2)
stop("You need at least two chains")
if (autoburnin && start(x) < end(x)/2)
x <- window(x, start = end(x)/2 + 1)
Niter <- niter(x)
Nchain <- nchain(x)
Nvar <- nvar(x)
xnames <- varnames(x)
if (transform)
x <- gelman.transform(x)
x <- lapply(x, as.matrix)
S2 <- array(sapply(x, var, simplify = TRUE),
dim = c(Nvar, Nvar, Nchain)
)
W <- apply(S2, c(1, 2), mean)
xbar <- matrix(sapply(x, apply, 2, mean, simplify = TRUE),
nrow = Nvar, ncol = Nchain)
B <- Niter * var(t(xbar))
if (Nvar > 1 && multivariate) { #ph-edits
# CW <- chol(W)
# #This is W^-1*B.
# emax <- eigen(
# backsolve(CW, t(backsolve(CW, B, transpose = TRUE)), transpose = TRUE),
# symmetric = TRUE, only.values = TRUE)$values[1]
emax <- 1
mpsrf <- sqrt((1 - 1/Niter) + (1 + 1/Nvar) * emax/Niter)
} else {
mpsrf <- NULL
}
w <- diag(W)
b <- diag(B)
s2 <- matrix(apply(S2, 3, diag), nrow = Nvar, ncol = Nchain)
muhat <- apply(xbar, 1, mean)
var.w <- apply(s2, 1, var)/Nchain
var.b <- (2 * b^2)/(Nchain - 1)
cov.wb <- (Niter/Nchain) * diag(var(t(s2), t(xbar^2)) - 2 *
muhat * var(t(s2), t(xbar)))
V <- (Niter - 1) * w/Niter + (1 + 1/Nchain) * b/Niter
var.V <- ((Niter - 1)^2 * var.w + (1 + 1/Nchain)^2 * var.b +
2 * (Niter - 1) * (1 + 1/Nchain) * cov.wb)/Niter^2
df.V <- (2 * V^2)/var.V
df.adj <- (df.V + 3)/(df.V + 1)
B.df <- Nchain - 1
W.df <- (2 * w^2)/var.w
R2.fixed <- (Niter - 1)/Niter
R2.random <- (1 + 1/Nchain) * (1/Niter) * (b/w)
R2.estimate <- R2.fixed + R2.random
R2.upper <- R2.fixed + qf((1 + confidence)/2, B.df, W.df) *
R2.random
psrf <- cbind(sqrt(df.adj * R2.estimate), sqrt(df.adj * R2.upper))
dimnames(psrf) <- list(xnames, c("Point est.", "Upper C.I."))
out <- list(psrf = psrf, mpsrf = mpsrf, B = B, W = W) #added ph
class(out) <- "gelman.diag"
return( out )
}
将my.gelman.diag()
应用到short.mcmc.list
:
> l <- my.gelman.diag(short.mcmc.list, autoburnin = FALSE, multivariate = TRUE)
> W <- l$W #Within-sequence variance
> B <- l$B #Between-sequence variance
对 W 的调查表明 W 确实是正定的,但其特征值接近 0,因此它失败了。
> evals.W <- eigen(W, only.values = TRUE)$values
> min(evals.W)
[1] 1.980596e-16
确实,增加公差表明W确实是正定的。
> matrixNormal::is.positive.definite(W, tol = 1e-18)
[1] TRUE
所以实际上,W 接近于线性相关...
> eval <- eigen(solve(W)%*%B, only.values = TRUE)$values[1]
Error in solve.default(W) :
system is computationally singular: reciprocal condition number = 7.10718e-21
所以实际上,删除 W 的最后两列使其现在线性 independent/positive 确定。这说明链内有相关参数,可以减少参数个数。
> evals.W[198]
[1] 9.579362e-05
> matrixNormal::is.positive.definite(W[1:198, 1:198])
[1] TRUE
> chol.W <- chol(W)
W 是马尔可夫链的组内方差,衡量状态中每个值与均值的差异程度。如果 W 接近奇异,则方差很小,因此链变化不大。这是一个缓慢移动的链条。用户应使用跟踪图对此进行调查,并可能减少参数数量。链条也可能太短;如果链更长,则链中的值可能足够不同,以至于 W 是线性独立的。
为了避免函数崩溃,我建议使用purrr::possibly()
代替缺失值而不是抛出一个古老的错误。
> pgd <- purrr::possibly(coda::gelman.diag, list(mpsrf = NA), quiet = FALSE)
> pgd(short.mcmc.list, autoburnin = FALSE, multivariate = TRUE)
Error: the leading minor of order 199 is not positive definite
[1] NA
有关详细信息,请参见 Brooks 和 Gelman,1992 年。
参考:
Gelman, A 和 Rubin, DB (1992) Inference from iterative simulation using multiple sequences, 统计科学, 7, 457-511.
coda 包中的 gelman.diag()
函数在计算多变量潜在比例缩减因子 (MPSRF) 时抛出错误。
> load("short_mcmc_list.rda")
> niter(short.mcmc.list)
[1] 100
> nvar(short.mcmc.list)
[1] 200
> nchain(short.mcmc.list)
[1] 2
>
> coda::gelman.diag(
short.mcmc.list,
autoburnin = FALSE,
multivariate = TRUE
)
Error in chol.default(W) : the leading minor of order 199 is not positive definite
这个错误是什么意思?
此问题之前发布在 R coda "The leading minor of order 3 is not positive definite"。主要结论是:"Conclusion: There seems to be something wrong with getting the multivariate estimate of the Gelman-Rubin diagnostic. Setting multivariate = FALSE fixes the problem and outputs a single variable estimate for each variable." 已有 6 年历史,因此答案可能已过时。
在gelman.diag()
中,MPSRF的计算公式为:
> coda::gelman.diag <-
function (x, confidence = 0.95, transform = FALSE, autoburnin = FALSE,
multivariate = TRUE)
{
#A lot of code ...
Niter <- niter(x)
Nchain <- nchain(x)
Nvar <- nvar(x)
#W is the variance matrix of the chain
S2 <- array(sapply(x, var, simplify = TRUE), dim = c(Nvar,
Nvar, Nchain))
W <- apply(S2, c(1, 2), mean)
#A lot of code ...
if (Nvar > 1 && multivariate) {
CW <- chol(W)
#This is max eigenvalue from W^-1*B.
emax <- eigen(
backsolve(CW, t(backsolve(CW, B, transpose = TRUE)), transpose = TRUE),
symmetric = TRUE, only.values = TRUE)$values[1]
}
我编辑了 gelman.diag()
,删除了导致错误的 Cholesky 分解,并将 W 和 B 添加到要返回的列表中。这让我明白了为什么 W
不能进行 Cholesky 分解。
my.gelman.diag <- function(x,
confidence = 0.95,
transform = FALSE,
autoburnin = FALSE,
multivariate = TRUE
){
x <- as.mcmc.list(x)
if (nchain(x) < 2)
stop("You need at least two chains")
if (autoburnin && start(x) < end(x)/2)
x <- window(x, start = end(x)/2 + 1)
Niter <- niter(x)
Nchain <- nchain(x)
Nvar <- nvar(x)
xnames <- varnames(x)
if (transform)
x <- gelman.transform(x)
x <- lapply(x, as.matrix)
S2 <- array(sapply(x, var, simplify = TRUE),
dim = c(Nvar, Nvar, Nchain)
)
W <- apply(S2, c(1, 2), mean)
xbar <- matrix(sapply(x, apply, 2, mean, simplify = TRUE),
nrow = Nvar, ncol = Nchain)
B <- Niter * var(t(xbar))
if (Nvar > 1 && multivariate) { #ph-edits
# CW <- chol(W)
# #This is W^-1*B.
# emax <- eigen(
# backsolve(CW, t(backsolve(CW, B, transpose = TRUE)), transpose = TRUE),
# symmetric = TRUE, only.values = TRUE)$values[1]
emax <- 1
mpsrf <- sqrt((1 - 1/Niter) + (1 + 1/Nvar) * emax/Niter)
} else {
mpsrf <- NULL
}
w <- diag(W)
b <- diag(B)
s2 <- matrix(apply(S2, 3, diag), nrow = Nvar, ncol = Nchain)
muhat <- apply(xbar, 1, mean)
var.w <- apply(s2, 1, var)/Nchain
var.b <- (2 * b^2)/(Nchain - 1)
cov.wb <- (Niter/Nchain) * diag(var(t(s2), t(xbar^2)) - 2 *
muhat * var(t(s2), t(xbar)))
V <- (Niter - 1) * w/Niter + (1 + 1/Nchain) * b/Niter
var.V <- ((Niter - 1)^2 * var.w + (1 + 1/Nchain)^2 * var.b +
2 * (Niter - 1) * (1 + 1/Nchain) * cov.wb)/Niter^2
df.V <- (2 * V^2)/var.V
df.adj <- (df.V + 3)/(df.V + 1)
B.df <- Nchain - 1
W.df <- (2 * w^2)/var.w
R2.fixed <- (Niter - 1)/Niter
R2.random <- (1 + 1/Nchain) * (1/Niter) * (b/w)
R2.estimate <- R2.fixed + R2.random
R2.upper <- R2.fixed + qf((1 + confidence)/2, B.df, W.df) *
R2.random
psrf <- cbind(sqrt(df.adj * R2.estimate), sqrt(df.adj * R2.upper))
dimnames(psrf) <- list(xnames, c("Point est.", "Upper C.I."))
out <- list(psrf = psrf, mpsrf = mpsrf, B = B, W = W) #added ph
class(out) <- "gelman.diag"
return( out )
}
将my.gelman.diag()
应用到short.mcmc.list
:
> l <- my.gelman.diag(short.mcmc.list, autoburnin = FALSE, multivariate = TRUE)
> W <- l$W #Within-sequence variance
> B <- l$B #Between-sequence variance
对 W 的调查表明 W 确实是正定的,但其特征值接近 0,因此它失败了。
> evals.W <- eigen(W, only.values = TRUE)$values
> min(evals.W)
[1] 1.980596e-16
确实,增加公差表明W确实是正定的。
> matrixNormal::is.positive.definite(W, tol = 1e-18)
[1] TRUE
所以实际上,W 接近于线性相关...
> eval <- eigen(solve(W)%*%B, only.values = TRUE)$values[1]
Error in solve.default(W) : system is computationally singular: reciprocal condition number = 7.10718e-21
所以实际上,删除 W 的最后两列使其现在线性 independent/positive 确定。这说明链内有相关参数,可以减少参数个数。
> evals.W[198]
[1] 9.579362e-05
> matrixNormal::is.positive.definite(W[1:198, 1:198])
[1] TRUE
> chol.W <- chol(W)
W 是马尔可夫链的组内方差,衡量状态中每个值与均值的差异程度。如果 W 接近奇异,则方差很小,因此链变化不大。这是一个缓慢移动的链条。用户应使用跟踪图对此进行调查,并可能减少参数数量。链条也可能太短;如果链更长,则链中的值可能足够不同,以至于 W 是线性独立的。
为了避免函数崩溃,我建议使用purrr::possibly()
代替缺失值而不是抛出一个古老的错误。
> pgd <- purrr::possibly(coda::gelman.diag, list(mpsrf = NA), quiet = FALSE)
> pgd(short.mcmc.list, autoburnin = FALSE, multivariate = TRUE)
Error: the leading minor of order 199 is not positive definite
[1] NA
有关详细信息,请参见 Brooks 和 Gelman,1992 年。
参考: Gelman, A 和 Rubin, DB (1992) Inference from iterative simulation using multiple sequences, 统计科学, 7, 457-511.