Confidence Interval from hierarchical bootstrap

I want to compute BCa confidence intervals for a multi-stage bootstrap using boot.ci(). Here is an example, taken from "Non-parametric bootstrapping on the highest level of clustered data using boot() function from {boot} in R", that uses the boot command.

library(boot)

# create example data frame
rho <- 0.4
dat <- expand.grid(
  trial=factor(1:5),
  subject=factor(1:3)
)
sig <- rho * tcrossprod(model.matrix(~ 0 + subject, dat))
diag(sig) <- 1
set.seed(17); dat$value <- chol(sig) %*% rnorm(15, 0, 1)

# function for resampling
resamp.mean <- function(dat, 
                    indices, 
                    cluster = c('subject', 'trial'), 
                    replace = TRUE){
  cls <- sample(unique(dat[[cluster[1]]]), replace=replace)
  sub <- lapply(cls, function(b) subset(dat, dat[[cluster[1]]]==b))
  sub <- do.call(rbind, sub)
  mean(sub$value)
} 

dat.boot <- boot(dat, resamp.mean, 4) # produces an estimated statistic

boot.ci(dat.boot) # produces errors

How can I use boot.ci on the boot output?

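You are using too few bootstrap replicates. When you call boot.ci, influence measures are needed; if they are not supplied, they are obtained from empinf, which may fail when there are too few observations. See here for a similar explanation.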
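To make the role of the influence measures more concrete, here is a minimal sketch of calling empinf directly. It assumes a toy vector `x` and an index-based statistic `mean.stat`, both hypothetical and not part of the question's data. The expression for `a` is the standard BCa acceleration constant and is shown only for illustration.

```r
library(boot)

# Hypothetical toy data, not the clustered example from the question
set.seed(1)
x <- rnorm(30)

# Index-based statistic: boot() passes the resampled indices in `i`
mean.stat <- function(d, i) mean(d[i])

b <- boot(x, mean.stat, R = 1000)

# Jackknife influence values, one per observation
L <- empinf(b, type = "jack")

# Standard BCa acceleration constant computed from the influence values
a <- sum(L^3) / (6 * sum(L^2)^1.5)
```

There is one influence value per observation, so `L` has the same length as the data, and the BCa interval needs a finite `a` derived from it.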

Try

dat.boot <- boot(dat, resamp.mean, 1000) 
boot.ci(dat.boot, type = "bca") 

which gives:

> boot.ci(dat.boot, type = "bca") 

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL : 
boot.ci(boot.out = dat.boot, type = "bca")

Intervals : 
Level       BCa          
95%   (-0.2894,  1.2979 )  
Calculations and Intervals on Original Scale
Some BCa intervals may be unstable
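If the interval endpoints are needed programmatically rather than as printed output, they can be read from the object that boot.ci returns. The following is a small sketch on hypothetical toy data (`x` and `mean.stat` are assumptions, not the question's objects):

```r
library(boot)

# Hypothetical toy data and an index-based statistic
set.seed(1)
x <- rnorm(30)
mean.stat <- function(d, i) mean(d[i])

b <- boot(x, mean.stat, R = 1000)
ci <- boot.ci(b, type = "bca")

# ci$bca is a 1 x 5 matrix: confidence level, two order-statistic
# positions, then the lower and upper endpoints
lower <- ci$bca[4]
upper <- ci$bca[5]
```

The same indexing should apply to `boot.ci(dat.boot, type = "bca")` from the answer, since only the statistic differs.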

Alternatively, you can supply L (the influence measures) yourself.

# proof of concept, use appropriate value for L!
> dat.boot <- boot(dat, resamp.mean, 4)
> boot.ci(dat.boot, type = "bca", L = 0.2)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 4 bootstrap replicates

CALL : 
boot.ci(boot.out = dat.boot, type = "bca", L = 0.2)

Intervals : 
Level       BCa          
95%   ( 0.1322,  1.2979 )  
Calculations and Intervals on Original Scale
Warning : BCa Intervals used Extreme Quantiles
Some BCa intervals may be unstable
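The value `L = 0.2` above is only a placeholder. A somewhat more realistic sketch computes the influence values with empinf and passes them to boot.ci. It assumes hypothetical toy data `y` and an index-based statistic `sd.stat`. Note that resamp.mean from the question ignores its `indices` argument, so jackknife values computed from it would vary randomly from call to call.

```r
library(boot)

# Hypothetical toy data and an index-based statistic
set.seed(2)
y <- rexp(40)
sd.stat <- function(d, i) sd(d[i])

b2 <- boot(y, sd.stat, R = 2000)

# Influence values from the jackknife, one per observation
L.jack <- empinf(b2, type = "jack")

# Supply the influence values explicitly instead of letting boot.ci compute them
ci.L <- boot.ci(b2, type = "bca", L = L.jack)
ci.L$bca[4:5]  # lower and upper endpoints
```

Supplying a vector with one entry per observation keeps the acceleration estimate tied to the data rather than to an arbitrary constant.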