Bootstrap the standard error of the mean using purrr::rerun() in R

Question

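I am trying to bootstrap the standard error of the mean using the purrr::rerun() function in R. For example, here I am trying to find the standard error of the mean of the Sepal.Length variable in iris: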

sample_the_mean <- function(x) {
    the_sample <- sample(x, replace = TRUE)
    mean(the_sample)
}

sample_the_mean(iris$Sepal.Length)

#> [1] 5.894667

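Calling it once seems to work fine. Here it is with purrr::rerun(). I will only show the first list element of the output (the list has one element per iteration, so 10 elements in total):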

    out <- purrr::rerun(10, sample_the_mean, x = iris$Sepal.Length)

    out[[1]]

#> [[1]]
#> function (x) 
#> {
#>     the_sample <- sample(x, replace = TRUE)
#>     mean(the_sample)
#> }
#> 
#> $x
#>   [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4
#>  [18] 5.1 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5
#>  [35] 4.9 5.0 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0
#>  [52] 6.4 6.9 5.5 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8
#>  [69] 6.2 5.6 5.9 6.1 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4
#>  [86] 6.0 6.7 6.3 5.6 5.5 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8
#> [103] 7.1 6.3 6.5 7.6 4.9 7.3 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7
#> [120] 6.0 6.9 5.6 7.7 6.3 6.7 7.2 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7
#> [137] 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8 6.7 6.7 6.3 6.5 6.2 5.9

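As you can see, it does not return the mean; it returns the data itself. Any ideas as to why this happens, and how I should do this instead? I would rather not use packages (other than purrr in this particular case).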

Answer 1

The following works:

set.seed(2018)
purrr::rerun(10, sample_the_mean(iris$Sepal.Length))
#[[1]]
#[1] 5.73
#
#[[2]]
#[1] 5.810667
#
#[[3]]
#[1] 5.868667
#
#[[4]]
#[1] 5.902
#
#[[5]]
#[1] 5.844
#
#[[6]]
#[1] 5.746667
#
#[[7]]
#[1] 5.877333
#
#[[8]]
#[1] 5.853333
#
#[[9]]
#[1] 5.821333
#
#[[10]]
#[1] 5.768

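As ?rerun shows, ... refers to the expressions to re-run. In your call, sample_the_mean and x = iris$Sepal.Length were treated as two separate expressions, so each iteration returned the function object and the vector instead of calling the function. You need to pass a single expression, sample_the_mean(iris$Sepal.Length), which is captured as a quosure and then evaluated. It may also help to type rerun at the R console to see what happens under the hood.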
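To go from the resampled means to the standard error itself, take the standard deviation of those means. The following is a minimal base-R sketch, not the answerer's code: it uses replicate() in place of purrr::rerun() (which, as far as I know, is deprecated as of purrr 1.0.0), and the names n_boot, boot_means, boot_se and analytic_se are illustrative assumptions that do not appear in the post.

```r
# Function from the question: draw one bootstrap resample and return its mean
sample_the_mean <- function(x) {
    the_sample <- sample(x, replace = TRUE)
    mean(the_sample)
}

set.seed(2018)
n_boot <- 2000  # number of bootstrap resamples (illustrative choice)

# replicate() re-evaluates the expression n_boot times
# and simplifies the results to a numeric vector
boot_means <- replicate(n_boot, sample_the_mean(iris$Sepal.Length))

# Bootstrap standard error: standard deviation of the resampled means
boot_se <- sd(boot_means)

# Analytic standard error of the mean, for comparison
analytic_se <- sd(iris$Sepal.Length) / sqrt(length(iris$Sepal.Length))

boot_se
analytic_se
```

With the list returned by purrr::rerun() above, the same idea would be sd(unlist(out)), although 10 resamples are too few for a stable estimate.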

r

statistics-bootstrap

purrr