R simple Bootstrap

Question

I have a data frame (Applications) with two columns:

Customer    Application
1           1
1           0
1           0
1           1
1           1
1           0
1           1
1           0
1           0
1           1
1           1

The application rate is

sum(Applications$Application)/sum(Applications$Customer).

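I have been asked to bootstrap this application rate by running 1000 samples of 1000 customers, to get a distribution and a confidence interval for the application rate. I tried using the boot package as follows: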

f2 <- function(Loan,Customer){sum(Applications$Application)/sum(Applications$Customer)}
bootapp1 <-(boot(Applications, f2, 1000))
bootapp1

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = Bootstrap_Test, statistic = f2, R = 1000)


Bootstrap Statistics :
       original  bias    std. error
t1* 0.003052608       0           0

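Clearly this is not what I am looking for, since it reports no bias or standard error.

Can anyone tell me a quick way to get the result I need? I imagine there must be a very simple way to do this.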

Answer 1

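You just need to adjust your function so that it takes two arguments and actually uses them. Your f2 ignores its arguments and always computes the rate from the full Applications data frame, so every replicate returns the same value, which is why the bias and standard error are both 0. From the help file for boot, under the argument statistic: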

A function which when applied to data returns a vector containing the statistic(s) of interest. When sim = "parametric", the first argument to statistic must be the data. For each replicate a simulated dataset returned by ran.gen will be passed. In all other cases statistic must take at least two arguments. The first argument passed will always be the original data. The second will be a vector of indices, frequencies or weights which define the bootstrap sample.
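To make the role of that second argument concrete, here is a minimal base-R sketch of what an ordinary nonparametric bootstrap does with the index vector. It is not how boot is implemented internally; the names apps and rate are hypothetical, and the data are just the 11 rows shown in the question.

```r
# Toy data: the 11 rows from the question (apps is a hypothetical name)
apps <- data.frame(
  Customer    = rep(1L, 11),
  Application = c(1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 1L)
)

# Statistic in the shape boot expects: data first, row indices second.
# It only looks at the rows selected by index.
rate <- function(data, index) {
  sum(data[index, "Application"]) / sum(data[index, "Customer"])
}

set.seed(1)  # arbitrary seed, only for reproducibility

# Each replicate resamples row indices with replacement and
# re-evaluates the statistic on those rows.
reps <- replicate(1000, rate(apps, sample(nrow(apps), replace = TRUE)))

original <- rate(apps, seq_len(nrow(apps)))  # rate on the original rows (6/11)
se       <- sd(reps)                         # bootstrap standard error
```

Because rate uses index, the replicates differ from one another and se is nonzero. A function that ignores index returns the same number every time, which is what produced the zeros in the question.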

library(boot)
# Sample data from the question, as produced by dput()
x <- structure(list(Customer = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
                    Application = c(1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 1L)),
               .Names = c("Customer", "Application"),
               class = "data.frame", row.names = c(NA, -11L))
# The statistic takes the data and the resampled row indices
f2 <- function(x, index){sum(x[index, "Application"])/sum(x[index, "Customer"])}
bootapp1 <- boot(data = x, statistic = f2, R = 1000)
bootapp1

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
  boot(data = x, statistic = f2, R = 1000)


Bootstrap Statistics :
  original       bias    std. error
t1* 0.5454545 0.0005454545     0.14995
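
Since the question also asks for the distribution and a confidence interval, here is a hedged sketch of one way to pull those out of the boot object. It assumes the same 11-row sample data, picks an arbitrary seed, and uses a percentile interval purely as an illustration; other interval types in boot.ci may suit your data better.

```r
library(boot)

# Same sample data as above, rebuilt so this block runs on its own
x <- data.frame(
  Customer    = rep(1L, 11),
  Application = c(1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 1L)
)

# Statistic: application rate on the resampled rows
f2 <- function(x, index) {
  sum(x[index, "Application"]) / sum(x[index, "Customer"])
}

set.seed(123)  # arbitrary seed, only for reproducibility
bootapp1 <- boot(data = x, statistic = f2, R = 1000)

# bootapp1$t holds the R replicate values (a 1000 x 1 matrix);
# this is the bootstrap distribution of the application rate.
rates <- as.vector(bootapp1$t)
summary(rates)

# 95% percentile confidence interval; columns 4 and 5 of $percent
# are the lower and upper endpoints.
ci <- boot.ci(bootapp1, conf = 0.95, type = "perc")
ci_limits <- ci$percent[4:5]
ci_limits
```

Calling hist(rates) would plot the distribution if you want to look at its shape.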
