T-test with bootstrap in R
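I am trying to run a t-test with bootstrap in R.
I have a sample of 50 participants, 39 of whom are female. I have a dependent variable, d', and I want to see whether males and females differ on it. Because I only have 11 male participants, I want to use a bootstrapped t-test (not the best idea, but I have seen it in the literature).
I have a data set called "data" that contains several variables. So, first I extracted two vectors: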
dPrimeFemales <- subset(data, Gender == "F",
select=c(dPrime))
dPrimeMales <- subset(data, Gender == "M",
select=c(dPrime))
Then I tried a few things I found on the internet (and here).
Based on this post, I tried:
set.seed(1315)
B <- 1000
t.vect <- vector(length=B)
p.vect <- vector(length=B)
for(i in 1:B){
  boot.c <- sample(dPrimeFemales, size=nrow(dPrimeFemales), replace=T)
  boot.p <- sample(dPrimeMales, size=nrow(dPrimeMales), replace=T)
  ttest <- t.test(boot.c, boot.p)
  t.vect[i] <- ttest$statistic
  p.vect[i] <- ttest$p.value
}
But it says:
Error: Must use a vector in `[`, not an object of class matrix.
Call `rlang::last_error()` to see a backtrace
I also tried this:
boot.t.test: Bootstrap t-test
First, I could not load the function. So I copied, pasted, and ran this:
Then I ran this:
boot.t.test(x = dPrimeFemales, y = dPrimeMales)
However, this is what it says:
Error in boot.t.test(x = dPrimeFemales, y = dPrimeMales) :
dims [product 1] do not match the length of object [1000]
In addition: There were 50 or more warnings (use warnings() to see the first 50)
If I call warnings(), it says:
1: In mean.default(x) : argument is not numeric or logical: returning NA
2: In mean.default(y) : argument is not numeric or logical: returning NA
3: In mean.default(c(x, y)) : argument is not numeric or logical: returning NA
4: In mean.default(x) : argument is not numeric or logical: returning NA
5: In mean.default(y) : argument is not numeric or logical: returning NA
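And so on...
To be clearer, I have in mind something like the bootstrapped t-test in SPSS, as shown below:
I thought this would be much easier.
Any help is welcome.
Thank you all for your time.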
structure(list(dPrime = c(0.60805224661517, 0.430727299295457,
-0.177380196159658, 0.771422126383253, 0.598621304083563, 0,
0.167894004788105, -0.336998837042929, 0.0842422708809764, -0.440748778800912,
0.644261556974516, -0.167303467814258, 0.169695369228671, -0.251545738695235,
0.0842422708809764, -0.0985252105020469, -0.239508275220057,
-0.143350050535084, 0.430727299295457, 0.757969499665785, -0.282230896122292,
-0.271053409572241, -0.090032472207662, -0.090032472207662, 0.524400512708041,
-0.218695510362827, -0.271053409572241, 1.07035864674857, 0.262833294507352,
0.421241107923905, -0.0836517339071291, 0.090032472207662, -0.598621304083563,
-0.356506507919935, 0.474566187745845, 0.336998837042929, 1.35083901409173,
-0.336998837042929, -0.443021053393661, 0.757969499665785, -0.841621233572914,
0.167303467814258, 0.167894004788105, 0.090032472207662, -0.177380196159658,
0.251545738695235, -0.344495842891614, -0.17280082229969, -0.440748778800912,
0), Gender = c("F", "F", "F", "F", "F", "F", "F", "F", "M", "M",
"F", "F", "F", "F", "F", "F", "F", "F", "M", "F", "M", "M", "F",
"F", "F", "F", "F", "F", "F", "F", "M", "F", "F", "F", "M", "F",
"F", "F", "F", "M", "M", "F", "F", "M", "M", "F", "F", "F", "F",
"F")), row.names = c(NA, -50L), class = c("tbl_df", "tbl", "data.frame"
))
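Both errors appear to share one cause: subset(..., select = ...) returns a one-column data frame, not a numeric vector. sample() applied to a data frame resamples its columns rather than its values, and mean() applied to a data frame returns NA with exactly the "argument is not numeric or logical" warning shown above. The sketch below illustrates the difference on a small made-up data frame (the values and the name asDataFrame are assumptions for illustration, not the asker's data). Because the real "data" is a tibble, single-bracket indexing such as data[rows, "dPrime"] would also stay a tibble, so $ or [[ ]] is the safer way to get a vector.

```r
# Hypothetical stand-in for the asker's `data` (same column names, made-up values).
data <- data.frame(
  dPrime = c(0.61, 0.43, -0.18, 0.77, 0.08, -0.44),
  Gender = c("F", "F", "F", "F", "M", "M")
)

# subset(..., select = ) keeps the data frame structure, so this is not a vector.
asDataFrame <- subset(data, Gender == "F", select = c(dPrime))
is.numeric(asDataFrame)    # FALSE
class(asDataFrame)         # "data.frame"

# Pulling the column out with $ (or [[ ]]) yields a plain numeric vector.
dPrimeFemales <- data$dPrime[data$Gender == "F"]
dPrimeMales   <- data$dPrime[data$Gender == "M"]
is.numeric(dPrimeFemales)  # TRUE
```

With dPrimeFemales and dPrimeMales built this way, both the resampling loop and boot.t.test(x = dPrimeFemales, y = dPrimeMales) receive the numeric vectors they expect.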
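Here is an example of using that function with simulated data in which there is no true difference between the groups, so you would expect a large, non-significant p-value. There is no need to subset the data beforehand and create intermediate objects.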
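set.seed(0)
# Simulated data: gender is assigned at random, so there is no true group difference.
df <- data.frame(gender = sample(c('M', 'F'), size = 50, replace = TRUE),
                 measure = runif(n = 50))
# For a base data.frame, df[rows, 'measure'] returns a plain numeric vector.
boot.t.test(df[df$gender == 'M', 'measure'], df[df$gender == 'F', 'measure'], reps = 1000)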
Bootstrap Two Sample t-test
t = -0.186, p-value = 0.859
Alternative hypothesis: true difference in means is not equal to 0
$mu0
[1] 0
$statistic
[1] -0.1863362
$alternative
[1] "two.sided"
$p.value
[1] 0.859
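If you would rather see the mechanics than rely on a copied function, the following is a minimal base R sketch of one common form of the two-sample bootstrap t-test: shift both groups to the pooled mean so the null hypothesis holds, resample each group with replacement, and compare the observed Welch t statistic with the bootstrap distribution. The helper name boot_t_test and the simulated groups are assumptions made for illustration. This is not the code of the boot.t.test function used above, and its results may differ from that function and from what SPSS reports.

```r
# A sketch of a two-sample bootstrap t-test in base R.
# boot_t_test is a hypothetical helper; x and y must be numeric vectors.
boot_t_test <- function(x, y, reps = 1000) {
  stopifnot(is.numeric(x), is.numeric(y))
  t_obs <- unname(t.test(x, y)$statistic)  # observed Welch t statistic

  # Impose the null hypothesis: shift both groups to the pooled mean.
  pooled_mean <- mean(c(x, y))
  x0 <- x - mean(x) + pooled_mean
  y0 <- y - mean(y) + pooled_mean

  # Resample each group with replacement and recompute t each time.
  t_boot <- replicate(reps, {
    xb <- sample(x0, length(x0), replace = TRUE)
    yb <- sample(y0, length(y0), replace = TRUE)
    unname(t.test(xb, yb)$statistic)
  })

  # Two-sided p-value: share of bootstrap statistics at least as extreme as observed.
  list(statistic = t_obs, p.value = mean(abs(t_boot) >= abs(t_obs)))
}

set.seed(1315)
# Simulated stand-ins for the two groups (39 and 11 values, same distribution).
females <- rnorm(39)
males   <- rnorm(11)
res <- boot_t_test(females, males, reps = 1000)
res$statistic
res$p.value
```

To apply it to the data in the question, pass numeric vectors such as data$dPrime[data$Gender == "F"] and data$dPrime[data$Gender == "M"] instead of the simulated groups.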