How to apply a statistical test to several columns of a dataframe in R
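I want to apply this test not only to column x1, as I did in this example, but to several columns of df; in this case, x1 and x2.
I tried putting this code into a function and using purrr::map, but I could not get it to work.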
library(tidyverse)
df <- tibble(skul = c(rep('a', 60), rep('b', 64)),
             x1 = sample(1:10, 124, replace = TRUE),
             x2 = sample(1:10, 124, replace = TRUE),
             i_f = c(rep(0, 30), rep(1, 30), rep(0, 32), rep(1, 32)))
lapply(split(df, factor(df$skul)),
       function(x) wilcox.test(data = x, x1 ~ i_f,
                               paired = FALSE))
#> Warning in wilcox.test.default(x = c(10L, 5L, 8L, 4L, 6L, 3L, 10L, 2L, 10L, :
#> cannot compute exact p-value with ties
#> Warning in wilcox.test.default(x = c(3L, 3L, 4L, 9L, 8L, 10L, 5L, 5L, 4L, :
#> cannot compute exact p-value with ties
#> $a
#>
#> Wilcoxon rank sum test with continuity correction
#>
#> data: x1 by i_f
#> W = 546, p-value = 0.1554
#> alternative hypothesis: true location shift is not equal to 0
#>
#>
#> $b
#>
#> Wilcoxon rank sum test with continuity correction
#>
#> data: x1 by i_f
#> W = 565, p-value = 0.4781
#> alternative hypothesis: true location shift is not equal to 0
Created on 2022-04-13 by the reprex package (v2.0.1)
One option is to loop over the columns of interest in a nested inner loop after the split, build the formula with reformulate, and apply wilcox.test:
out <- lapply(split(df, df$skul), function(x)
  lapply(setNames(c("x1", "x2"), c("x1", "x2")), function(y)
    wilcox.test(reformulate("i_f", response = y), data = x)))
-output (the values differ from those in the question because sample() is called without a seed)
> out$a
$x1
Wilcoxon rank sum test with continuity correction
data: x1 by i_f
W = 452, p-value = 0.9822
alternative hypothesis: true location shift is not equal to 0
$x2
Wilcoxon rank sum test with continuity correction
data: x2 by i_f
W = 404.5, p-value = 0.5027
alternative hypothesis: true location shift is not equal to 0
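Since the question asked about purrr::map, the same nested loop can be sketched with purrr as well. This is only an illustration under a few assumptions: the seed is fixed so the example is reproducible, run_test is a hypothetical helper name, and exact = FALSE is passed to request the normal approximation so the ties warning does not appear.

```r
library(purrr)

set.seed(42)  # assumption: fixed seed, not part of the original example
df <- data.frame(
  skul = c(rep("a", 60), rep("b", 64)),
  x1   = sample(1:10, 124, replace = TRUE),
  x2   = sample(1:10, 124, replace = TRUE),
  i_f  = c(rep(0, 30), rep(1, 30), rep(0, 32), rep(1, 32))
)

cols <- c("x1", "x2")

# Hypothetical helper: test one response column against i_f in one subset.
run_test <- function(col, data) {
  wilcox.test(reformulate("i_f", response = col), data = data, exact = FALSE)
}

# Outer map: one element per school; inner map: one test per column.
out <- map(split(df, df$skul), function(d) {
  map(set_names(cols), function(col) run_test(col, d))
})

# Collect the p-values: rows are the tested columns, columns are the groups.
p_values <- sapply(out, function(tests) sapply(tests, function(tst) tst$p.value))
p_values
```

Naming the inner vector with set_names keeps the column names on the result, so a single test can be reached with out$a$x1.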
If we want to use the tidyverse:
library(dplyr)
df %>%
  group_by(skul) %>%
  summarise(across(c(x1, x2),
    ~list(broom::tidy(wilcox.test(reformulate("i_f", cur_column()))))))
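Another way to get one row per school and column is to reshape the data to long format first and then summarise. The sketch below is an alternative, not the approach shown above: it assumes a fixed seed, uses a hypothetical helper named wilcox_by_flag, calls wilcox.test through its two-vector interface instead of a formula, and passes exact = FALSE to use the normal approximation.

```r
library(dplyr)
library(tidyr)

set.seed(42)  # assumption: fixed seed, not part of the original example
df <- data.frame(
  skul = c(rep("a", 60), rep("b", 64)),
  x1   = sample(1:10, 124, replace = TRUE),
  x2   = sample(1:10, 124, replace = TRUE),
  i_f  = c(rep(0, 30), rep(1, 30), rep(0, 32), rep(1, 32))
)

# Hypothetical helper: two-sample Wilcoxon test of `value` split by a 0/1 flag.
wilcox_by_flag <- function(value, flag) {
  wilcox.test(value[flag == 0], value[flag == 1], exact = FALSE)
}

res <- df %>%
  # One row per observation and tested column
  pivot_longer(c(x1, x2), names_to = "variable", values_to = "value") %>%
  group_by(skul, variable) %>%
  summarise(
    W       = unname(wilcox_by_flag(value, i_f)$statistic),
    p_value = wilcox_by_flag(value, i_f)$p.value,
    .groups = "drop"
  )
res
```

The result is a flat table with the columns skul, variable, W and p_value, which is easier to filter or plot than a nested list of test objects.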