How to apply t.test() to multiple pairs of columns after mutate across
This question is related to an earlier question.
Data:
df <- structure(list(Subject = 1:3, PreScoreTestA = c(30L, 15L, 20L
), PostScoreTestA = c(40L, 12L, 22L), PreScoreTestB = c(6L, 9L,
11L), PostScoreTestB = c(8L, 13L, 12L), PreScoreTestC = c(12L,
7L, 9L), PostScoreTestC = c(10L, 7L, 10L)), class = "data.frame", row.names = c(NA,
-3L))
> df
  Subject PreScoreTestA PostScoreTestA PreScoreTestB PostScoreTestB PreScoreTestC PostScoreTestC
1       1            30             40             6              8            12             10
2       2            15             12             9             13             7              7
3       3            20             22            11             12             9             10
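There, the OP asks whether t.test can be applied to pairs of columns in a wide-format data frame. A solution using the long format has already been provided.
However, I tried to adapt the code below into an answer that runs t.test in the wide format.
My code, using + as the function (this works fine):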
library(dplyr)
library(stringr)

# Add each PreScore column to its matching PostScore column
df %>%
  mutate(across(starts_with('PreScore'),
                ~ . + get(str_replace(cur_column(), "^PreScore", "PostScore")),
                .names = "{.col}_TTest")) %>%
  rename_at(vars(ends_with('TTest')), ~ str_remove(., "PreScore"))
# gives:
  Subject PreScoreTestA PostScoreTestA PreScoreTestB PostScoreTestB PreScoreTestC PostScoreTestC
1       1            30             40             6              8            12             10
2       2            15             12             9             13             7              7
3       3            20             22            11             12             9             10
  TestA_TTest TestB_TTest TestC_TTest
1          70          14          22
2          27          22          14
3          42          23          19
Now I would like to replace the function + with t.test (this does not work; I have tried many variants):
library(dplyr)
library(stringr)
df %>%
mutate(across(starts_with('PreScore'), ~ . t.test
get(str_replace(cur_column(), "^PreScore", "PostScore")), .names = "{.col}_TTest")) %>%
rename_at(vars(ends_with('TTest')), ~ str_remove(., "PreScore"))
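I would like to know:
Is it possible to apply the t.test function to a predefined set of column pairs within across, in the same way as -, +, /, etc.?
More resources I have looked at:
dplyr summarise multiple columns using t.test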
The output of t.test is a list, so we may need to wrap it in a list in order to store it in a column with mutate:
library(dplyr)
library(stringr)

# Wrap each htest object in list() so it can be stored in a list-column
out <- df %>%
  mutate(across(starts_with('PreScore'),
                ~ list(t.test(.,
                              get(str_replace(cur_column(), "^PreScore", "PostScore")))),
                .names = "{.col}_TTest")) %>%
  rename_at(vars(ends_with('TTest')), ~ str_remove(., "PreScore"))
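To see why the list() wrapper is needed, it may help to compare what + and t.test return for a single pair of columns. The sketch below uses plain vectors copied from the TestA columns; the names pre, post, sum_result, res and wrapped are illustrative only and do not appear in the original post.

```r
# Values copied from PreScoreTestA and PostScoreTestA
pre  <- c(30L, 15L, 20L)
post <- c(40L, 12L, 22L)

# `+` returns one value per row, so it fits directly into a mutate() column
sum_result <- pre + post
length(sum_result)

# t.test() returns a single "htest" object, which is a list of several components
res <- t.test(pre, post)
class(res)
is.list(res)

# Wrapping it in list() gives a length-1 list holding the whole test result
wrapped <- list(res)
length(wrapped)
```

Because the wrapped result has length 1, mutate can recycle it across all rows, which is why the list-column ends up holding the same test result in every row.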
Checking the structure with str:
> str(out)
'data.frame': 3 obs. of 10 variables:
$ Subject : int 1 2 3
$ PreScoreTestA : int 30 15 20
$ PostScoreTestA: int 40 12 22
$ PreScoreTestB : int 6 9 11
$ PostScoreTestB: int 8 13 12
$ PreScoreTestC : int 12 7 9
$ PostScoreTestC: int 10 7 10
$ TestA_TTest :List of 3
..$ :List of 10
.. ..$ statistic : Named num -0.322
.. .. ..- attr(*, "names")= chr "t"
.. ..$ parameter : Named num 3.07
.. .. ..- attr(*, "names")= chr "df"
.. ..$ p.value : num 0.768
.. ..$ conf.int : num -32.2 26.2
.. .. ..- attr(*, "conf.level")= num 0.95
.. ..$ estimate : Named num 21.7 24.7
.. .. ..- attr(*, "names")= chr [1:2] "mean of x" "mean of y"
.. ..$ null.value : Named num 0
.. .. ..- attr(*, "names")= chr "difference in means"
.. ..$ stderr : num 9.3
.. ..$ alternative: chr "two.sided"
.. ..$ method : chr "Welch Two Sample t-test"
.. ..$ data.name : chr "PreScoreTestA and get(str_replace(cur_column(), \"^PreScore\", \"PostScore\"))"
.. ..- attr(*, "class")= chr "htest"
..$ :List of 10
...
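Once the tests are stored in a list-column, each element can be pulled out with [[ and inspected like any other t.test result. This is a minimal sketch that rebuilds out in the same way as the code before the str output; first_test and p_values are illustrative names, not part of the original answer.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  Subject = 1:3,
  PreScoreTestA = c(30L, 15L, 20L), PostScoreTestA = c(40L, 12L, 22L),
  PreScoreTestB = c(6L, 9L, 11L),   PostScoreTestB = c(8L, 13L, 12L),
  PreScoreTestC = c(12L, 7L, 9L),   PostScoreTestC = c(10L, 7L, 10L)
)

# Same list-column construction as in the answer
out <- df %>%
  mutate(across(starts_with("PreScore"),
                ~ list(t.test(., get(str_replace(cur_column(), "^PreScore", "PostScore")))),
                .names = "{.col}_TTest")) %>%
  rename_at(vars(ends_with("TTest")), ~ str_remove(., "PreScore"))

# Each row of the list-column holds the same htest object; take the first one
first_test <- out$TestA_TTest[[1]]
class(first_test)
first_test$p.value

# Or extract one component from every element of the list-column
p_values <- sapply(out$TestA_TTest, function(h) h$p.value)
p_values
```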
If we only need to extract a specific list element, e.g. p.value:
# Keep only the p-value from each test
df %>%
  mutate(across(starts_with('PreScore'),
                ~ t.test(.,
                         get(str_replace(cur_column(), "^PreScore", "PostScore")))$p.value,
                .names = "{.col}_TTest"))
  Subject PreScoreTestA PostScoreTestA PreScoreTestB PostScoreTestB PreScoreTestC PostScoreTestC PreScoreTestA_TTest
1       1            30             40             6              8            12             10            0.767827
2       2            15             12             9             13             7              7            0.767827
3       3            20             22            11             12             9             10            0.767827
  PreScoreTestB_TTest PreScoreTestC_TTest
1            0.330604           0.8604162
2            0.330604           0.8604162
3            0.330604           0.8604162
Note that with mutate we store the same information in every row. Instead, we can use summarise, which returns a single row:
# One p-value per column pair, returned as a single row
df %>%
  summarise(across(starts_with('PreScore'),
                   ~ t.test(.,
                            get(str_replace(cur_column(), "^PreScore", "PostScore")))$p.value,
                   .names = "{.col}_TTest"))
  PreScoreTestA_TTest PreScoreTestB_TTest PreScoreTestC_TTest
1            0.767827            0.330604           0.8604162
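If the get()/cur_column() lookup feels too implicit, the same p-values can probably be obtained in base R by building the Pre/Post name pairs explicitly. This is only a sketch of an alternative, not the approach used in the answer; pre_cols, post_cols and p_values are illustrative names, and it assumes every PreScore column has a matching PostScore column.

```r
df <- data.frame(
  Subject = 1:3,
  PreScoreTestA = c(30L, 15L, 20L), PostScoreTestA = c(40L, 12L, 22L),
  PreScoreTestB = c(6L, 9L, 11L),   PostScoreTestB = c(8L, 13L, 12L),
  PreScoreTestC = c(12L, 7L, 9L),   PostScoreTestC = c(10L, 7L, 10L)
)

# Build the column pairs by name
pre_cols  <- grep("^PreScore", names(df), value = TRUE)
post_cols <- sub("^PreScore", "PostScore", pre_cols)

# Assumption: each PreScore column has a PostScore partner; stop early otherwise
stopifnot(all(post_cols %in% names(df)))

# Run one Welch two-sample t-test per pair and keep only the p-value
p_values <- mapply(
  function(pre, post) t.test(df[[pre]], df[[post]])$p.value,
  pre_cols, post_cols
)

# Label the results by test name (TestA, TestB, TestC)
names(p_values) <- sub("^PreScore", "", pre_cols)
p_values
```

Keeping the pairs in two character vectors makes the matching rule easy to inspect before any test is run.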