For loop to conduct an ANOVA test on all data frames in a list in R

I have my data frame:

df <- read.table(text = "id G1  G2  G3  value
1   A   D20 TAN 1
2   A   D20 TAN 9
3   A   D20 TAN 10
4   A   D40 TAN 8
5   A   D40 TAN 3
6   A   D40 TAN 9
7   A   D60 TAN 5
8   A   D60 TAN 5
9   A   D60 TAN 10
10  B   D20 TAN 7
11  B   D20 TAN 8
12  B   D20 TAN 10
13  B   D40 TAN 8
14  B   D40 TAN 3
15  B   D40 TAN 7
16  B   D60 TAN 1
17  B   D60 TAN 10
18  B   D60 TAN 1
19  C   D20 TAN 5
20  C   D20 TAN 9
21  C   D20 TAN 4
22  C   D40 TAN 6
23  C   D40 TAN 3
24  C   D40 TAN 8
25  C   D60 TAN 9
26  C   D60 TAN 10
27  C   D60 TAN 4
28  A   D20 BBC 9
29  A   D20 BBC 3
30  A   D20 BBC 7
31  A   D40 BBC 10
32  A   D40 BBC 7
33  A   D40 BBC 4
34  A   D60 BBC 2
35  A   D60 BBC 3
36  A   D60 BBC 8
37  B   D20 BBC 8
38  B   D20 BBC 1
39  B   D20 BBC 5
40  B   D40 BBC 6
41  B   D40 BBC 2
42  B   D40 BBC 6
43  B   D60 BBC 9
44  B   D60 BBC 2
45  B   D60 BBC 10
46  C   D20 BBC 3
47  C   D20 BBC 1
48  C   D20 BBC 4
49  C   D40 BBC 10
50  C   D40 BBC 8
51  C   D40 BBC 3
52  C   D60 BBC 5
53  C   D60 BBC 3
54  C   D60 BBC 1",stringsAsFactors = FALSE, header = TRUE)

I created an additional column with:

df$Group<-paste(df$G2,df$G3)

Then I split df into a list by Group:

L1<-split(df,df$Group)

Now I want to run an ANOVA and a Tukey test on every table in L1. For example:

a1<-aov(L1$`D20 BBC`$value~L1$`D20 BBC`$G1)
TukeyHSD(a1)

But this handles only one table. How can I use a for loop to run the aov function on all tables in L1, and then run the TukeyHSD function on all of the aov results?

You can do this with lapply, which applies the same function to every element of the list:

lapply(L1, function(x) with(x, TukeyHSD(aov(value ~ G1))))
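Since the question asks specifically for a for loop, here is a minimal sketch of the loop equivalent of the lapply call. It rebuilds the question's data in compact form so it runs on its own. The names aov_fits and tukey_results are illustrative, not from the original post, and G1 is stored as a factor here as an assumption, since TukeyHSD compares factor levels.

```r
# Rebuild the question's data compactly (same 54 rows, same order).
df <- data.frame(
  G1 = factor(rep(rep(c("A", "B", "C"), each = 9), times = 2)),
  G2 = rep(rep(c("D20", "D40", "D60"), each = 3), times = 6),
  G3 = rep(c("TAN", "BBC"), each = 27),
  value = c(1, 9, 10, 8, 3, 9, 5, 5, 10, 7, 8, 10, 8, 3, 7, 1, 10, 1,
            5, 9, 4, 6, 3, 8, 9, 10, 4,
            9, 3, 7, 10, 7, 4, 2, 3, 8, 8, 1, 5, 6, 2, 6, 9, 2, 10,
            3, 1, 4, 10, 8, 3, 5, 3, 1)
)
df$Group <- paste(df$G2, df$G3)
L1 <- split(df, df$Group)

# Pre-allocate named lists so each result is stored under its group name.
aov_fits <- vector("list", length(L1))
names(aov_fits) <- names(L1)
tukey_results <- aov_fits

for (nm in names(L1)) {
  # Fit the one-way ANOVA on this group's table, then run Tukey on the fit.
  aov_fits[[nm]] <- aov(value ~ G1, data = L1[[nm]])
  tukey_results[[nm]] <- TukeyHSD(aov_fits[[nm]])
}

# Inspect one group, e.g. the pairwise comparisons for "D20 BBC".
tukey_results[["D20 BBC"]]$G1
```

Keeping the aov fits in their own list means summary(aov_fits[["D20 BBC"]]) is still available afterwards; the lapply one-liner discards the intermediate fits.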

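There is actually a function, by, that applies a function to a data frame split by a grouping variable, so you can skip the explicit split and do this: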

by(df, df$Group, function(x) with(x, TukeyHSD(aov(value ~ G1))))
# diff        lwr      upr     p adj
# B-A -1.666667  -8.752543 5.419210 0.7604243
# C-A -3.666667 -10.752543 3.419210 0.3205994
# C-B -2.000000  -9.085876 5.085876 0.6792890
# -------------------------------------------------------------------------------- 
#   diff        lwr       upr     p adj
# B-A  1.6666667  -6.725769 10.059102 0.8205065
# C-A -0.6666667  -9.059102  7.725769 0.9679553
# C-B -2.3333333 -10.725769  6.059102 0.6866510
# -------------------------------------------------------------------------------- 
#   diff       lwr      upr     p adj
# B-A -2.333333e+00 -9.895291 5.228624 0.6334637
# C-A  1.776357e-15 -7.561958 7.561958 1.0000000
# C-B  2.333333e+00 -5.228624 9.895291 0.6334637
# -------------------------------------------------------------------------------- 
#   diff       lwr      upr     p adj
# B-A -0.6666667 -7.703163 6.369830 0.9548296
# C-A -1.0000000 -8.036497 6.036497 0.9021379
# C-B -0.3333333 -7.369830 6.703163 0.9884428
# -------------------------------------------------------------------------------- 
#   diff        lwr       upr     p adj
# B-A  2.666667  -5.684119 11.017452 0.6148213
# C-A -1.333333  -9.684119  7.017452 0.8786205
# C-B -4.000000 -12.350785  4.350785 0.3681421
# -------------------------------------------------------------------------------- 
#   diff        lwr       upr     p adj
# B-A -2.666667 -12.441010  7.107677 0.6957155
# C-A  1.000000  -8.774344 10.774344 0.9475956
# C-B  3.666667  -6.107677 13.441010 0.5210071
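If one table is easier to work with than six separate printouts, the per-group Tukey matrices can be stacked into a single data frame in base R. This is a sketch under the same assumptions as the question's data. The names tukey_tables and all_tukey are illustrative, and G1 is stored as a factor here as an assumption.

```r
# Rebuild the question's data compactly (same 54 rows, same order).
df <- data.frame(
  G1 = factor(rep(rep(c("A", "B", "C"), each = 9), times = 2)),
  G2 = rep(rep(c("D20", "D40", "D60"), each = 3), times = 6),
  G3 = rep(c("TAN", "BBC"), each = 27),
  value = c(1, 9, 10, 8, 3, 9, 5, 5, 10, 7, 8, 10, 8, 3, 7, 1, 10, 1,
            5, 9, 4, 6, 3, 8, 9, 10, 4,
            9, 3, 7, 10, 7, 4, 2, 3, 8, 8, 1, 5, 6, 2, 6, 9, 2, 10,
            3, 1, 4, 10, 8, 3, 5, 3, 1)
)
L1 <- split(df, paste(df$G2, df$G3))

# For each group, turn the Tukey matrix into a data frame and label it.
tukey_tables <- lapply(names(L1), function(nm) {
  res <- TukeyHSD(aov(value ~ G1, data = L1[[nm]]))$G1
  cbind(Group = nm, comparison = rownames(res), as.data.frame(res))
})

# Stack the six tables and drop the duplicated row names.
all_tukey <- do.call(rbind, tukey_tables)
rownames(all_tukey) <- NULL
all_tukey
```

Each row then carries its group label, so the result can be filtered or sorted (for example by the adjusted p-value column) like any other data frame.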

A tidyverse approach could be:

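library(dplyr)  # group_split() and %>%
library(purrr)  # map()

# Split by Group (dropping the grouping column), then fit and test each piece.
# Note: newer dplyr releases spell this argument .keep instead of keep.
df %>%
 group_split(Group, keep = FALSE) %>%
 map(~ TukeyHSD(aov(value ~ G1, data = .)))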

[[1]]
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = value ~ G1, data = .)

$G1
         diff        lwr      upr     p adj
B-A -1.666667  -8.752543 5.419210 0.7604243
C-A -3.666667 -10.752543 3.419210 0.3205994
C-B -2.000000  -9.085876 5.085876 0.6792890

Adding tidy() from broom turns each result into a tibble:

library(dplyr)
library(purrr)
library(broom)  # tidy()

df %>%
 group_split(Group, keep = FALSE) %>%
 map(~ TukeyHSD(aov(value ~ G1, data = .))) %>%
 map(tidy)

[[1]]
# A tibble: 3 x 6
  term  comparison estimate conf.low conf.high adj.p.value
  <chr> <chr>         <dbl>    <dbl>     <dbl>       <dbl>
1 G1    B-A           -1.67    -8.75      5.42       0.760
2 G1    C-A           -3.67   -10.8       3.42       0.321
3 G1    C-B           -2.00    -9.09      5.09       0.679