Selected output for Tukey's HSD in two-way ANOVA in R
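I have a large dataset with several variables. I need to run a two-way ANOVA followed by Tukey's HSD post hoc pairwise multiple comparisons.
The first 25 rows of my data look like this: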
> head(my_data2, 25 )
CellType variable value
1 Cell1 W1 18.780294
2 Cell1 W1 13.932397
3 Cell1 W1 20.877093
4 Cell1 W1 9.291295
5 Cell1 W1 10.939570
6 Cell1 W1 12.236713
7 Cell1 W1 13.810722
8 Cell1 W1 23.944473
9 Cell1 W1 17.355429
10 Cell1 W1 18.248215
11 Cell2 W1 17.988200
12 Cell2 W1 15.427909
13 Cell2 W1 21.839687
14 Cell2 W1 22.322325
15 Cell2 W1 12.535762
16 Cell2 W1 12.743278
17 Cell2 W1 15.007214
18 Cell2 W1 12.054787
19 Cell2 W1 15.639977
20 Cell2 W1 16.006960
21 Cell3 W1 17.452199
22 Cell3 W1 23.280391
23 Cell3 W1 7.902728
24 Cell3 W1 8.353992
25 Cell3 W1 24.360250
I run the ANOVA:
#ANOVA
my_data2$CellType <- as.factor(my_data2$CellType)
my_ANOVA = aov(value ~ CellType + variable + CellType:variable, data = my_data2)
summary(my_ANOVA)
Then the post hoc test:
my_posthoc =TukeyHSD(my_ANOVA, which = "CellType:variable")
my_posthoc
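So far everything works, but the post hoc output includes every pairwise comparison, which produces a huge output of more than 2200 rows.
For example, my output looks like this: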
> my_posthoc
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = value ~ CellType + variable + CellType:variable, data = my_data2)
$`CellType:variable`
diff lwr upr p adj
Cell2:W1-Cell1:W1 0.21499 -29.46177884 29.8917588 1.0000000
Cell3:W1-Cell1:W1 0.88234 -28.79442884 30.5591088 1.0000000
Cell4:W1-Cell1:W1 1.24301 -28.43375884 30.9197788 1.0000000
Cell5:W1-Cell1:W1 1.61684 -28.05992884 31.2936088 1.0000000
Cell6:W1-Cell1:W1 0.65009 -29.02667884 30.3268588 1.0000000
Cell7:W1-Cell1:W1 1.08223 -28.59453884 30.7589988 1.0000000
Cell1:W2-Cell1:W1 9.00094 -20.67582884 38.6777088 1.0000000
Cell2:W2-Cell1:W1 27.62765 -2.04911884 57.3044188 0.1249342
Cell3:W2-Cell1:W1 29.40077 -0.27599884 59.0775388 0.0570151
Cell4:W2-Cell1:W1 28.84731 -0.82945884 58.5240788 0.0736530
Cell5:W2-Cell1:W1 42.51407 12.83730116 72.1908388 0.0000144
Cell6:W2-Cell1:W1 30.78610 1.10933116 60.4628688 0.0288235
Cell7:W2-Cell1:W1 27.62966 -2.04710884 57.3064288 0.1248307
Cell1:W3-Cell1:W1 20.95847 -8.71829884 50.6352388 0.7816085
Cell2:W3-Cell1:W1 42.50116 12.82439116 72.1779288 0.0000146
Cell3:W3-Cell1:W1 47.07037 17.39360116 76.7471388 0.0000004
Cell4:W3-Cell1:W1 47.26760 17.59083116 76.9443688 0.0000003
Cell5:W3-Cell1:W1 64.08026 34.40349116 93.7570288 0.0000000
Cell6:W3-Cell1:W1 53.90284 24.22607116 83.5796088 0.0000000
and it ends with:
[ reached getOption("max.print") -- omitted 2290 rows ]
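However, I am only interested in comparisons within each variable, not between variables. Taking the output above as an example, I only need rows such as Cell1:W1-Cell2:W1, where both groups belong to the same variable, W1, or Cell6:W3-Cell1:W3. I am not interested in rows such as Cell6:W3-Cell6:W1.
How can I specify this?
Thanks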
I took the simple, straightforward route: split each term (the row name) into four parts and filter on them.
library(dplyr); library(tibble); library(purrr); library(tidyr)  # or simply library(tidyverse)
my_posthoc2 <- my_posthoc %>%
  pluck("CellType:variable") %>%        # extract the matrix for the interaction term
  as_tibble(rownames = "Term") %>%      # convert to a tibble, keeping row names as a column
  separate(Term,                        # split each term on "-" and ":"
           into = c("LL", "LR", "RL", "RR"),
           sep = "-|:",
           remove = FALSE)
my_posthoc2 %>%
  filter(LR == "W1", RR == "W1")        # keep comparisons within W1, e.g. Cell2:W1-Cell1:W1
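To keep the within-variable rows for every variable at once rather than only W1, the same idea can be sketched in base R. This is a minimal sketch, not a definitive implementation: the data are simulated stand-ins for my_data2 (3 cell types and 3 variables, values invented), the names tuk, var_left, var_right and within_var are made up for illustration, and it assumes that no factor level contains "-" or ":".

```r
# Simulated stand-in for my_data2 (hypothetical values, 3 cell types x 3 variables)
set.seed(1)
my_data2 <- expand.grid(rep = 1:10,
                        CellType = paste0("Cell", 1:3),
                        variable = paste0("W", 1:3))
my_data2$value <- rnorm(nrow(my_data2), mean = 15, sd = 5)

# Same model and post hoc call as in the question
my_ANOVA <- aov(value ~ CellType + variable + CellType:variable, data = my_data2)
tuk <- TukeyHSD(my_ANOVA, which = "CellType:variable")[["CellType:variable"]]

# Row names look like "Cell2:W1-Cell1:W1": split on "-" into the two groups,
# then take the part after ":" (the variable level) on each side.
sides <- strsplit(rownames(tuk), "-", fixed = TRUE)
var_left  <- sapply(sides, function(s) strsplit(s[1], ":", fixed = TRUE)[[1]][2])
var_right <- sapply(sides, function(s) strsplit(s[2], ":", fixed = TRUE)[[1]][2])

# Keep only the comparisons where both groups share the same variable level
within_var <- tuk[var_left == var_right, , drop = FALSE]
within_var
```

Note that filtering only hides rows: the `p adj` values and confidence limits were computed by TukeyHSD for the full family of comparisons and are not recalculated for the smaller subset.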
Since you specified "I'm only interested in comparison within each variable but not between them", there is no need to include the interaction term CellType:variable.
You can rewrite the model as:
my_ANOVA = aov(value ~ CellType + variable, data = my_data2)
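As a rough sketch of what the post hoc step might look like with that additive model, the call below asks TukeyHSD for the CellType term only. The data are again simulated stand-ins for my_data2 (values invented), and the names my_ANOVA_add and tuk_cell are hypothetical.

```r
# Simulated stand-in for my_data2 (hypothetical values, 3 cell types x 3 variables)
set.seed(1)
my_data2 <- expand.grid(rep = 1:10,
                        CellType = paste0("Cell", 1:3),
                        variable = paste0("W", 1:3))
my_data2$value <- rnorm(nrow(my_data2), mean = 15, sd = 5)

# Additive model without the interaction term
my_ANOVA_add <- aov(value ~ CellType + variable, data = my_data2)

# Tukey's HSD on the CellType main effect only
tuk_cell <- TukeyHSD(my_ANOVA_add, which = "CellType")
tuk_cell$CellType   # one row per pair of cell types
```

With this model the table has one row per pair of cell types, and each difference is a main-effect contrast averaged over the levels of variable rather than a separate comparison inside W1, W2 and so on.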