R: Run multiple post hoc tests at once using the emmeans package
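I am working with a dataset that contains several different types of proteins as columns. It looks something like the example below (this is simplified; the original dataset contains over 100 types of proteins). I want to see whether the concentration of each protein differs by treatment when a random effect (= id) is taken into account. I managed to run multiple repeated-measures ANOVAs at once, but I would also like to run pairwise comparisons by treatment for all proteins. The first thing that came to mind was the emmeans package, but I am having trouble coding it.
# Load packages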
library(tidyverse)
library(emmeans)
#Create a data set
set.seed(1)
id <- rep(c("1","2","3","4","5","6"),3)
Treatment <- c(rep(c("A"), 6), rep(c("B"), 6),rep(c("C"), 6))
Protein1 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
Protein2 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
Protein3 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
DF <- data.frame(id, Treatment, Protein1, Protein2, Protein3) %>%
  mutate(id = factor(id),
         Treatment = factor(Treatment, levels = c("A","B","C")))
#First, I tried to run multiple anova, by using lapply
responseList <- names(DF)[c(3:5)]
modelList <- lapply(responseList, function(resp) {
  mF <- formula(paste(resp, " ~ Treatment + Error(id/Treatment)"))
  aov(mF, data = DF)
})
lapply(modelList, summary)
#Pairwise comparison using emmeans. This did not work
wt_emm <- emmeans(modelList, "Treatment")
> wt_emm <- emmeans(modelList, "Treatment")
Error in ref_grid(object, ...) : Can't handle an object of class “list”
Use help("models", package = "emmeans") for information on supported models.
So I tried a different approach:
anova2 <- aov(cbind(Protein1,Protein2,Protein3)~ Treatment +Error(id/Treatment), data = DF)
summary(anova2)
#Pairwise comparison using emmeans.
#I got only result for the whole dataset, instead of by different types of protein.
wt_emm2 <- emmeans(anova2, "Treatment")
pairs(wt_emm2)
> pairs(wt_emm2)
contrast estimate SE df t.ratio p.value
A - B -1.704 1.05 10 -1.630 0.2782
A - C 0.865 1.05 10 0.827 0.6955
B - C 2.569 1.05 10 2.458 0.0793
I don't understand why R still gives me only one result even though I used "cbind(Protein1, Protein2, Protein3)" in the ANOVA model. This is what I was hoping to get:
> Protein1
contrast
A - B
A - C
B - C
> Protein2
contrast
A - B
A - C
B - C
> Protein3
contrast
A - B
A - C
B - C
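How should I code this, or should I try a different package/function?
I have no trouble running one protein at a time. However, since I have more than 100 proteins to run, coding them one by one is very time-consuming.
Any suggestions would be appreciated. Thank you!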
Loop over the column names.
responseList <- names(DF)[c(3:5)]
for(n in responseList) {
  anova2 <- aov(get(n) ~ Treatment +Error(id/Treatment), data = DF)
  summary(anova2)
  wt_emm2 <- emmeans(anova2, "Treatment")
  print(pairs(wt_emm2))
}
This returns:
Note: re-fitting model with sum-to-zero contrasts
Note: Use 'contrast(regrid(object), ...)' to obtain contrasts of back-transformed estimates
contrast estimate SE df t.ratio p.value
A - B -1.41 1.26 10 -1.122 0.5229
A - C 1.31 1.26 10 1.039 0.5705
B - C 2.72 1.26 10 2.161 0.1269
Note: contrasts are still on the get scale
P value adjustment: tukey method for comparing a family of 3 estimates
Note: re-fitting model with sum-to-zero contrasts
Note: Use 'contrast(regrid(object), ...)' to obtain contrasts of back-transformed estimates
contrast estimate SE df t.ratio p.value
A - B -2.16 1.37 10 -1.577 0.2991
A - C 1.19 1.37 10 0.867 0.6720
B - C 3.35 1.37 10 2.444 0.0810
Note: contrasts are still on the get scale
P value adjustment: tukey method for comparing a family of 3 estimates
Note: re-fitting model with sum-to-zero contrasts
Note: Use 'contrast(regrid(object), ...)' to obtain contrasts of back-transformed estimates
contrast estimate SE df t.ratio p.value
A - B -1.87 1.19 10 -1.578 0.2988
A - C 1.28 1.19 10 1.077 0.5485
B - C 3.15 1.19 10 2.655 0.0575
Note: contrasts are still on the get scale
P value adjustment: tukey method for comparing a family of 3 estimates
If you want the output as a list:
responseList <- names(DF)[c(3:5)]
output <- list()
for(n in responseList) {
  anova2 <- aov(get(n) ~ Treatment +Error(id/Treatment), data = DF)
  summary(anova2)
  wt_emm2 <- emmeans(anova2, "Treatment")
  output[[n]] <- pairs(wt_emm2)
}
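With more than 100 proteins, a single table may be easier to filter and sort than a list. The following is only a sketch of one way to stack the results: it uses toy data with the same layout as DF in the question (the values are arbitrary), and the names all_pairs and Protein are illustrative choices, not part of the original code.

```r
library(emmeans)

# Toy data with the same layout as DF in the question (values are arbitrary)
set.seed(1)
DF <- data.frame(
  id        = factor(rep(1:6, times = 3)),
  Treatment = factor(rep(c("A", "B", "C"), each = 6)),
  Protein1  = rnorm(18),
  Protein2  = rnorm(18),
  Protein3  = rnorm(18)
)
responseList <- names(DF)[3:5]

# Same loop as above: one set of pairwise comparisons per protein
output <- list()
for (n in responseList) {
  fit <- aov(get(n) ~ Treatment + Error(id/Treatment), data = DF)
  output[[n]] <- pairs(emmeans(fit, "Treatment"))
}

# Stack the per-protein tables, tagging each row with its protein name
all_pairs <- do.call(rbind, lapply(names(output), function(n) {
  cbind(Protein = n, as.data.frame(output[[n]]))
}))
all_pairs
```

Each protein contributes one row per contrast, so the table can then be subset by protein or by contrast in the usual data-frame way.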
Here:
#Pairwise comparison using emmeans. This did not work
wt_emm <- emmeans(modelList, "Treatment")
you need to lapply over the list, just as you did with lapply(modelList, summary).
modelList <- lapply(responseList, function(resp) {
  mF <- formula(paste(resp, " ~ Treatment + Error(id/Treatment)"))
  aov(mF, data = DF)
})
But doing so raises an error:
lapply(modelList, function(x) pairs(emmeans(x, "Treatment")))
Note: re-fitting model with sum-to-zero contrasts
Error in terms(formula, "Error", data = data) : object 'mF' not found
attr(modelList[[1]], 'call')$formula
# mF
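Note that the stored call refers to the formula only by the name mF, a variable that exists only inside the anonymous function. emmeans re-fits the model (see the note above the error), so it apparently needs the original formula and cannot find mF. You can add the formula to the call: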
modelList <- lapply(responseList, function(resp) {
  mF <- formula(paste(resp, " ~ Treatment + Error(id/Treatment)"))
  av <- aov(mF, data = DF)
  attr(av, 'call')$formula <- mF
  av
})
lapply(modelList, function(x) pairs(emmeans(x, "Treatment")))
# [[1]]
# contrast estimate SE df t.ratio p.value
# A - B -1.89 1.26 10 -1.501 0.3311
# A - C 1.08 1.26 10 0.854 0.6795
# B - C 2.97 1.26 10 2.356 0.0934
#
# P value adjustment: tukey method for comparing a family of 3 estimates
#
# [[2]]
# contrast estimate SE df t.ratio p.value
# A - B -1.44 1.12 10 -1.282 0.4361
# A - C 1.29 1.12 10 1.148 0.5082
# B - C 2.73 1.12 10 2.430 0.0829
#
# P value adjustment: tukey method for comparing a family of 3 estimates
#
# [[3]]
# contrast estimate SE df t.ratio p.value
# A - B -1.58 1.15 10 -1.374 0.3897
# A - C 1.27 1.15 10 1.106 0.5321
# B - C 2.85 1.15 10 2.480 0.0765
#
# P value adjustment: tukey method for comparing a family of 3 estimates
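A possible alternative to patching the call afterwards is to build each model with do.call(), so that the formula itself, rather than the symbol mF, is recorded in the call from the start. This is a sketch under assumptions: the toy data only mimic the layout of DF in the question, and modelList2 is a name made up for illustration. It also assumes DF lives in the global environment, as in the question, since the call still refers to the data by name.

```r
# Toy data with the same layout as DF in the question (values are arbitrary)
set.seed(1)
DF <- data.frame(
  id        = factor(rep(1:6, times = 3)),
  Treatment = factor(rep(c("A", "B", "C"), each = 6)),
  Protein1  = rnorm(18),
  Protein2  = rnorm(18),
  Protein3  = rnorm(18)
)
responseList <- names(DF)[3:5]

# sapply(..., simplify = FALSE) returns a list named after the proteins
modelList2 <- sapply(responseList, function(resp) {
  mF <- as.formula(paste(resp, "~ Treatment + Error(id/Treatment)"))
  # do.call() passes the value of mF, so the formula itself ends up in the
  # call; quote(DF) keeps the data argument as the short name DF
  do.call("aov", list(formula = mF, data = quote(DF)))
}, simplify = FALSE)

# The stored call now holds the full formula instead of the symbol mF
attr(modelList2[["Protein1"]], "call")$formula
```

Because the list is named, applying the same lapply(..., function(x) pairs(emmeans(x, "Treatment"))) call to modelList2 should label each result with its protein name instead of [[1]], [[2]], [[3]].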