emmeans

Question

我正在使用 emmeans 对对照组进行自定义比较。如果我只对比较一个因素感兴趣，那么 trt.vs.ctrl 方法非常适合我，但是当我将比较设置为更复杂时（即控制组由特定的2+ 个变量的组合）。

示例代码如下。假设使用 pigs 数据，我想将所有饮食与低百分比鱼类饮食进行比较。请注意，在 nd 数据框中，"fish" 只有 9% 与之关联。但是，当我运行 emmeans 时，该函数不会在嵌套上拾取，虽然控制是正确的，但处理组还包括鱼和百分比的各种值。这意味着 p-value 调整是错误的。

所以我能想到的两种方法：

在这种情况下，如何让 emmeans 接受嵌套，或者
如何手动进行 dunnettx 调整（=我可以使用调整 "none"，然后拉出我真正想要的测试，然后自己调整 p 值？）。

    library(emmeans)
    library(dplyr)

    pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
    nd <- expand.grid(source = levels(pigs$source), percent = unique(pigs$percent)) %>%
        filter(percent == 9 | source != "fish")

    emmeans(pigs.lm, trt.vs.ctrl ~ source + percent, 
        data = nd, covnest = TRUE, cov.reduce = FALSE)

感谢您的帮助。

使用 include 的建议非常有效。在这里发布我的代码以防其他人将来遇到同样的问题。

library(emmeans)
library(dplyr)
library(tidyr)

pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
nd <- expand.grid(source = levels(pigs$source), percent =     unique(pigs$percent)) %>%
    filter(percent == 9 | source != "fish")

ems <- emmeans(pigs.lm, trt.vs.ctrl ~ source + percent, 
    data = nd, covnest = TRUE, cov.reduce = FALSE)

# to identify which levels to exclude - in this case, 
# I only want the low-percent fish to remain as the ref level
aux <- as.data.frame(ems[[1]]) %>%
    mutate(ID = 1:n()) %>%
    filter(!grepl("fish", source) | ID == 1)

emmeans(pigs.lm, trt.vs.ctrl ~ source + percent, 
    data = nd, covnest = TRUE, cov.reduce = FALSE, include = aux$ID)

Answer 1

我不是很清楚你想要完成什么，但我不认为过滤数据是解决方案。

如果您的目标是将 source 的边际均值与（鱼，9%）组合进行比较，您可以通过构造两组 emmeans，然后进行子集化和组合来实现：

emm1 = emmeans(pigs.lm, "source")
emm2 = emmeans(pigs.lm, ~source*percent)
emm3 = emm2[1] + emm1      # or rbind(emm2[1], emm1)

然后你得到

> confint(emm3, adjust ="none")
 source percent emmean     SE df lower.CL upper.CL
 fish   9         3.22 0.0536 23     3.11     3.33
 fish   .         3.39 0.0367 23     3.32     3.47
 soy    .         3.67 0.0374 23     3.59     3.74
 skim   .         3.80 0.0394 23     3.72     3.88

Results are averaged over some or all of the levels of: percent 
Results are given on the log (not the response) scale. 
Confidence level used: 0.95 

> contrast(emm3, "trt.vs.ctrl1")
 contrast        estimate     SE df t.ratio p.value
 fish,. - fish,9    0.174 0.0366 23 4.761   0.0002 
 soy,. - fish,9     0.447 0.0678 23 6.595   <.0001 
 skim,. - fish,9    0.576 0.0696 23 8.286   <.0001 

Results are averaged over some or all of the levels of: percent 
Results are given on the log (not the response) scale. 
P value adjustment: dunnettx method for 3 tests

另一种（更乏味、更容易出错）做同样事情的方法是获取因子组合的 EMM，然后使用自定义对比：

> contrast(emm2, list(con1 = c(-3,0,0, 1,0,0, 1,0,0, 1,0,0)/4,
+                     con2 = c(-4,1,0, 0,1,0, 0,1,0, 0,1,0)/4,
+                     con3 = c(-4,0,1, 0,0,1, 0,0,1, 0,0,1)/4),
+          adjust = "mvt")

 contrast estimate     SE df t.ratio p.value
 con1        0.174 0.0366 23  4.761  0.0002 
 con2        0.447 0.0678 23  6.595  <.0001 
 con3        0.576 0.0696 23  8.286  <.0001 

Results are given on the log (not the response) scale. 
P value adjustment: mvt method for 3 tests

（mvt 调整是精确校正，dunnettx 只是一个近似值。它不会默认为 mvt，因为它对于大量测试来说计算量很大。）

在回答问题的最后一部分时，您可以使用exclude（或include）来关注级别的子集；参见 ? pairwise.emmc。

emmeans - 控制与治疗不止一个因素

emmeans - control vs treatment for more than one factor

r

p-value