Estimated marginal means of a MANCOVA in R

Question

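I've built a model that accounts for a covariate. There are two dependent variables ("A", "B"), two independent variables ("C", "D"), and one continuous covariate ("E"). I run a MANCOVA as follows: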

x<-cbind(A,B) #combining dependent variables
y<-cbind(C,D) #combining independent variables
fit<-manova(x~y+E)
summary(fit, test="Pillai")

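This all works perfectly, and I find that the covariate does have an effect on the dependent variables. Because of this, I would like to use the emmeans package to account for the covariate with estimated marginal means. However, when I try to run the following code, I get this error: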

library(emmeans)
emmeans(fit,~y+E)

>Error in eval(expr, envir, enclos) : object 'spc.l$Ghopper.Start..g.' not found
>Error in ref_grid(object, ...) : Perhaps a 'data' or 'params' argument is needed

Here is my data:

 structure(list(ï..insect = c(105L, 106L, 107L, 108L, 110L, 112L, 
113L, 114L, 115L, 116L, 117L, 118L, 119L, 120L, 121L, 122L, 123L, 
125L, 126L, 127L, 128L), C = structure(c(1L, 2L, 1L, 2L, 2L, 
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L
), .Label = c("Pair A 7p:35c-35p:7c", "Pair B 7p:35c-28p:14c"
), class = "factor"), D = structure(c(1L, 1L, 2L, 2L, 1L, 2L, 
1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L), .Label = c("F", 
"M"), class = "factor"), E = c(0.357, 0.259, 0.128, 0.104, 0.248, 
0.111, 0.218, 0.213, 0.13, 0.123, 0.335, 0.22, 0.247, 0.295, 
0.297, 0.219, 0.132, 0.194, 0.207, 0.266, 0.234), A = c(0.025333333, 
0.041666665, 0.043833332, 0.046333331, 0.108499995, 0.051999997, 
0.101833329, 0.06083333, 0.059499998, 0.056166664, 0.017833333, 
0.053666664, 0.066333331, 0.025499998, 0.073666664, 0.149333324, 
0.044666665, 0.047499998, 0.051833331, 0.020499999, 0.062499997
), B = c(0.050666667, 0.020333321, 0.023166668, 0.029666645, 
0.032499992, 0.028999981, 0.029166671, 0.024166656, 0.025500002, 
0.020833325, 0.021166667, 0.038333304, 0.023666669, 0.022499981, 
0.040333336, 0.121666569, 0.023333335, 0.017500002, 0.01816666, 
0.018500001, 0.024499989)), .Names = c("ï..insect", "C", "D", 
"E", "A", "B"), class = "data.frame", row.names = c(NA, -21L))

I'm sure there is an easy fix for this, but I'm a little lost, and there are very few questions about emmeans posted on stackexchange!

Answer 1

Here is something that works computationally:

R> fit = manova(x ~ cbind(C, D) + E, data = dat)
R> ref_grid(fit)
'emmGrid' object with variables:
    C = 1.5238
    D = 1.2857
    E = 0.21605
    rep.meas = multivariate response levels: A, B
R> emmeans(fit, ~ C + D + E)
       C        D         E     emmean          SE df   lower.CL   upper.CL
 1.52381 1.285714 0.2160476 0.04438095 0.005377854 17 0.03303467 0.05572723

Results are averaged over the levels of: rep.meas 
Confidence level used: 0.95

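I'll say more about these results later. So replacing y with cbind(C, D) in the model call makes it work (computationally). With y used directly, I get an error message: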

R> fit0 = manova(x ~ y+E, data = dat)
R> ref_grid(fit0)
Error in model.matrix(trms, m, contrasts.arg = object$contrasts)[, nm,  : 
  subscript out of bounds

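This differs from the error message shown in the question, and I can only guess that something about the scoping is different. What is happening here is that C and D are actually factors, not numeric predictors, and cbind(C, D) converts them to a numeric matrix with two columns. There is some technical reason for that error that I need to investigate and correct.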

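But the important thing is that neither fit nor fit0 is the model you want to use for post-hoc comparisons, because C and D are, after all, factors. The reference grid and EMMs for fit are based on the average values of the numeric recodings of C and D, and on the average of E. That is why there is only one line of emmeans output.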
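To make the recoding concrete, here is a minimal base-R sketch. It uses only the C and D factor codes copied from the question's dput() output; no model is fitted, and the object names simply mirror those in the question.

# Factor codes copied from the dput() output in the question.
C <- factor(c(1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2),
            labels = c("Pair A 7p:35c-35p:7c", "Pair B 7p:35c-28p:14c"))
D <- factor(c(1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1),
            labels = c("F", "M"))

y <- cbind(C, D)   # factors are silently replaced by their integer codes
is.numeric(y)      # TRUE: the level labels are gone
head(y, 3)         # first rows of the 21 x 2 matrix of codes
colMeans(y)        # C = 1.523810, D = 1.285714

The column means are exactly the values of C and D that appear in the reference grid of fit, which confirms that the grid is built from averaged codes rather than from factor levels.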

What I think is needed is to take the cbind out of the call:

R> fit1 = manova(x ~ C + D + E, data = dat)
R> summary(fit1)
          Df  Pillai approx F num Df den Df  Pr(>F)
C          1 0.07083   0.6098      2     16 0.55559
D          1 0.03150   0.2602      2     16 0.77408
E          1 0.37794   4.8606      2     16 0.02242
Residuals 17                                       
R> ref_grid(fit1)
'emmGrid' object with variables:
    C = Pair A 7p:35c-35p:7c, Pair B 7p:35c-28p:14c
    D = F, M
    E = 0.21605
    rep.meas = multivariate response levels: A, B

Since E is numeric and is reduced to just its mean, you needn't include it in the emmeans call:

R> emmeans(fit1, ~ C + D)
 C                     D     emmean          SE df    lower.CL   upper.CL
 Pair A 7p:35c-35p:7c  F 0.05021337 0.011726789 17  0.02547201 0.07495473
 Pair B 7p:35c-28p:14c F 0.05576090 0.008808415 17  0.03717677 0.07434503
 Pair A 7p:35c-35p:7c  M 0.01962942 0.016299910 17 -0.01476039 0.05401922
 Pair B 7p:35c-28p:14c M 0.02517695 0.019816637 17 -0.01663250 0.06698640

Results are averaged over the levels of: rep.meas 
Confidence level used: 0.95
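As a rough cross-check of what these numbers mean, the sketch below rebuilds the point estimates in base R, without emmeans. It assumes the question's data is stored in a data frame named dat (rebuilt here from the dput() output, leaving out the insect ID column), and it fits the same linear model with lm() and a matrix response instead of manova(); the coefficients are the same. fit_lm, grid, and pred are names introduced only for this illustration.

# Data copied from the dput() output in the question.
dat <- data.frame(
  C = factor(c(1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2),
             labels = c("Pair A 7p:35c-35p:7c", "Pair B 7p:35c-28p:14c")),
  D = factor(c(1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1),
             labels = c("F", "M")),
  E = c(0.357, 0.259, 0.128, 0.104, 0.248, 0.111, 0.218, 0.213, 0.13, 0.123,
        0.335, 0.22, 0.247, 0.295, 0.297, 0.219, 0.132, 0.194, 0.207, 0.266,
        0.234),
  A = c(0.025333333, 0.041666665, 0.043833332, 0.046333331, 0.108499995,
        0.051999997, 0.101833329, 0.06083333, 0.059499998, 0.056166664,
        0.017833333, 0.053666664, 0.066333331, 0.025499998, 0.073666664,
        0.149333324, 0.044666665, 0.047499998, 0.051833331, 0.020499999,
        0.062499997),
  B = c(0.050666667, 0.020333321, 0.023166668, 0.029666645, 0.032499992,
        0.028999981, 0.029166671, 0.024166656, 0.025500002, 0.020833325,
        0.021166667, 0.038333304, 0.023666669, 0.022499981, 0.040333336,
        0.121666569, 0.023333335, 0.017500002, 0.01816666, 0.018500001,
        0.024499989)
)

# Same mean structure as fit1, fitted as a multivariate linear model.
fit_lm <- lm(cbind(A, B) ~ C + D + E, data = dat)

# Reference grid: every C x D combination, with E held at its mean.
grid <- expand.grid(C = levels(dat$C), D = levels(dat$D), E = mean(dat$E))

# Predictions for each response, then the equal-weight average over A and B.
pred <- predict(fit_lm, newdata = grid)
cbind(grid[, c("C", "D")], pred, averaged = rowMeans(pred))

If the data was transcribed correctly, the averaged column should agree with the emmean column above up to rounding, and the A and B columns show the per-response values that the average hides.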

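These results are averaged over the two responses (A and B) of the multivariate dependent variable, which emmeans treats as the levels of a pseudo-factor named rep.meas. You may want to look at them separately, or average over the levels of some other factors instead. Since neither C nor D is significant, I will just obtain the EMMs for rep.meas and their comparison: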

R> emmeans(fit1, pairwise ~ rep.meas)
$emmeans
 rep.meas     emmean          SE df   lower.CL   upper.CL
 A        0.04551606 0.008547982 17 0.02748139 0.06355073
 B        0.02987426 0.006885079 17 0.01534801 0.04440050

Results are averaged over the levels of: C, D 
Confidence level used: 0.95 

$contrasts
 contrast   estimate          SE df t.ratio p.value
 A - B    0.01564181 0.005645167 17   2.771  0.0131

Results are averaged over the levels of: C, D

Since E is significant and is a covariate, it may be interesting to see whether the slope differs between the levels of rep.meas:

R> emtrends(fit1, pairwise ~ rep.meas, var = "E")
$emtrends
 rep.meas    E.trend        SE df   lower.CL   upper.CL
 A        -0.3472133 0.1737591 17 -0.7138129 0.01938632
 B         0.0213882 0.1399564 17 -0.2738940 0.31667042

Results are averaged over the levels of: C, D 
Confidence level used: 0.95 

$contrasts
 contrast   estimate        SE df t.ratio p.value
 A - B    -0.3686015 0.1147521 17  -3.212  0.0051

Results are averaged over the levels of: C, D
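Because the model has no interaction involving E, each response's E.trend should simply be the coefficient of E in that response's regression. The base-R sketch below illustrates this under the same assumptions as the rest of the answer: dat is rebuilt from the question's dput() output, and lm() with a matrix response stands in for manova(). The names fit_lm and slopes are introduced only for this illustration.

# Data copied from the dput() output in the question.
dat <- data.frame(
  C = factor(c(1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2),
             labels = c("Pair A 7p:35c-35p:7c", "Pair B 7p:35c-28p:14c")),
  D = factor(c(1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1),
             labels = c("F", "M")),
  E = c(0.357, 0.259, 0.128, 0.104, 0.248, 0.111, 0.218, 0.213, 0.13, 0.123,
        0.335, 0.22, 0.247, 0.295, 0.297, 0.219, 0.132, 0.194, 0.207, 0.266,
        0.234),
  A = c(0.025333333, 0.041666665, 0.043833332, 0.046333331, 0.108499995,
        0.051999997, 0.101833329, 0.06083333, 0.059499998, 0.056166664,
        0.017833333, 0.053666664, 0.066333331, 0.025499998, 0.073666664,
        0.149333324, 0.044666665, 0.047499998, 0.051833331, 0.020499999,
        0.062499997),
  B = c(0.050666667, 0.020333321, 0.023166668, 0.029666645, 0.032499992,
        0.028999981, 0.029166671, 0.024166656, 0.025500002, 0.020833325,
        0.021166667, 0.038333304, 0.023666669, 0.022499981, 0.040333336,
        0.121666569, 0.023333335, 0.017500002, 0.01816666, 0.018500001,
        0.024499989)
)

fit_lm <- lm(cbind(A, B) ~ C + D + E, data = dat)

# One slope on E per response: the "E" row of the coefficient matrix.
slopes <- coef(fit_lm)["E", ]
slopes

# Difference in slopes, corresponding to the A - B contrast.
slopes[["A"]] - slopes[["B"]]

This gives only the point estimates; the standard errors and the test of the A - B contrast still come from emtrends.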

Also, try this to get a nice plot of the predictions (results not shown):

emmip(fit1, rep.meas ~ E|C*D, at=list(E = c(.1,.35)))
