Estimated marginal means of a MANCOVA in R

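I built a model that accounts for a covariate. There are two dependent variables ("A", "B"), two independent variables ("C", "D"), and one continuous covariate ("E"). I ran a MANCOVA as follows: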

x<-cbind(A,B) #combining dependent variables
y<-cbind(C,D) #combining independent variables
fit<-manova(x~y+E)
summary(fit, test="Pillai")

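This all works fine, and I found that the covariate has an effect on the dependent variables. I would therefore like to use the emmeans package to get estimated marginal means that account for the covariate. However, when I try to run the following code, I get this error: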

library(emmeans)
emmeans(fit,~y+E)

Error in eval(expr, envir, enclos) : object 'spc.l$Ghopper.Start..g.' not found
Error in ref_grid(object, ...) : Perhaps a 'data' or 'params' argument is needed

Here is my data:

 structure(list(ï..insect = c(105L, 106L, 107L, 108L, 110L, 112L, 
113L, 114L, 115L, 116L, 117L, 118L, 119L, 120L, 121L, 122L, 123L, 
125L, 126L, 127L, 128L), C = structure(c(1L, 2L, 1L, 2L, 2L, 
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L
), .Label = c("Pair A 7p:35c-35p:7c", "Pair B 7p:35c-28p:14c"
), class = "factor"), D = structure(c(1L, 1L, 2L, 2L, 1L, 2L, 
1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L), .Label = c("F", 
"M"), class = "factor"), E = c(0.357, 0.259, 0.128, 0.104, 0.248, 
0.111, 0.218, 0.213, 0.13, 0.123, 0.335, 0.22, 0.247, 0.295, 
0.297, 0.219, 0.132, 0.194, 0.207, 0.266, 0.234), A = c(0.025333333, 
0.041666665, 0.043833332, 0.046333331, 0.108499995, 0.051999997, 
0.101833329, 0.06083333, 0.059499998, 0.056166664, 0.017833333, 
0.053666664, 0.066333331, 0.025499998, 0.073666664, 0.149333324, 
0.044666665, 0.047499998, 0.051833331, 0.020499999, 0.062499997
), B = c(0.050666667, 0.020333321, 0.023166668, 0.029666645, 
0.032499992, 0.028999981, 0.029166671, 0.024166656, 0.025500002, 
0.020833325, 0.021166667, 0.038333304, 0.023666669, 0.022499981, 
0.040333336, 0.121666569, 0.023333335, 0.017500002, 0.01816666, 
0.018500001, 0.024499989)), .Names = c("ï..insect", "C", "D", 
"E", "A", "B"), class = "data.frame", row.names = c(NA, -21L))

I'm sure there is a simple fix for this, but I'm a bit lost, and there are very few questions about emmeans posted on StackExchange!

Here is something that works computationally:

R> fit = manova(x ~ cbind(C, D) + E, data = dat)
R> ref_grid(fit)
'emmGrid' object with variables:
    C = 1.5238
    D = 1.2857
    E = 0.21605
    rep.meas = multivariate response levels: A, B
R> emmeans(fit, ~ C + D + E)
       C        D         E     emmean          SE df   lower.CL   upper.CL
 1.52381 1.285714 0.2160476 0.04438095 0.005377854 17 0.03303467 0.05572723

Results are averaged over the levels of: rep.meas 
Confidence level used: 0.95

I'll say more about these results below. So replacing y with cbind(C, D) in the model call makes it work (computationally). Using y directly, I get an error message:

R> fit0 = manova(x ~ y+E, data = dat)
R> ref_grid(fit0)
Error in model.matrix(trms, m, contrasts.arg = object$contrasts)[, nm,  : 
  subscript out of bounds

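This is different from the error message shown in the OP, and I can only guess that the scoping is different. But what is happening here is that C and D are actually factors, not numeric predictors, and cbind(C, D) converts them into a numeric matrix with two columns. There are some technical reasons for that error that I need to investigate and correct.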

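But the important thing is that neither fit nor fit0 is the model you want to use for post-hoc comparisons because, after all, C and D are factors. The reference grid and EMMs for fit are based on the mean values of the numeric recodings of C and D, together with the mean of E. That is why there is only one row of emmeans output.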
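The recoding can be checked with base R alone. This is a minimal sketch: the factor codes are copied from the posted dput output, and the object names are only for illustration.

```r
# C and D rebuilt from the integer codes in the posted data
C <- factor(c(1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2),
            labels = c("Pair A 7p:35c-35p:7c", "Pair B 7p:35c-28p:14c"))
D <- factor(c(1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1),
            labels = c("F", "M"))

y <- cbind(C, D)   # the factors are replaced by their integer codes
is.numeric(y)      # TRUE: the level labels are gone
head(y, 3)         # first rows of the two-column numeric matrix
colMeans(y)        # C = 32/21 (about 1.5238), D = 27/21 (about 1.2857)
```

The two column means are the C and D values that appear in the reference grid for fit, which is why that grid has a single point instead of one row per factor combination.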

I think what is needed is to take the cbind out of the call:

R> fit1 = manova(x ~ C + D + E, data = dat)
R> summary(fit1)
          Df  Pillai approx F num Df den Df  Pr(>F)
C          1 0.07083   0.6098      2     16 0.55559
D          1 0.03150   0.2602      2     16 0.77408
E          1 0.37794   4.8606      2     16 0.02242
Residuals 17                                       
R> ref_grid(fit1)
'emmGrid' object with variables:
    C = Pair A 7p:35c-35p:7c, Pair B 7p:35c-28p:14c
    D = F, M
    E = 0.21605
    rep.meas = multivariate response levels: A, B

Since E is numeric and is reduced to just its mean, you don't need to include it in the emmeans call:

R> emmeans(fit1, ~ C + D)
 C                     D     emmean          SE df    lower.CL   upper.CL
 Pair A 7p:35c-35p:7c  F 0.05021337 0.011726789 17  0.02547201 0.07495473
 Pair B 7p:35c-28p:14c F 0.05576090 0.008808415 17  0.03717677 0.07434503
 Pair A 7p:35c-35p:7c  M 0.01962942 0.016299910 17 -0.01476039 0.05401922
 Pair B 7p:35c-28p:14c M 0.02517695 0.019816637 17 -0.01663250 0.06698640

Results are averaged over the levels of: rep.meas 
Confidence level used: 0.95

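These results are averaged over the two repeated measures in the dependent variable. You may want to separate them, or to average over the levels of some of the other factors. Since neither C nor D is significant, I will just obtain the EMMs of rep.meas and their comparison:
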
R> emmeans(fit1, pairwise ~ rep.meas)
$emmeans
 rep.meas     emmean          SE df   lower.CL   upper.CL
 A        0.04551606 0.008547982 17 0.02748139 0.06355073
 B        0.02987426 0.006885079 17 0.01534801 0.04440050

Results are averaged over the levels of: C, D 
Confidence level used: 0.95 

$contrasts
 contrast   estimate          SE df t.ratio p.value
 A - B    0.01564181 0.005645167 17   2.771  0.0131

Results are averaged over the levels of: C, D
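To see what "averaged over the levels of C, D" means here, the rep.meas EMMs can be approximated by hand in base R: predict both responses at every C x D combination with E held at its mean, then average with equal weights. This is only a sketch. It assumes lm() as a stand-in for manova() (the coefficients are the same), the data are retyped from the posted dput output, and names such as fit_lm, grid and pred are made up for illustration.

```r
# Data retyped from the question (the insect ID column is omitted)
dat <- data.frame(
  C = factor(c(1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2),
             labels = c("Pair A 7p:35c-35p:7c", "Pair B 7p:35c-28p:14c")),
  D = factor(c(1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1),
             labels = c("F", "M")),
  E = c(0.357, 0.259, 0.128, 0.104, 0.248, 0.111, 0.218, 0.213, 0.13, 0.123,
        0.335, 0.22, 0.247, 0.295, 0.297, 0.219, 0.132, 0.194, 0.207, 0.266,
        0.234),
  A = c(0.025333333, 0.041666665, 0.043833332, 0.046333331, 0.108499995,
        0.051999997, 0.101833329, 0.06083333, 0.059499998, 0.056166664,
        0.017833333, 0.053666664, 0.066333331, 0.025499998, 0.073666664,
        0.149333324, 0.044666665, 0.047499998, 0.051833331, 0.020499999,
        0.062499997),
  B = c(0.050666667, 0.020333321, 0.023166668, 0.029666645, 0.032499992,
        0.028999981, 0.029166671, 0.024166656, 0.025500002, 0.020833325,
        0.021166667, 0.038333304, 0.023666669, 0.022499981, 0.040333336,
        0.121666569, 0.023333335, 0.017500002, 0.01816666, 0.018500001,
        0.024499989)
)

# Multivariate linear model with the same right-hand side as fit1
fit_lm <- lm(cbind(A, B) ~ C + D + E, data = dat)

# Reference grid: every C x D combination, with E fixed at its mean
grid <- expand.grid(C = levels(dat$C), D = levels(dat$D), E = mean(dat$E))
pred <- predict(fit_lm, newdata = grid)   # 4 x 2 matrix, columns A and B

colMeans(pred)   # equal-weight average over the four C x D cells
```

If emmeans uses its default equal weighting, the two column means should agree with the emmean column for rep.meas A and B above. The standard errors need the coefficient covariance matrix and are not reproduced by this sketch.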

Since E is significant and is a covariate, it may be interesting to see whether there are different slopes for each level of rep.meas:

R> emtrends(fit1, pairwise ~ rep.meas, var = "E")
$emtrends
 rep.meas    E.trend        SE df   lower.CL   upper.CL
 A        -0.3472133 0.1737591 17 -0.7138129 0.01938632
 B         0.0213882 0.1399564 17 -0.2738940 0.31667042

Results are averaged over the levels of: C, D 
Confidence level used: 0.95 

$contrasts
 contrast   estimate        SE df t.ratio p.value
 A - B    -0.3686015 0.1147521 17  -3.212  0.0051

Results are averaged over the levels of: C, D
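Because the model has no interactions involving E, the trend for each response is simply that response's regression coefficient on E. The base-R sketch below makes this concrete. It assumes lm() in place of manova() (same coefficients), retypes the data from the posted dput output, and uses illustrative names (fit_lm, slopes).

```r
# Data retyped from the question (the insect ID column is omitted)
dat <- data.frame(
  C = factor(c(1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2),
             labels = c("Pair A 7p:35c-35p:7c", "Pair B 7p:35c-28p:14c")),
  D = factor(c(1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1),
             labels = c("F", "M")),
  E = c(0.357, 0.259, 0.128, 0.104, 0.248, 0.111, 0.218, 0.213, 0.13, 0.123,
        0.335, 0.22, 0.247, 0.295, 0.297, 0.219, 0.132, 0.194, 0.207, 0.266,
        0.234),
  A = c(0.025333333, 0.041666665, 0.043833332, 0.046333331, 0.108499995,
        0.051999997, 0.101833329, 0.06083333, 0.059499998, 0.056166664,
        0.017833333, 0.053666664, 0.066333331, 0.025499998, 0.073666664,
        0.149333324, 0.044666665, 0.047499998, 0.051833331, 0.020499999,
        0.062499997),
  B = c(0.050666667, 0.020333321, 0.023166668, 0.029666645, 0.032499992,
        0.028999981, 0.029166671, 0.024166656, 0.025500002, 0.020833325,
        0.021166667, 0.038333304, 0.023666669, 0.022499981, 0.040333336,
        0.121666569, 0.023333335, 0.017500002, 0.01816666, 0.018500001,
        0.024499989)
)

fit_lm <- lm(cbind(A, B) ~ C + D + E, data = dat)

slopes <- coef(fit_lm)["E", ]   # one E slope per response (A and B)
slopes
slopes["A"] - slopes["B"]       # difference in slopes, as in the A - B contrast
```

The two slopes should correspond to the E.trend column, and their difference to the A - B estimate. Fitting each response separately with lm(A ~ C + D + E) or lm(B ~ C + D + E) gives the same slopes; what the multivariate fit adds is the covariance between responses that the contrast's standard error depends on.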

Also, try this to get a nice plot of the predictions (results not shown):

emmip(fit1, rep.meas ~ E|C*D, at=list(E = c(.1,.35)))
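If you would rather keep the two responses separate than average over them, one option is to condition on rep.meas in the emmeans formula. The sketch below is self-contained: the data are retyped from the posted dput output, the response is written as cbind(A, B) inside the formula (an assumption made so that every variable is found in dat), and emm_by_resp is just an illustrative name. It requires the emmeans package to be installed.

```r
library(emmeans)

# Data retyped from the question (the insect ID column is omitted)
dat <- data.frame(
  C = factor(c(1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2),
             labels = c("Pair A 7p:35c-35p:7c", "Pair B 7p:35c-28p:14c")),
  D = factor(c(1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1),
             labels = c("F", "M")),
  E = c(0.357, 0.259, 0.128, 0.104, 0.248, 0.111, 0.218, 0.213, 0.13, 0.123,
        0.335, 0.22, 0.247, 0.295, 0.297, 0.219, 0.132, 0.194, 0.207, 0.266,
        0.234),
  A = c(0.025333333, 0.041666665, 0.043833332, 0.046333331, 0.108499995,
        0.051999997, 0.101833329, 0.06083333, 0.059499998, 0.056166664,
        0.017833333, 0.053666664, 0.066333331, 0.025499998, 0.073666664,
        0.149333324, 0.044666665, 0.047499998, 0.051833331, 0.020499999,
        0.062499997),
  B = c(0.050666667, 0.020333321, 0.023166668, 0.029666645, 0.032499992,
        0.028999981, 0.029166671, 0.024166656, 0.025500002, 0.020833325,
        0.021166667, 0.038333304, 0.023666669, 0.022499981, 0.040333336,
        0.121666569, 0.023333335, 0.017500002, 0.01816666, 0.018500001,
        0.024499989)
)

# Same model as fit1, with the responses named inside the formula
fit1 <- manova(cbind(A, B) ~ C + D + E, data = dat)

# EMMs for each C x D combination, reported separately for responses A and B
emm_by_resp <- emmeans(fit1, ~ C + D | rep.meas)
emm_by_resp
```

Passing data = dat to the model call matters here: emmeans rebuilds the data from the fitted model, which appears to be where the error in the question came from.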