stepcAIC - Error in eval(predvars, data, env) : object 'Color1' not found

Question

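I want to select the best random-effects structure for my mixed-effects model (fitted with lmer() from lme4). I found the function stepcAIC() in the cAIC4 package, which is supposed to compare models stepwise and select the one with the lowest AIC. The implementation looks simple, but I get an error.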

After fitting my model, I run the following function:

stepcAIC(model_full, direction="backward")

First, it takes a very long time to run. Second, I get an error message. I tried specifying the dataset explicitly:

stepcAIC(model_full, direction="backward", data=data_correct)

I also tried updating R to the latest version and running it again, but that did not help.

Does anyone have experience with this function who can tell me what I am doing wrong?

The error I get is this:

Error in eval(predvars, data, env) : object 'Color1' not found

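I have a variable named "Color", but not "Color1". Perhaps "Color1" is a name taken from the effects table, but why would it take a name from the summary table and search for it in the data frame?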

I also get a warning:

In if (!hasInt(resForThisGroup)) res[[i]] <- res[[i]][-j] : the condition has length > 1 and only the first element will be used

Here is a [link](https://drive.google.com/open?id=1jIJn2rzK3SwpKMfKGDhseYcOxinuwpue) to download data_correct and model_full.

This is how I created model_full:

model_full <- lmer(
  log_RT ~ Polarity + Delay + Truth_value + Type + Color + Order +
    Polarity:Delay + Polarity:Truth_value + Polarity:Order +
    Polarity:Type + Polarity:Color + Delay:Truth_value +
    Truth_value:Delay:Polarity +
    (1 + Polarity * Color + Delay + Delay:Polarity + Truth_value | Subject),
  data = data_correct,
  control = lmerControl(optimizer = "bobyqa"),
  REML = FALSE
)

This is the output for model_full:

Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log_RT ~ Polarity + Delay + Truth_value + Type + Color + Order +  
    Polarity:Delay + Polarity:Truth_value + Polarity:Order +  
    Polarity:Type + Polarity:Color + Delay:Truth_value + Truth_value:Delay:Polarity +  
    (1 + Polarity * Color + Delay + Delay:Polarity + Truth_value |          Subject)
   Data: data_correct
Control: lmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
 16556.6  16896.2  -8235.3  16470.6    19838 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9078 -0.6585 -0.1065  0.5654  6.5045 

Random effects:
 Groups   Name             Variance  Std.Dev. Corr                               
 Subject  (Intercept)      0.0652479 0.25544                                     
          Polarity1        0.0045472 0.06743   0.51                              
          Color1           0.0030415 0.05515   0.15  0.13                        
          Delay1           0.0005240 0.02289   0.22 -0.05 -0.02                  
          Truth_value1     0.0022027 0.04693   0.00  0.48  0.23  0.00            
          Polarity1:Color1 0.0003927 0.01982   0.04 -0.33  0.57 -0.50 -0.12      
          Polarity1:Delay1 0.0001981 0.01408   0.61  0.07  0.06  0.55  0.06 -0.04
 Residual                  0.1304137 0.36113                                     
Number of obs: 19881, groups:  Subject, 38

Fixed effects:
                                Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)                    6.572e+00  4.152e-02  3.800e+01 158.301  < 2e-16 ***
Polarity1                      1.234e-01  1.124e-02  3.797e+01  10.985 2.38e-13 ***
Delay1                        -6.476e-02  4.512e-03  3.817e+01 -14.352  < 2e-16 ***
Truth_value1                   5.266e-02  8.034e-03  3.805e+01   6.556 9.83e-08 ***
Type1                          7.531e-03  2.562e-03  1.962e+04   2.939 0.003292 ** 
Color1                         2.512e-02  9.308e-03  3.756e+01   2.698 0.010379 *  
Order1                        -3.524e-02  8.981e-03  3.794e+01  -3.924 0.000354 ***
Polarity1:Delay1              -2.244e-02  3.433e-03  3.834e+01  -6.538 1.00e-07 ***
Polarity1:Truth_value1        -5.728e-02  2.563e-03  1.963e+04 -22.347  < 2e-16 ***
Polarity1:Order1              -1.250e-02  3.547e-03  3.823e+01  -3.525 0.001119 ** 
Polarity1:Type1               -7.107e-03  2.562e-03  1.962e+04  -2.774 0.005544 ** 
Polarity1:Color1               4.012e-03  4.114e-03  3.790e+01   0.975 0.335639    
Delay1:Truth_value1            5.301e-03  2.563e-03  1.963e+04   2.068 0.038629 *  
Polarity1:Delay1:Truth_value1  9.625e-03  2.563e-03  1.963e+04   3.755 0.000174 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Answer 1

(Just a tentative answer; I will delete it later if appropriate.)

I can't reproduce your problem, because your dataset is too large for the machine I am currently using; when I try to run stepcAIC(model_full, direction="backward") I get:

The cAIC of the initial model can not be calculated.

The message from cAIC(model_full) explains why:

Error: cannot allocate vector of size 2.9 Gb

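This is perhaps not surprising, since the model is moderately large (~20K observations, 28 parameters). (Digging into the code, we can see that it tries to construct a dense identity matrix with dimension equal to the number of observations; in this case n * n * 8 bytes is close to 3 Gb ...)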
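The size in the error message can be checked with a little arithmetic. This is only a sketch: it takes n = 19881 from the "Number of obs" line of the model summary, assumes 8 bytes per double, and assumes the "Gb" in R's message means 1024^3 bytes. The variable names are made up for illustration.

```r
# Back-of-the-envelope memory check for a dense n x n identity matrix.
n_obs <- 19881                       # "Number of obs" from the model summary

bytes_dense <- n_obs^2 * 8           # n * n doubles, 8 bytes each
gib_dense   <- bytes_dense / 1024^3  # convert to binary gigabytes

# For comparison: storing only the n diagonal entries as doubles
# (an assumption about a leaner representation, not what cAIC4 does).
bytes_diag <- n_obs * 8
kib_diag   <- bytes_diag / 1024

cat(sprintf("dense identity:  %.1f Gb\n", gib_dense))
cat(sprintf("diagonal only:   %.1f Kb\n", kib_diag))
```

The first figure comes out at roughly 2.9, which matches the "cannot allocate vector of size 2.9 Gb" message, so the dense identity matrix alone plausibly accounts for the failure.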

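You only really need to compute the cAIC if you want to select a model based on individual-level predictions; if you want to select based on population-level predictions, AIC should be acceptable (and computationally cheaper). The simplest selection procedure is based on p-values (I don't like it, because I don't think modeling decisions should be based on significance tests, but lots of people use it).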

The step() function in lmerTest does backward selection based on p-values:

system.time(ss <- step(model_full,reduce.fixed=FALSE))

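This takes about 4.5 minutes on my old laptop. The (abbreviated) result is that it tested the effect of dropping Truth_value, Polarity:Color, and Polarity:Delay from the random effects, and concluded that it should not drop any of them.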

Backward reduced random-effect table:

                     Eliminated npar  logLik   AIC     LRT Df Pr(>Chisq)    
<none>                            43 -8235.3 16557                          
T_i(1+P*C+D+D:P+T_|S          0   36 -8366.3 16804 261.915  7  < 2.2e-16 ***
P:Ci(1+P*C+D+D:P+T|S          0   36 -8257.1 16586  43.693  7  2.451e-07 ***
P:Di(1+P*C+D+D:P+T|S          0   36 -8245.0 16562  19.507  7   0.006739 ** 
---

From ?step.lmerModLmerTest:

... a column ‘"Eliminated"’ indicating the order in which terms are eliminated from the model with zero (‘0’) indicating that the term is not eliminated from the model.

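In this case, the step() function has tried dropping all of the highest-order terms (the two-way interactions, plus the main effect of Truth_value, which is not involved in any interaction), and found that it does not want to drop any of them. Here the p-value criterion (p<0.05 for all terms) and the AIC criterion (every reduced model has a larger AIC than the original model) agree with each other.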
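The agreement between the two criteria can be checked by hand from the numbers printed in the table. The sketch below copies npar, logLik, and LRT from that output (rounded as printed, so results match only to within rounding) and recomputes AIC as -2*logLik + 2*npar and the p-values from a chi-squared distribution. The data frame and its column names are illustrative, not something step() returns. Each reduced model has 7 fewer parameters because dropping one term from the 7 x 7 random-effects covariance matrix removes one variance and six covariances.

```r
# Values copied from the printed "Backward reduced random-effect table".
tab <- data.frame(
  term   = c("<none>", "Truth_value", "Polarity:Color", "Polarity:Delay"),
  npar   = c(43, 36, 36, 36),
  logLik = c(-8235.3, -8366.3, -8257.1, -8245.0),
  LRT    = c(NA, 261.915, 43.693, 19.507)
)

# AIC criterion: AIC = -2 * logLik + 2 * npar; positive delta favours the full model.
tab$AIC       <- -2 * tab$logLik + 2 * tab$npar
tab$delta_AIC <- tab$AIC - tab$AIC[1]

# p-value criterion: likelihood-ratio statistic against chi-squared,
# with Df = difference in number of parameters from the full model.
tab$Df <- tab$npar[1] - tab$npar
tab$p  <- NA_real_
tab$p[-1] <- pchisq(tab$LRT[-1], df = tab$Df[-1], lower.tail = FALSE)

print(tab)
```

Every reduced model ends up with a positive delta_AIC and a p-value below 0.05, which is the sense in which both criteria say to keep all three terms.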

r

lme4