Model convergence warning with negative binomial glmer

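I have read several posts about glmer convergence problems and tried some of the recommended fixes (changing the optimizer, increasing the number of iterations, etc.), but nothing seems to resolve my convergence issue. Could someone help me figure out what I am doing wrong?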

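Data description: the data are counts of monarch butterflies at different sites, surveyed repeatedly within each year over several years. There are over 6000 observations, but the number of observations and sites varies by year. The data contain many zeros (no monarchs observed) and are overdispersed.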

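count: response variable, the number of individuals counted

year: factor with 6 levels

stage: factor with 7 levels

site: categorical variable, treated as a random effect because each site is surveyed repeatedly within a year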

Data summary:

       X            siteID                             site        year             date            stage    
 Min.   : 977   Min.   :2538   Groesbeck Field           : 581   2015: 817   7/22/2016:  91   adult    :976  
 1st Qu.:2696   1st Qu.:2541   Rain Garden               : 490   2016:1258   7/1/2016 :  77   no_eggs  :927  
 Median :4428   Median :2546   Preschool Butterfly Garden: 435   2017: 937   7/29/2016:  77   no_fifth :927  
 Mean   :4419   Mean   :2971   Spring Pond               : 427   2018:1710   8/12/2016:  77   no_first :927  
 3rd Qu.:6144   3rd Qu.:2892   Whitetail field           : 406   2019:1116   8/25/2016:  77   no_fourth:927  
 Max.   :7808   Max.   :6411   Pollinator Mound          : 387   2020: 700   (Other)  :6090   no_second:927  
                               (Other)                   :3812               NA's     :  49   no_third :927  
     count        
 Min.   : 0.0000  
 1st Qu.: 0.0000  
 Median : 0.0000  
 Mean   : 0.6422  
 3rd Qu.: 0.0000  
 Max.   :66.0000



head(monarch_long, n=20)
      X siteID                site year      date     stage count
1  1505   2541     Groesbeck Field 2018 8/21/2018   no_eggs    66
2  1748   2541     Groesbeck Field 2019 8/12/2019   no_eggs    53
3  1365   2543         Rain Garden 2017  8/7/2017   no_eggs    51
4  1591   2543         Rain Garden 2018 8/21/2018   no_eggs    49
5  1504   2541     Groesbeck Field 2018 8/15/2018   no_eggs    47
6  1469   2546    Butterfly Garden 2018  8/9/2018   no_eggs    46
7   981   2561    Barred Owl Trail 2015  8/5/2015   no_eggs    45
8  2447   2546    Butterfly Garden 2018 8/24/2018  no_first    44
9  3424   2546    Butterfly Garden 2018 8/31/2018 no_second    40
10 1503   2541     Groesbeck Field 2018  8/7/2018   no_eggs    38
11 3423   2546    Butterfly Garden 2018 8/24/2018 no_second    38
12 1021   2541     Groesbeck Field 2015 8/20/2015   no_eggs    37
13 4400   2546    Butterfly Garden 2018 8/31/2018  no_third    33
14 1265   2538 Wetland Restoration 2016 8/25/2016   no_eggs    32
15 1749   2541     Groesbeck Field 2019 8/23/2019   no_eggs    32
16 1470   2546    Butterfly Garden 2018 8/17/2018   no_eggs    31
17 1471   2546    Butterfly Garden 2018 8/24/2018   no_eggs    30
18 1034   2559 Parking Lot Islands 2015 8/11/2015   no_eggs    28
19 1588   2543         Rain Garden 2018  8/2/2018   no_eggs    27
20 6275   2892         Vernal Pool 2017 9/20/2017  no_fifth    27

Code:

glmer.nb(count~year+stage+(1|site),  
         control=glmerControl(optimizer="bobyqa",
         optCtrl=list(maxfun=2e5)),
                   data=monarch_long)

Warning message: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.0123119 (tol = 0.002, component 1)

Please let me know if I need to provide more details. Any advice would be greatly appreciated!

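tl;dr I think your fit is actually fine. For negative binomial GLMMs I have come to recommend glmmTMB over lme4::glmer.nb. In any case, it helps to check the parameter estimates against a completely different model implementation/algorithm and make sure the answers agree; that is the gold standard for resolving convergence warnings ...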

library(lme4)
system.time(m1 <- glmer.nb(count~year+stage+(1|site),  
               control=glmerControl(optimizer="bobyqa",
                                    optCtrl=list(maxfun=2e5)),
               data=monarch_long))

Refit with glmmTMB (using family="nbinom2"):

library(glmmTMB)
system.time(m2 <- glmmTMB(count~year+stage+(1|site),  
              family="nbinom2",
              data=monarch_long))
## if you have multiple cores you can speed things up further ...
system.time(m3 <- glmmTMB(count~year+stage+(1|site),  
                          family="nbinom2",
                          control=glmmTMBControl(parallel=5),
                          data=monarch_long)) ## 10 seconds

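Comparing the log-likelihoods (with logLik()) shows that glmmTMB actually gets a slightly better fit (by 0.2 log-likelihood units), but in any case the coefficient estimates are very similar (the graphical comparison shows the fixed effects only, but the variance estimates also differ by <1%):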

library(ggplot2)
library(dotwhisker)
library(broom.mixed) 
dwplot(list(glmer.nb=m1,glmmTMB=m2),effect="fixed") + theme_bw() +
    geom_vline(xintercept=0,lty=2)
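
The same comparison can also be done numerically. Below is a minimal sketch that runs on a small simulated data set, since the original data are not included here: the data frame dd, its simulated effects, and the fit names f1 and f2 are all made up for illustration. With the real data, use m1 and m2 from above in place of f1 and f2.

## Simulated stand-in for monarch_long (hypothetical data, not the real counts)
library(lme4)
library(glmmTMB)

set.seed(101)
dd <- expand.grid(site = factor(1:20), year = factor(2015:2018), rep = 1:5)
b_site <- rnorm(20, sd = 0.5)   # assumed among-site variation (log scale)
mu <- exp(0.5 + 0.2 * (as.integer(dd$year) - 1) + b_site[as.integer(dd$site)])
dd$count <- rnbinom(nrow(dd), mu = mu, size = 1)

## Fit the same model with both implementations
f1 <- glmer.nb(count ~ year + (1 | site), data = dd)
f2 <- glmmTMB(count ~ year + (1 | site), family = nbinom2, data = dd)

## Log-likelihoods side by side
print(c(glmer.nb = as.numeric(logLik(f1)), glmmTMB = as.numeric(logLik(f2))))

## Fixed-effect estimates side by side (glmmTMB keeps them in the $cond component)
print(cbind(glmer.nb = fixef(f1), glmmTMB = fixef(f2)$cond))

## Among-site standard deviation from each fit
print(c(glmer.nb = unname(attr(VarCorr(f1)$site, "stddev")),
        glmmTMB  = unname(attr(VarCorr(f2)$cond$site, "stddev"))))

If the two sets of numbers agree to within a small fraction of their standard errors, the convergence warning from glmer.nb is most likely a false positive.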

Data setup:

monarch_long <- read.csv("MLMP_2020 long data - Sheet1.csv")
monarch_long$year <- factor(monarch_long$year)
monarch_long$stage <- factor(monarch_long$stage,
                             levels=c(paste0("no_",
              c("eggs","first","second","third","fourth","fifth")),"adult"))