具有负二项式 glmer 的模型收敛警告
Model convergence warning with negative binomial glmer
我读过几篇关于 glmer 收敛问题的帖子,并且尝试了一些推荐的解决方法(更改优化器、更改模型迭代等),但似乎没有任何方法可以解决我的收敛问题。我想知道是否有人可以帮助我找出我做错了什么?
数据说明:该数据为帝王蝶在一年内不同地点历时数年的统计数据。该数据有超过 6000 个观测值,但观测值和站点的数量因年份而异。数据有几个零(没有观察到君主),数据过于分散。
计数:计算了多少人的响应变量
年份:具有 6 个水平的因子
阶段:具有 7 个级别的因素
站点:随机分类变量。列为随机变量,因为每个站点都是一年内的重复
数据汇总:
X siteID site year date stage
Min. : 977 Min. :2538 Groesbeck Field : 581 2015: 817 7/22/2016: 91 adult :976
1st Qu.:2696 1st Qu.:2541 Rain Garden : 490 2016:1258 7/1/2016 : 77 no_eggs :927
Median :4428 Median :2546 Preschool Butterfly Garden: 435 2017: 937 7/29/2016: 77 no_fifth :927
Mean :4419 Mean :2971 Spring Pond : 427 2018:1710 8/12/2016: 77 no_first :927
3rd Qu.:6144 3rd Qu.:2892 Whitetail field : 406 2019:1116 8/25/2016: 77 no_fourth:927
Max. :7808 Max. :6411 Pollinator Mound : 387 2020: 700 (Other) :6090 no_second:927
(Other) :3812 NA's : 49 no_third :927
count
Min. : 0.0000
1st Qu.: 0.0000
Median : 0.0000
Mean : 0.6422
3rd Qu.: 0.0000
Max. :66.0000
head(monarch_long, n=20)
X siteID site year date stage count
1 1505 2541 Groesbeck Field 2018 8/21/2018 no_eggs 66
2 1748 2541 Groesbeck Field 2019 8/12/2019 no_eggs 53
3 1365 2543 Rain Garden 2017 8/7/2017 no_eggs 51
4 1591 2543 Rain Garden 2018 8/21/2018 no_eggs 49
5 1504 2541 Groesbeck Field 2018 8/15/2018 no_eggs 47
6 1469 2546 Butterfly Garden 2018 8/9/2018 no_eggs 46
7 981 2561 Barred Owl Trail 2015 8/5/2015 no_eggs 45
8 2447 2546 Butterfly Garden 2018 8/24/2018 no_first 44
9 3424 2546 Butterfly Garden 2018 8/31/2018 no_second 40
10 1503 2541 Groesbeck Field 2018 8/7/2018 no_eggs 38
11 3423 2546 Butterfly Garden 2018 8/24/2018 no_second 38
12 1021 2541 Groesbeck Field 2015 8/20/2015 no_eggs 37
13 4400 2546 Butterfly Garden 2018 8/31/2018 no_third 33
14 1265 2538 Wetland Restoration 2016 8/25/2016 no_eggs 32
15 1749 2541 Groesbeck Field 2019 8/23/2019 no_eggs 32
16 1470 2546 Butterfly Garden 2018 8/17/2018 no_eggs 31
17 1471 2546 Butterfly Garden 2018 8/24/2018 no_eggs 30
18 1034 2559 Parking Lot Islands 2015 8/11/2015 no_eggs 28
19 1588 2543 Rain Garden 2018 8/2/2018 no_eggs 27
20 6275 2892 Vernal Pool 2017 9/20/2017 no_fifth 27```
代码:
glmer.nb(count~year+stage+(1|site),
control=glmerControl(optimizer="bobyqa",
optCtrl=list(maxfun=2e5)),
data=monarch_long)
Warning message: Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.0123119 (tol = 0.002, component 1)
如果我需要提供更多详细信息,请告诉我。任何建议将不胜感激!
tl;dr 我觉得你的身材真的很好。对于负二项式 GLMM,我现在开始推荐 glmmTMB
而不是 lme4::glmer.nb
。在任何情况下,它都适用于使用完全不同的模型 implementation/algorithm 检查模型参数并确保答案相同,这是解决收敛警告的黄金标准 ...
library(lme4)
system.time(m1 <- glmer.nb(count~year+stage+(1|site),
control=glmerControl(optimizer="bobyqa",
optCtrl=list(maxfun=2e5)),
data=monarch_long))
改装 glmmTMB
(使用 family="nbinom2"
)
library(glmmTMB)
system.time(m2 <- glmmTMB(count~year+stage+(1|site),
family="nbinom2",
data=monarch_long))
## if you have multiple cores you can speed things up further ...
system.time(m3 <- glmmTMB(count~year+stage+(1|site),
family="nbinom2",
control=glmmTMBControl(parallel=5),
data=monarch_long)) ## 10 seconds
比较对数似然(与 logLik()
)表明 glmmTMB
实际上稍微好一点(0.2 对数似然单位),但无论如何,系数估计值非常相似(图形比较仅固定效应,但方差估计也相差 <1%):
library(ggplot2)
library(dotwhisker)
library(broom.mixed)
dwplot(list(glmer.nb=m1,glmmTMB=m2),effect="fixed") + theme_bw() +
geom_vline(xintercept=0,lty=2)
数据设置
monarch_long <- read.csv("MLMP_2020\ long\ data\ -\ Sheet1.csv")
monarch_long$year <- factor(monarch_long$year)
monarch_long$stage <- factor(monarch_long$stage,
levels=c(paste0("no_",
c("eggs","first","second","third","fourth","fifth")),"adult"))
我读过几篇关于 glmer 收敛问题的帖子,并且尝试了一些推荐的解决方法(更改优化器、更改模型迭代等),但似乎没有任何方法可以解决我的收敛问题。我想知道是否有人可以帮助我找出我做错了什么?
数据说明:该数据为帝王蝶在一年内不同地点历时数年的统计数据。该数据有超过 6000 个观测值,但观测值和站点的数量因年份而异。数据有几个零(没有观察到君主),数据过于分散。
计数:计算了多少人的响应变量
年份:具有 6 个水平的因子
阶段:具有 7 个级别的因素
站点:随机分类变量。列为随机变量,因为每个站点都是一年内的重复
数据汇总:
X siteID site year date stage
Min. : 977 Min. :2538 Groesbeck Field : 581 2015: 817 7/22/2016: 91 adult :976
1st Qu.:2696 1st Qu.:2541 Rain Garden : 490 2016:1258 7/1/2016 : 77 no_eggs :927
Median :4428 Median :2546 Preschool Butterfly Garden: 435 2017: 937 7/29/2016: 77 no_fifth :927
Mean :4419 Mean :2971 Spring Pond : 427 2018:1710 8/12/2016: 77 no_first :927
3rd Qu.:6144 3rd Qu.:2892 Whitetail field : 406 2019:1116 8/25/2016: 77 no_fourth:927
Max. :7808 Max. :6411 Pollinator Mound : 387 2020: 700 (Other) :6090 no_second:927
(Other) :3812 NA's : 49 no_third :927
count
Min. : 0.0000
1st Qu.: 0.0000
Median : 0.0000
Mean : 0.6422
3rd Qu.: 0.0000
Max. :66.0000
head(monarch_long, n=20)
X siteID site year date stage count
1 1505 2541 Groesbeck Field 2018 8/21/2018 no_eggs 66
2 1748 2541 Groesbeck Field 2019 8/12/2019 no_eggs 53
3 1365 2543 Rain Garden 2017 8/7/2017 no_eggs 51
4 1591 2543 Rain Garden 2018 8/21/2018 no_eggs 49
5 1504 2541 Groesbeck Field 2018 8/15/2018 no_eggs 47
6 1469 2546 Butterfly Garden 2018 8/9/2018 no_eggs 46
7 981 2561 Barred Owl Trail 2015 8/5/2015 no_eggs 45
8 2447 2546 Butterfly Garden 2018 8/24/2018 no_first 44
9 3424 2546 Butterfly Garden 2018 8/31/2018 no_second 40
10 1503 2541 Groesbeck Field 2018 8/7/2018 no_eggs 38
11 3423 2546 Butterfly Garden 2018 8/24/2018 no_second 38
12 1021 2541 Groesbeck Field 2015 8/20/2015 no_eggs 37
13 4400 2546 Butterfly Garden 2018 8/31/2018 no_third 33
14 1265 2538 Wetland Restoration 2016 8/25/2016 no_eggs 32
15 1749 2541 Groesbeck Field 2019 8/23/2019 no_eggs 32
16 1470 2546 Butterfly Garden 2018 8/17/2018 no_eggs 31
17 1471 2546 Butterfly Garden 2018 8/24/2018 no_eggs 30
18 1034 2559 Parking Lot Islands 2015 8/11/2015 no_eggs 28
19 1588 2543 Rain Garden 2018 8/2/2018 no_eggs 27
20 6275 2892 Vernal Pool 2017 9/20/2017 no_fifth 27```
代码:
glmer.nb(count~year+stage+(1|site),
control=glmerControl(optimizer="bobyqa",
optCtrl=list(maxfun=2e5)),
data=monarch_long)
Warning message: Warning message: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.0123119 (tol = 0.002, component 1)
如果我需要提供更多详细信息,请告诉我。任何建议将不胜感激!
tl;dr 我觉得你的身材真的很好。对于负二项式 GLMM,我现在开始推荐 glmmTMB
而不是 lme4::glmer.nb
。在任何情况下,它都适用于使用完全不同的模型 implementation/algorithm 检查模型参数并确保答案相同,这是解决收敛警告的黄金标准 ...
library(lme4)
system.time(m1 <- glmer.nb(count~year+stage+(1|site),
control=glmerControl(optimizer="bobyqa",
optCtrl=list(maxfun=2e5)),
data=monarch_long))
改装 glmmTMB
(使用 family="nbinom2"
)
library(glmmTMB)
system.time(m2 <- glmmTMB(count~year+stage+(1|site),
family="nbinom2",
data=monarch_long))
## if you have multiple cores you can speed things up further ...
system.time(m3 <- glmmTMB(count~year+stage+(1|site),
family="nbinom2",
control=glmmTMBControl(parallel=5),
data=monarch_long)) ## 10 seconds
比较对数似然(与 logLik()
)表明 glmmTMB
实际上稍微好一点(0.2 对数似然单位),但无论如何,系数估计值非常相似(图形比较仅固定效应,但方差估计也相差 <1%):
library(ggplot2)
library(dotwhisker)
library(broom.mixed)
dwplot(list(glmer.nb=m1,glmmTMB=m2),effect="fixed") + theme_bw() +
geom_vline(xintercept=0,lty=2)
数据设置
monarch_long <- read.csv("MLMP_2020\ long\ data\ -\ Sheet1.csv")
monarch_long$year <- factor(monarch_long$year)
monarch_long$stage <- factor(monarch_long$stage,
levels=c(paste0("no_",
c("eggs","first","second","third","fourth","fifth")),"adult"))