(R lme4) eval(substitute(expr), envir, enclos) 错误:(maxstephalfit) PIRLS step-halvings 未能减少 pwrssUpdate 中的偏差

(R lme4) Error in eval(substitute(expr), envir, enclos) : (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate

我正在 运行 使用包 lme4.

中的 glmer() 函数建立具有随机效应的广义线性模型

模型代码如下所示:

mod6 <- glmer((Ndifference+74337) ~ netm1011 + d1011 + 
           b0001 + (1|region), Family = Gamma(link = "identity"))

Ndifference是美国50个州(和DC)200年到2010年的人口差异统计数据。有一个负值(密歇根州为 -74336),所以我添加了一个常数以确保我的回复都是正面的。

所有预测变量(区域随机效应除外)都是比例或百分比。 Netm1011(2010 年移民到各州的比率)和 d1011(每 1000 人的死亡率)都有几个负值。 B0001 包含所有正比例(出生 rate/1000 人)。

当我 运行 模型时,我不断收到此错误:

Error in eval(substitute(expr), envir, enclos) : 
(maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate

在这个错误之前有大量的输出,看起来像这样::

我也尝试过不同系列的发行版 (Gammainverse.gaussian)。这个错误代码到底是什么意思?

下面是我一直在使用的数据(只是涉及的变量)。如有任何帮助,我们将不胜感激!

structure(list(Ndifference = c(333269L, 86445L, 1245536L, 244720L, 
3333964L, 725062L, 166537L, 113590L, 33923L, 2791639L, 1484925L, 
151993L, 271526L, 404489L, 399906L, 122812L, 167087L, 299231L, 
74155L, 50624L, 477090L, 206187L, -74336L, 379644L, 120037L, 
392539L, 87578L, 117119L, 685035L, 76962L, 374381L, 243485L, 
409910L, 1481681L, 33444L, 178335L, 306232L, 408919L, 428717L, 
2533L, 612397L, 60714L, 654646L, 4296180L, 533538L, 16207L, 921089L, 
835264L, 46920L, 317348L, 70366L), d1011 = c(0.01009935290407, 
0.00482181820219, 0.00740624039708, 0.00988384227183, 0.00640323497813, 
0.00628209882119, 0.00812947210436, 0.00872837354861, 0.00764311417232, 
0.00915166624244, 0.00737517844004, 0.00718037578961, 0.00746442540795, 
0.00794410854005, 0.00889218497298, 0.00923607712106, 0.00850800517833,   
0.00983998039872, 0.00904860671746, 0.00978543746728, 0.00752488166029, 
0.00814412474047, 0.008998680863, 0.0074466124005, 0.00971662809766, 
0.00917030625948, 0.00880861178822, 0.00819753663997, 0.00718370505053, 
0.00796602176569, 0.00789025770533, 0.00777472712417, 0.00769648628539, 
0.00831202019281, 0.00850432185633, 0.00953304172455, 0.00962020831593, 
0.0084093843696, 0.00992588646267, 0.00893168396866, 0.00908595754594, 
0.00854178331167, 0.00947807131183, 0.00662702930588, 0.0053663066427, 
0.00848516414343, 0.00741560390799, 0.00724357008593, 0.01174960990152, 
0.00835051236548, 0.00772546941972), netm1011 = c(0.00109618827436, 
0.00189063449694, 0.00284535027555, 0.00218869215636, 0.00200262974559, 
0.0065388101588, 0.00074903204593, 0.00531214993154, 0.01546474398708, 
0.01046605886554, 0.00346226170766, 0.0039720759906, 0.00110199747387, 
-0.00340610876916, -4.63800737643485e-05, 0.00143230827182, -0.0018378102704, 
0.00157293366968, 0.00169295518939, 0.00086246831653, 0.00396682929054, 
0.00395032406919, -0.00265224491201, 0.00162162050201, -0.0011606066005, 
-0.00128783881235, 0.00364476878277, 0.00043559148624, -0.00024040613102, 
-0.00066598675772, -8.70119549428016e-05, 0.00073131738351, 0.0004310477698, 
0.00519235806746, 0.00995606223948, -0.00192862200551, 0.00257535479622, 
0.00452502363079, 0.00132008444764, -0.0033720597776, 0.00464986350895, 
0.00318540398886, 0.0036471909126, 0.00699275905022, 0.00104501002309, 
7.98829235871906e-05, 0.00428168852619, 0.00637386122264, 0.00108682812851, 
-0.00029879124879, -6.91039695800782e-05), b0001 = c(0.01800092688004, 
0.02011469070317, 0.02028566151573, 0.0179124206973, 0.01941521590629, 
0.01846852368848, 0.01564274610647, 0.01763088342656, 0.01782806190528, 
0.01595252071591, 0.02045645453128, 0.01934534979926, 0.01892079941012, 
0.01859074925398, 0.01759062061265, 0.01593436604294, 0.01809677718956, 
0.01698907749719, 0.01956008653302, 0.01292521854622, 0.01781296155008, 
0.01589757045382, 0.01700943508274, 0.01673888527351, 0.01999578814362, 
0.01689588730579, 0.01474826635901, 0.01762957227617, 0.01844433337313, 
0.01426185254875, 0.01647358935637, 0.01852101980912, 0.01705020482026, 
0.01867477359887, 0.01474757340631, 0.01722894148788, 0.01746963005864, 
0.01632960496522, 0.01466473971168, 0.01463956672595, 0.01772861915606, 
0.01702957873434, 0.01740538934663, 0.02136003322368, 0.02565897334663, 
0.01291107725161, 0.01753092898439, 0.01687893043972, 0.01409828681218, 
0.01588293753652, 0.01540482711573), region = structure(c(3L, 
4L, 4L, 3L, 4L, 4L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 1L, 1L, 1L, 1L, 
3L, 3L, 2L, 3L, 2L, 1L, 1L, 3L, 1L, 4L, 1L, 4L, 2L, 2L, 4L, 2L, 
3L, 1L, 1L, 3L, 4L, 2L, 2L, 3L, 1L, 3L, 3L, 4L, 2L, 3L, 4L, 3L, 
1L, 4L), .Label = c("MW", "NE", "SE", "W"), class = "factor")), .Names =              c("Ndifference", 
"d1011", "netm1011", "b0001", "region"), class = "data.frame", row.names =    c(NA, 
-51L))

您是否有特殊原因使用身份为 link (family=Gamma(link="identity")) 的 Gamma?这可能是您问题的近端原因。使用带有日志 link (family(Gamma(link="log"))) 的 Gamma 似乎工作正常。 (通常情况下,拟合对数正态模型,即对响应变量进行对数变换并使用 lmer 而不是 glmer,速度更快且更稳健。)

一般来说,使用 link 不将响应限制在所选族域内的函数在数值上是不稳定的,因为模型很容易进入不可行的区域。将其从统计术语中翻译出来,如果您要假设数据是 Gamma 分布的,那么如果在获得最佳解决方案的过程中,程序会尝试导致负均值估计的值,那么您可能会遇到麻烦。与恒等式 link(或 Gamma 族的规范逆 link)相反,对数 link 强制所有预测为正。

诚然,

lme4 比它可能更脆弱,但我看不出有什么特别的理由可以假设您的回答是 Gamma 分布的,身份为 link ...

根据我从模型预测和诊断中看到的情况,可能值得从您的数据集中删除负值 - 强制它(勉强)为正值会使它成为对数尺度 ...

library(lme4)
m1 <- lmer(log(Ndifference+74337) ~ netm1011 + d1011 + 
           b0001 + (1|region), data=dd)

m2 <- glmer(Ndifference+74337 ~ netm1011 + d1011 + 
           b0001 + (1|region), data=dd,
     family=Gamma(link="log"))

比较 pred 和 obs(忽略异常值):

pairs(cbind(obs=log(dd$Ndifference+74337),
            lognorm=predict(m1),gamma=predict(m2)),gap=0,
      xlim=c(10,15),ylim=c(10,15))