coxph ran out of iterations - won't converge, categorical treatment, continuous covariates
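I am trying to run a Cox proportional hazards model to determine the effect of treatment and covariates on the survival of a single plant species. Previously, when I ran coxph with only Treatment (categorical/factor)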
simacox <- coxph(Surv(Time, Event, type = c('right')) ~ Treatment, data = rsima)
it ran fine, but when I add the (continuous) covariates I keep getting a warning message:
simacox <- coxph(Surv(Time, Event, type = c('right')) ~
Treatment+SLA+VLA+Thickness+Growth_Rate, data = rsima)
Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights, :
Ran out of iterations and did not converge
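I'm not sure whether this is caused by the NA values or by something else. I have looked at similar questions, but they usually arise because Treatment is continuous, which seems to be a different problem.
Here is the dataset: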
Plot ID Subplot Treatment Column Row Species Time Event Growth_Rate Area SLA VLA Thickness
PC1 1 control A 7 SIMA 535 1 0.0132 NA NA NA NA
PC1 2 control C 2 SIMA 829 0 0.0532 6 123.5312982 1.307927088 0.1005
PC1 3 control D 2 SIMA 535 1 0.0329 NA NA NA NA
PC2 1 control A 7 SIMA 829 0 0.0236 0.75 192.6132404 1.49602026 0.135
PC2 2 control C 2 SIMA 829 1 0.0037 NA NA NA NA
PC2 3 control D 2 SIMA 535 1 0.0099 NA NA NA NA
PC3 1 control A 7 SIMA 152 1 0.0163 NA NA NA NA
PC3 2 control C 2 SIMA 829 0 0.058 1 185.3606789 1.311713087 0.135
PC3 3 control D 2 SIMA 829 0 0.0097 0.75 96.12967467 1.392643765 0.1735
PC4 1 control A 7 SIMA 152 1 0.0109 NA NA NA NA
PC4 2 control C 2 SIMA 120 1 0.0109 NA NA NA NA
PC4 3 control D 2 SIMA 120 1 0.0217 NA NA NA NA
PC5 1 control A 7 SIMA 92 1 0 NA NA NA NA
PC5 2 control C 2 SIMA 152 1 0.0109 NA NA NA NA
PC5 3 control D 2 SIMA 829 1 0.0009 NA NA NA NA
PS1 1 shelter A 7 SIMA 829 0 0.0121 3.25 96.12967467 1.392643765 0.1735
PS1 2 shelter C 2 SIMA 829 1 0.0009 NA NA NA NA
PS1 3 shelter D 2 SIMA 829 0 0.0435 11.75 119.0672131 1.26393576 0.2495
PS2 1 shelter A 7 SIMA 829 0 0.0508 6 128.8442116 1.744927272 0.1417
PS2 2 shelter C 2 SIMA 829 0 0.0193 1 163.722709 1.987793669 0.1045
PS2 3 shelter D 2 SIMA 829 0 0.0484 6.5 134.4099228 1.589451631 0.18
PS3 1 shelter A 7 SIMA 829 0 0.0363 9.5 184.2795579 1.450538059 0.1035
PS3 2 shelter C 2 SIMA 829 0 0.058 11 96.76593176 1.501929992 0.08
PS3 3 shelter D 2 SIMA 829 0 0.0193 2.25 124.317571 3.516426012 0.1295
PS4 1 shelter A 7 SIMA 829 0 0.0411 4.5 113.088867 2.203327018 0.149
PS4 2 shelter C 2 SIMA 535 1 0.0263 NA NA NA NA
PS4 3 shelter D 2 SIMA 829 0 0.058 11 31.44098888 1.714225616 0.1595
PS5 1 shelter A 7 SIMA 829 0 0.0363 11.5 155.3209302 1.308096836 0.23875
PS5 2 shelter C 2 SIMA 829 0 0.0048 0.25 171.0465116 2.135961931 0.104
PS5 3 shelter D 2 SIMA 829 0 0.0266 5 178.9407945 1.599492384 0.0975
PW1 1 watered A 7 SIMA 829 1 0.0056 NA NA NA NA
PW1 2 watered C 2 SIMA 829 0 0.0484 6.5 150.7782165 1.956811087 0.159
PW1 3 watered D 2 SIMA 829 0 0.0181 3 158.1184404 1.94474398 0.1935
PW2 1 watered A 7 SIMA 829 0 0.0351 8.5 148.9020752 1.482003075 0.2405
PW2 2 watered C 2 SIMA 829 0 0.0508 1.5 170.3944295 1.653449107 0.127
PW2 3 watered D 2 SIMA 829 1 0.0009 NA NA NA NA
PW3 1 watered A 7 SIMA 829 0 0.0073 1 159.8682043 1.594187964 0.224
PW3 2 watered C 2 SIMA 120 1 0.0217 NA NA NA NA
PW3 3 watered D 2 SIMA 829 0 0.0919 25 146.6362786 1.694286556 0.1325
PW4 1 watered A 7 SIMA 120 1 0.0109 NA NA NA NA
PW4 2 watered C 2 SIMA 829 1 0.0009 NA NA NA NA
PW4 3 watered D 2 SIMA 152 1 0.0163 NA NA NA NA
PW5 1 watered A 7 SIMA 829 1 0.0009 NA NA NA NA
PW5 2 watered C 2 SIMA 535 1 0.0266 1.5 162.8057554 2.065105317 0.94
PW5 3 watered D 2 SIMA 829 0 0.058 4 80.37696758 1.831219479 0.1195
Problem
The problem actually lies with Thickness; this is easy to verify:
fit <- coxph(Surv(Time, Event) ~ Thickness, data = rsima)
which produces the warning
Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights, :
Ran out of iterations and did not converge
We can get some insight into the convergence problem from ?coxph:
In certain data cases the actual MLE estimate of a coefficient is infinity, e.g., a dichotomous variable where one of the groups has no events. When this happens the associated coefficient grows at a steady pace and a race condition will exist in the fitting routine: either the log likelihood converges, the information matrix becomes effectively singular, an argument to exp becomes too large for the computer hardware, or the maximum number of interactions is exceeded. (Nearly always the first occurs.) The routine attempts to detect when this has happened, not always successfully. The primary consequence for he user is that the Wald statistic = coefficient/se(coefficient) is not valid in this case and should be ignored; the likelihood ratio and score tests remain valid however.
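Explanation
If we look at rsima$Thickness, we notice that most values are small (in the range 0.08 <= Thickness <= 0.2495), with a single value of Thickness = 0.94. This is very similar to the situation described in the documentation: Thickness behaves essentially like a discrete variable (with levels "low" and "high"), and the groups have almost no events. Among the rows with a non-missing Thickness, the "low" group has no events and the "high" group has only one, namely the observation with Thickness = 0.94.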
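To see why the estimate runs off to infinity, the following sketch evaluates the Cox partial log-likelihood by hand for the rows with a non-missing Thickness. It relies on what the table in the question shows: among those 25 rows there is a single event (Thickness = 0.94, Time = 535) and every row is still at risk at that time, so the partial likelihood reduces to one term. The names thickness, event_idx and log_pl are made up for this illustration and are not part of the survival package.

```r
# Thickness for the 25 rows of rsima with a non-missing value,
# copied from the table in the question (same row order)
thickness <- c(0.1005, 0.135, 0.135, 0.1735, 0.1735, 0.2495, 0.1417, 0.1045,
               0.18, 0.1035, 0.08, 0.1295, 0.149, 0.1595, 0.23875, 0.104,
               0.0975, 0.159, 0.1935, 0.2405, 0.127, 0.224, 0.1325, 0.94,
               0.1195)

# Position of the only row with Event = 1 (plot PW5, subplot 2)
event_idx <- which(thickness == 0.94)

# Partial log-likelihood for one event with all rows in the risk set:
#   beta * x_event - log(sum_i exp(beta * x_i))
log_pl <- function(beta) {
  beta * thickness[event_idx] - log(sum(exp(beta * thickness)))
}

# Evaluate on a grid of coefficient values
beta_grid <- seq(0, 30, by = 1)
ll <- sapply(beta_grid, log_pl)

# Log-likelihood at beta = 0, 10, 20, 30
print(round(ll[c(1, 11, 21, 31)], 4))

# TRUE if the log-likelihood increases at every step of the grid
print(all(diff(ll) > 0))
```

Because the row with the event also has the largest Thickness, the log-likelihood keeps increasing towards 0 as beta grows and has no finite maximum. That is the situation in which the fitting routine can run out of iterations.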
Building on this post on Cross Validated, it is useful to visualise the effect by plotting
library(survival)
library(survminer)
ggsurvplot(survfit(Surv(Time, Event) ~ (Thickness > median(Thickness, na.rm = TRUE)), data = rsima), data = rsima)
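What we do here is plot the survival probability as a function of the dichotomised Thickness, where Thickness is either at or below the median (red curve) or above it (blue curve).
You can see the effect of Thickness on the survival probability, or rather the lack of any effect. For example, note that there are no Event = 1 cases for small Thickness values, and only one Event = 1 case for large Thickness values.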
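The same pattern can be checked numerically without a plot. This is a small sketch that rebuilds only the Thickness and Event columns for the 25 rows with a measured Thickness (values copied from the table in the question) and cross-tabulates events against the median split. The data frame name d and the column high are made up for the example.

```r
# Rows of rsima with a non-missing Thickness, in table order.
# Only the 24th of these rows (PW5, subplot 2) has Event = 1.
d <- data.frame(
  Thickness = c(0.1005, 0.135, 0.135, 0.1735, 0.1735, 0.2495, 0.1417,
                0.1045, 0.18, 0.1035, 0.08, 0.1295, 0.149, 0.1595,
                0.23875, 0.104, 0.0975, 0.159, 0.1935, 0.2405, 0.127,
                0.224, 0.1325, 0.94, 0.1195),
  Event = c(rep(0, 23), 1, 0)
)

# Same split as in the plot: TRUE means Thickness above the median
d$high <- d$Thickness > median(d$Thickness)

# Rows = Thickness group, columns = Event
tab <- table(high = d$high, Event = d$Event)
print(tab)

# Thickness of the row(s) that had an event
print(d$Thickness[d$Event == 1])
```

Only one of the 25 rows has Event = 1, and it is the row with the largest Thickness, which matches the case described in ?coxph.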
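In terms of model fitting, it is not possible to obtain a robust estimate of the effect of Thickness on the survival probability, and Thickness should be removed from the model before exploring the other covariates.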