Why is flexsurvreg failing in this painfully simple case?
Check out this painfully simple case and the error. Comments are inline.
library(flexsurv)
#> Loading required package: survival
library(tidyverse)
library(magrittr)
#>
#> Attaching package: 'magrittr'
#> The following object is masked from 'package:purrr':
#>
#> set_names
#> The following object is masked from 'package:tidyr':
#>
#> extract
set.seed(2019)
train_data <- tribble(
~wait_time, ~called_yet, ~time_queued,
131.282999992371, 0, 1570733365.28,
358.296000003815, 1, 1570733421.187,
1352.13999986649, 1, 1570733540.923,
1761.61400008202, 0, 1570733941.343,
1208.25300002098, 0, 1570734327.11,
522.296999931335, 1, 1570734376.953,
241.75, 0, 1570734659.44,
143.156999826431, 0, 1570734809.673,
1202.79999995232, 1, 1570734942.907,
614.640000104904, 1, 1570735526.567
)
# Base survival works fine!
survival_model <- survreg(Surv(wait_time, called_yet) ~ time_queued,
data = train_data,
dist = "weibull")
survival_model
#> Call:
#> survreg(formula = Surv(wait_time, called_yet) ~ time_queued,
#> data = train_data, dist = "weibull")
#>
#> Coefficients:
#> (Intercept) time_queued
#> 4.533765e+05 -2.886352e-04
#>
#> Scale= 0.518221
#>
#> Loglik(model)= -40.2 Loglik(intercept only)= -40.5
#> Chisq= 0.5 on 1 degrees of freedom, p= 0.48
#> n= 10
# flexsurvreg can't even get a valid initializer for time_queued, even though
# the doc says it takes the mean of the data
flexsurv_model <- flexsurvreg(Surv(wait_time, called_yet) ~ time_queued,
data = train_data,
dist = "weibull")
#> Error in flexsurvreg(Surv(wait_time, called_yet) ~ time_queued, data = train_data, : Initial value for parameter 2 out of range
# Maybe the low variance of the predictor here is the problem? So let's up the
# variance just to see
train_data %<>% mutate_at("time_queued", subtract, 1.57073e9)
train_data
#> # A tibble: 10 x 3
#> wait_time called_yet time_queued
#> <dbl> <dbl> <dbl>
#> 1 131. 0 3365.
#> 2 358. 1 3421.
#> 3 1352. 1 3541.
#> 4 1762. 0 3941.
#> 5 1208. 0 4327.
#> 6 522. 1 4377.
#> 7 242. 0 4659.
#> 8 143. 0 4810.
#> 9 1203. 1 4943.
#> 10 615. 1 5527.
# Now it initializes, so that's different... but now it won't converge!
flexsurv_model <- flexsurvreg(Surv(wait_time, called_yet) ~ time_queued,
data = train_data,
dist = "weibull")
#> Warning in flexsurvreg(Surv(wait_time, called_yet) ~ time_queued, data
#> = train_data, : Optimisation has probably not converged to the maximum
#> likelihood - Hessian is not positive definite.
Created on 2019-10-19 by the reprex package (v0.3.0)
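I mostly want to use flexsurv for its better plotting options and its more standard shape and scale definitions (the ancillary parameters are very attractive, too). Right now, though, I mainly want to know whether I am doing something genuinely wrong and flexsurv is trying to tell me not to trust my base survival model.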
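Edit: Marco Sandri pointed out that recentering fixes the problem. However, recentering without rescaling only guarantees that initialization succeeds; if the variance is very large, the fit still does not converge. I consider this a bug, since survival has no trouble with exactly the same model on exactly the same values. I created an issue here.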
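For reference, here is a minimal sketch of recentering and rescaling together, which the recentering-only attempt in the reprex does not do. The column name `time_queued_std` is made up for illustration, and the `wait_time` values are rounded to three decimals. I have not included output; the assumption is that a predictor with mean 0 and standard deviation 1 gives the optimizer a better-conditioned problem, not that it is guaranteed to converge.

```r
library(survival)
library(flexsurv)

# Same data as the reprex, rebuilt as a plain data frame
# (wait_time rounded to three decimals for brevity).
train_data <- data.frame(
  wait_time = c(131.283, 358.296, 1352.140, 1761.614, 1208.253,
                522.297, 241.750, 143.157, 1202.800, 614.640),
  called_yet = c(0, 1, 1, 0, 0, 1, 0, 0, 1, 1),
  time_queued = c(1570733365.28, 1570733421.187, 1570733540.923,
                  1570733941.343, 1570734327.11, 1570734376.953,
                  1570734659.44, 1570734809.673, 1570734942.907,
                  1570735526.567)
)

# Center AND rescale: subtract the mean, then divide by the standard deviation.
# scale() returns a one-column matrix, so convert it back to a numeric vector.
tq_mean <- mean(train_data$time_queued)
tq_sd <- sd(train_data$time_queued)
train_data$time_queued_std <- as.numeric(scale(train_data$time_queued))

# Fit the same Weibull model on the standardized predictor.
flexsurv_model_std <- flexsurvreg(
  Surv(wait_time, called_yet) ~ time_queued_std,
  data = train_data,
  dist = "weibull"
)
flexsurv_model_std
```

Because the covariate enters through a linear predictor, the coefficient on `time_queued_std` should equal the original-scale coefficient multiplied by `tq_sd`, so dividing it by `tq_sd` recovers the per-second effect. Standardizing does not change what the model can express.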