Why is flexsurvreg failing in this painfully simple case?

Take a look at this painfully simple case and the error it produces. Comments are inline.

library(flexsurv)
#> Loading required package: survival
library(tidyverse)
library(magrittr)
#> 
#> Attaching package: 'magrittr'
#> The following object is masked from 'package:purrr':
#> 
#>     set_names
#> The following object is masked from 'package:tidyr':
#> 
#>     extract

set.seed(2019)

train_data <- tribble(
  ~wait_time,        ~called_yet, ~time_queued,
   131.282999992371, 0,           1570733365.28,
   358.296000003815, 1,           1570733421.187,
  1352.13999986649,  1,           1570733540.923,
  1761.61400008202,  0,           1570733941.343,
  1208.25300002098,  0,           1570734327.11,
   522.296999931335, 1,           1570734376.953,
   241.75,           0,           1570734659.44,
   143.156999826431, 0,           1570734809.673,
  1202.79999995232,  1,           1570734942.907,
   614.640000104904, 1,           1570735526.567
)

# Base survival works fine!
survival_model <- survreg(Surv(wait_time, called_yet) ~ time_queued, 
                          data = train_data,
                          dist = "weibull")

survival_model
#> Call:
#> survreg(formula = Surv(wait_time, called_yet) ~ time_queued, 
#>     data = train_data, dist = "weibull")
#> 
#> Coefficients:
#>   (Intercept)   time_queued 
#>  4.533765e+05 -2.886352e-04 
#> 
#> Scale= 0.518221 
#> 
#> Loglik(model)= -40.2   Loglik(intercept only)= -40.5
#>  Chisq= 0.5 on 1 degrees of freedom, p= 0.48 
#> n= 10

# flexsurvreg can't even get a valid initializer for time_queued, even though
# the doc says it takes the mean of the data
flexsurv_model <- flexsurvreg(Surv(wait_time, called_yet) ~ time_queued,
                              data = train_data,
                              dist = "weibull")
#> Error in flexsurvreg(Surv(wait_time, called_yet) ~ time_queued, data = train_data, : Initial value for parameter 2 out of range

# Maybe the predictor's low variance relative to its magnitude is the problem? So
# let's subtract a large constant (a shift, not a rescale) just to see
train_data %<>% mutate_at("time_queued", subtract, 1.57073e9)

train_data
#> # A tibble: 10 x 3
#>    wait_time called_yet time_queued
#>        <dbl>      <dbl>       <dbl>
#>  1      131.          0       3365.
#>  2      358.          1       3421.
#>  3     1352.          1       3541.
#>  4     1762.          0       3941.
#>  5     1208.          0       4327.
#>  6      522.          1       4377.
#>  7      242.          0       4659.
#>  8      143.          0       4810.
#>  9     1203.          1       4943.
#> 10      615.          1       5527.

# Now it initializes, so that's different... but now it won't converge!
flexsurv_model <- flexsurvreg(Surv(wait_time, called_yet) ~ time_queued,
                              data = train_data,
                              dist = "weibull")
#> Warning in flexsurvreg(Surv(wait_time, called_yet) ~ time_queued, data
#> = train_data, : Optimisation has probably not converged to the maximum
#> likelihood - Hessian is not positive definite.

Created on 2019-10-19 by the reprex package (v0.3.0)

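I mainly want to use flexsurv for its better plotting options and its more standard shape and scale definitions (the ancillary parameters are also very attractive). Right now, though, I mostly just want to know whether I am doing something genuinely wrong and flexsurv is trying to tell me not to trust my base survival model.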

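Marco Sandri pointed out that recentering fixes the problem. However, recentering without rescaling only guarantees that initialization succeeds; if the variance is very large, the fit still does not converge. I consider this a bug, because survival has no trouble with exactly the same model on exactly the same values. I have opened an issue here.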
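
For anyone who wants to try recentering and rescaling together, here is a minimal sketch. It rebuilds the same ten rows with base R and standardizes time_queued with scale() before fitting. The names std_data and time_queued_std are mine, chosen for illustration. I am not claiming this guarantees convergence on other data; it only shows the preprocessing step.

```r
# Same ten observations as in the reprex, rebuilt with base R so this runs on its own.
std_data <- data.frame(
  wait_time = c(131.282999992371, 358.296000003815, 1352.13999986649,
                1761.61400008202, 1208.25300002098, 522.296999931335,
                241.75, 143.156999826431, 1202.79999995232, 614.640000104904),
  called_yet = c(0, 1, 1, 0, 0, 1, 0, 0, 1, 1),
  time_queued = c(1570733365.28, 1570733421.187, 1570733540.923,
                  1570733941.343, 1570734327.11, 1570734376.953,
                  1570734659.44, 1570734809.673, 1570734942.907,
                  1570735526.567)
)

# Center AND scale the covariate: subtract the mean, then divide by the
# standard deviation. time_queued_std is a hypothetical column name.
std_data$time_queued_std <- as.numeric(scale(std_data$time_queued))

# Fit only if flexsurv is installed; loading it also attaches survival,
# which provides Surv().
if (requireNamespace("flexsurv", quietly = TRUE)) {
  library(flexsurv)
  fit_std <- flexsurvreg(Surv(wait_time, called_yet) ~ time_queued_std,
                         data = std_data,
                         dist = "weibull")
  print(fit_std)
}
```

The coefficient on time_queued_std is per standard deviation of time_queued, so dividing it by sd(std_data$time_queued) puts it back on the original per-second scale.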