来自 "survival" 包的 survfit 函数错误:"Error in strata(mf[ll]) : all arguments must be the same length"

Error in survfit function from "survival" package: "Error in strata(mf[ll]) : all arguments must be the same length"

我有模拟数据,我想得到 Kaplan Meier 估计值。

我有 500 个观察结果, Y = 观察时间,Delta = 状态 x_bin = 二元处理变量和 x_cont= 2 个连续变量

Y = c(23.076677,33.856999,0.587694,44.213549,9.027398,6.466811,
      13.053572,  2.846332,30.895564 ,4.062382,16.524918 , 6.220079, 17.547434, 4.529800,
      30.683129 ,25.443511 ,16.130519, 21.840292, 19.827314, 23.840220,22.518815,  9.873650, 
      6.585225 ,24.485416 ,5.811711 ,19.230248 , 6.344504 ,27.159498,  8.615084 , 7.899020, 
      9.224606, 43.429181 ,19.130139, 43.180901, 13.239691, 21.946553,29.469361,
      13.792664,1.706786, 11.230684, 12.433856,  9.284416 ,36.884566 ,11.456953, 16.747181 ,21.003923 ,
      41.090373, 18.944196, 31.675754 ,34.103413 ,19.433604 ,40.876068, 17.530126, 25.250155,
      6.896457, 29.314967 ,6.465073,46.352824,15.591029,19.635961,24.107908,9.227189,14.164096,0.059026,
      37.723229, 50.015481, 28.065238,30.262120 ,61.420504, 14.084382, 25.982968,15.213584,15.505326,
      0.653056, 13.018299,1.575673, 18.589050, 0.000299,16.982758,10.798754,8.100633 ,5.362828 ,
      0.453016 , 3.755654, 21.089715, 57.229954,0.141664, 25.948761, 87.196375 , 6.240832, 39.569735, 
      79.732973 ,17.317158 ,15.658974, 20.406179,18.944196, 31.675754 ,34.103413 ,19.43360,20.58367)

delta = rbinom(100,1,0.3)
x_cont= matrix(rnorm(200,0.2,0.25),100,2)
x_cont= as.matrix(data.frame(x_cont))
treated= factor(rbinom(100,1,0.5))
km_model=survfit(Surv(Y,delta) ~ treated +x_cont)

错误信息:

Error in strata(mf[ll]): all arguments must be the same length

如果我只取一个连续变量那么它可以工作,但如果我取多个连续变量就会给我这个错误。

TIA

我们可能需要 data 作为 data.frame。根据?survfit.formula

data - a data frame in which to interpret the variables named in the formula, subset and weights arguments.

out <- survfit(Surv(Y,delta) ~ treated +X1 + X2,
      data = data.frame(x_cont, treated))

-检查输出

> str(out)
List of 17
 $ n        : int [1:100] 1 1 1 1 1 1 1 1 1 1 ...
 $ time     : num [1:100] 15.214 19.23 14.084 0.142 5.812 ...
 $ n.risk   : num [1:100] 1 1 1 1 1 1 1 1 1 1 ...
 $ n.event  : num [1:100] 1 1 0 1 0 0 0 0 0 1 ...
 $ n.censor : num [1:100] 0 0 1 0 1 1 1 1 1 0 ...
 $ surv     : num [1:100] 0 0 1 0 1 1 1 1 1 0 ...
 $ std.err  : num [1:100] Inf Inf 0 Inf 0 ...
 $ cumhaz   : num [1:100] 1 1 0 1 0 0 0 0 0 1 ...
 $ std.chaz : num [1:100] 1 1 0 1 0 0 0 0 0 1 ...
 $ strata   : Named int [1:100] 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "names")= chr [1:100] "treated=0, X1=-0.297081842241384  , X2=0.105548785177934  " "treated=0, X1=-0.296945987489856  , X2=0.180710460866624  " "treated=0, X1=-0.252387685923698  , X2=-0.0657794268622904" "treated=0, X1=-0.242585454493932  , X2=0.69364514503322   " ...
 $ type     : chr "right"
 $ logse    : logi TRUE
 $ conf.int : num 0.95
 $ conf.type: chr "log"
 $ lower    : num [1:100] NA NA 1 NA 1 1 1 1 1 NA ...
 $ upper    : num [1:100] NA NA 1 NA 1 1 1 1 1 NA ...
 $ call     : language survfit(formula = Surv(Y, delta) ~ treated + X1 + X2, data = data.frame(x_cont, treated))
 - attr(*, "class")= chr "survfit"

更新

如果我们有更多列,请考虑使用 paste

创建表达式
dat1 <- data.frame(x_cont, treated)
expr1 <- as.formula(paste("Surv(Y,delta) ~ ", paste(names(dat1), collapse="+")))
out1 <- survfit(expr1, data = dat1)
 out1$call <- deparse(expr1)

-检查结构

> str(out1)
List of 17
 $ n        : int [1:100] 1 1 1 1 1 1 1 1 1 1 ...
 $ time     : num [1:100] 15.214 19.23 21.09 14.084 0.142 ...
 $ n.risk   : num [1:100] 1 1 1 1 1 1 1 1 1 1 ...
 $ n.event  : num [1:100] 1 1 0 0 1 1 0 0 1 0 ...
 $ n.censor : num [1:100] 0 0 1 1 0 0 1 1 0 1 ...
 $ surv     : num [1:100] 0 0 1 1 0 0 1 1 0 1 ...
 $ std.err  : num [1:100] Inf Inf 0 0 Inf ...
 $ cumhaz   : num [1:100] 1 1 0 0 1 1 0 0 1 0 ...
 $ std.chaz : num [1:100] 1 1 0 0 1 1 0 0 1 0 ...
 $ strata   : Named int [1:100] 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "names")= chr [1:100] "X1=-0.297081842241384, X2=0.105548785177934  , treated=0" "X1=-0.296945987489856, X2=0.180710460866624  , treated=0" "X1=-0.292560489804895, X2=0.374632415815526  , treated=1" "X1=-0.252387685923698, X2=-0.0657794268622904, treated=0" ...
 $ type     : chr "right"
 $ logse    : logi TRUE
 $ conf.int : num 0.95
 $ conf.type: chr "log"
 $ lower    : num [1:100] NA NA 1 1 NA NA 1 1 NA 1 ...
 $ upper    : num [1:100] NA NA 1 1 NA NA 1 1 NA 1 ...
 $ call     : chr "Surv(Y, delta) ~ X1 + X2 + treated"
 - attr(*, "class")= chr "survfit"