R mlr surv.ranger Error in `[.data.frame`(num.response, x == y) : undefined columns selected

Prepare the data ("ovarian" from the survival package):

require(pacman)
p_load(mlr, survival, tidyverse, ranger)
data("ovarian")
ovarian$rx <- factor(ovarian$rx, 
                     levels = c("1", "2"), 
                     labels = c("A", "B"))
ovarian$resid.ds <- factor(ovarian$resid.ds, 
                           levels = c("1", "2"), 
                           labels = c("no", "yes"))
ovarian$ecog.ps <- factor(ovarian$ecog.ps, 
                          levels = c("1", "2"), 
                          labels = c("good", "bad"))
ovarian <- ovarian %>% mutate(age_group = ifelse(age >=50, "old", "young"))
ovarian$age_group <- factor(ovarian$age_group)

Now run surv.ranger from the 'mlr' package:

trainTask <- makeSurvTask(data = ovarian, target = c("futime", "fustat"))
trainLearner <- makeLearner("surv.ranger", predict.type = "response")
train(trainLearner,trainTask)
Error in `[.data.frame`(num.response, x == y) : 
  undefined columns selected

Why does this error occur, and how can I fix it?

I then tried another example dataset ("lung.task" from the mlr package), but got a different error:

trainLearner <- makeLearner("surv.ranger", predict.type = "response")
train(trainLearner,lung.task) # lung.task is from mlr package
Error in ranger::ranger(formula = NULL, dependent.variable.name = tn[1L],  : 
  argument ".weights" is missing, with no default

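It took me a while to track down what is failing here. The error comes from the respect.unordered.factors argument in the ranger package; calling ranger directly with the same setting fails as well: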

ranger::ranger(formula = NULL, dependent.variable.name = "futime", status.variable.name = "fustat", data = ovarian, respect.unordered.factors = "order")

As a temporary workaround, you can set it to another value:

# Either of the two other values that ranger accepts avoids the failing "order" setting
lrn <- makeLearner("surv.ranger", predict.type = "response", respect.unordered.factors = "partition")
lrn <- makeLearner("surv.ranger", predict.type = "response", respect.unordered.factors = "ignore")
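
The override above can be checked before training. The following is a minimal sketch, assuming mlr and ranger are both installed; whether your mlr version presets respect.unordered.factors on this learner is an assumption, so the first print is there to let you confirm it rather than to promise a particular value.

```r
library(mlr)

# Build the learner as in the question and show the hyperparameters mlr presets.
# If respect.unordered.factors appears here, the setting comes from mlr,
# not from anything in the calling code.
lrn_default <- makeLearner("surv.ranger", predict.type = "response")
print(getHyperPars(lrn_default))

# Override the factor handling on the existing learner object.
lrn <- setHyperPars(lrn_default, respect.unordered.factors = "partition")

# Confirm that the override took effect.
print(getHyperPars(lrn)$respect.unordered.factors)
```

The resulting lrn can then be passed to train() together with the task from the question.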

Edit: In the latest version on GitHub this error no longer occurs. To install it, run the following command and restart R:

devtools::install_github("imbs-hl/ranger")
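
After restarting R, it may be worth confirming which versions are actually being loaded. This is a small sketch using base R only; pkg_version_string is a hypothetical helper name introduced here for illustration, and the source does not say which version number contains the fix, so compare the output against the linked issue yourself.

```r
# Hypothetical helper: return the installed version of a package as a string,
# or NA if the package is not installed.
pkg_version_string <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA_character_)
  }
  as.character(utils::packageVersion(pkg))
}

# Print the versions relevant to this error.
cat("ranger:", pkg_version_string("ranger"), "\n")
cat("mlr:   ", pkg_version_string("mlr"), "\n")
```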

See also: https://github.com/imbs-hl/ranger/issues/359