How to read the indexes from the prediction output of predict.ranger, R

Question

Using the ranger package, I run the following script:

 rf <- ranger(Surv(time, Y) ~ ., data = train_frame[1:50000, ], write.forest = TRUE, num.trees = 100)

test_frame <-  train_frame[50001:100000, ]
preds <- predict(rf, test_frame)
chfs <- preds$chf
plot(chfs[1, ])

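The cumulative hazard function has indexes 1 to 36 on the X axis. Obviously this corresponds to time, but I'm not sure how: my observed time variable ranges from a minimum of 0 to a maximum of 399. What is the mapping between the original data and the prediction output of predict.ranger, and how can I use it to quantify the degree of risk for a given subject after a given length of time?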

Here is a sample of my time/event data:

       Y  time
   <int> <dbl>
1      1   358
2      0    90
3      0   162
4      0    35
5      0   307
6      0    69
7      0   184
8      0    24
9      0   366
10     0    33

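The plot above shows what the CHF looks like for the first subject. Can anyone help me connect the dots? There are no row names or column names on the "matrix" object preds$chf.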

Answer 1

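The prediction object contains a vector called unique.death.times, which holds the time points at which the CHF and survival estimates are computed. The chf matrix has observations in the rows and these time points in the columns; the same is true of survival.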
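In other words, index i on the X axis of the original plot corresponds to preds$unique.death.times[i]. As a small sketch of that mapping, the block below attaches the time points to the matrix as column names. It uses made-up stand-ins (grid_times and chf_demo are hypothetical names and values, not real ranger output) so that it runs without fitting a forest; with a real prediction you would use preds$unique.death.times and preds$chf instead.

```r
# Hypothetical stand-ins for preds$unique.death.times and preds$chf
grid_times <- c(5, 20, 90, 150, 399)
chf_demo <- rbind(c(0.01, 0.05, 0.20, 0.35, 0.60),   # subject 1
                  c(0.02, 0.08, 0.30, 0.50, 0.90))   # subject 2

# One column per time point, so the lengths must agree
stopifnot(ncol(chf_demo) == length(grid_times))

# Label the columns with the time points they belong to
colnames(chf_demo) <- as.character(grid_times)

# CHF of subject 1 at the grid time 90
chf_at_90 <- chf_demo[1, "90"]
```

Labelling the columns only helps for times that are exactly on the grid; for other times a lookup is needed.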

Reproducible example:

library(survival)
library(ranger)

## Split the data
n <- nrow(veteran)
idx <- sample(n, 2/3*n)
train <- veteran[idx, ]
test <- veteran[-idx, ]

## Grow RF and predict
rf <- ranger(Surv(time, status) ~ ., train, write.forest = TRUE)
preds <- predict(rf, test)

## Example CHF plot
plot(preds$unique.death.times, preds$chf[1, ])

## Example survival plot
plot(preds$unique.death.times, preds$survival[1, ])

Setting importance = "impurity" for a survival forest should raise an error.
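To quantify the risk for a subject after a given length of time, one possible approach is to look up the last time point at or before that time. The sketch below assumes the CHF is treated as a step function that stays constant between consecutive time points and is 0 before the first one; that treatment is my assumption, not something stated by ranger. The helper value_at and the demo data are hypothetical and exist only for illustration.

```r
# Look up a step function (e.g. a CHF) at arbitrary times.
#   values: matrix, rows = observations, columns = time points
#   times:  sorted time grid, e.g. preds$unique.death.times
#   t:      times at which the value is wanted
#   before: value used before the first grid time (assumed 0 for a CHF)
value_at <- function(values, times, t, before = 0) {
  stopifnot(ncol(values) == length(times), !is.unsorted(times))
  idx <- findInterval(t, times)          # last grid index with times <= t, 0 if none
  out <- matrix(before, nrow = nrow(values), ncol = length(t))
  has <- idx > 0
  out[, has] <- values[, idx[has], drop = FALSE]
  out
}

# Made-up stand-ins for preds$unique.death.times and preds$chf
times <- c(5, 20, 90, 150, 399)
chf <- rbind(c(0.01, 0.05, 0.20, 0.35, 0.60),
             c(0.02, 0.08, 0.30, 0.50, 0.90))

# CHF of both subjects at t = 1, 100 and 400
res <- value_at(chf, times, t = c(1, 100, 400))
```

With the real prediction object this would be called as value_at(preds$chf, preds$unique.death.times, t = 100). The same lookup could be applied to preds$survival with before = 1, since a survival curve starts at 1.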


r

random-forest