MLR - Resampling of Cox model
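If I train a single Cox PH model in mlr, I can print a summary showing the statistical significance of each predictor, as shown below. But if I use resampling, e.g. 5-fold cross-validation, is there any way to get this information, either aggregated over the 5 iterations or just for each iteration separately?
Training a single model: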
surv.task <- makeSurvTask(data = bb_imp, target = c("timeToEvent", "status"))
surv.lrn <- makeLearner(cl="surv.coxph", predict.type="response")
blood_coxph_base <- train(surv.lrn, surv.task)
mod <- getLearnerModel(blood_coxph_base)
summary(mod)
Call:
survival::coxph(formula = f, data = data)
n= 873, number of events= 82
coef exp(coef) se(coef) z Pr(>|z|)
Glucose 1.336e-01 1.143e+00 1.320e-01 1.012 0.311516
Urate -1.293e+00 2.745e-01 2.227e+00 -0.581 0.561572
HDLChol -7.635e-01 4.660e-01 1.670e+00 -0.457 0.647556
LDLChol -3.796e-01 6.841e-01 1.645e+00 -0.231 0.817495
HCS 8.513e-02 1.089e+00 2.830e-02 3.009 0.002625 **
CARO 1.681e-01 1.183e+00 1.875e-01 0.897 0.369947
CRP 4.701e-02 1.048e+00 4.232e-02 1.111 0.266691
Creatinine -8.598e-03 9.914e-01 6.450e-03 -1.333 0.182541
Fructosamine 1.022e-02 1.010e+00 3.041e-03 3.360 0.000780 ***
IL1 -1.225e-01 8.847e-01 7.212e-02 -1.699 0.089396 .
IL8 -2.137e-03 9.979e-01 1.059e-02 -0.202 0.840124
Insulin -3.182e-02 9.687e-01 2.323e-02 -1.370 0.170685
MIC1 7.394e-04 1.001e+00 2.071e-04 3.571 0.000356 ***
VitD -2.104e-05 1.000e+00 4.867e-03 -0.004 0.996551
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Concordance= 0.713 (se = 0.033 )
Rsquare= 0.067 (max possible= 0.697 )
Likelihood ratio test= 60.37 on 31 df, p=0.001
Wald test = 64.67 on 31 df, p=4e-04
Score (logrank) test = 65.66 on 31 df, p=3e-04
Resampling:
rdesc <- makeResampleDesc(method="CV", iters=5, stratify=TRUE)
r = resample("surv.coxph", surv.task, rdesc, models=TRUE)
r
Resample Result
Task: bb_imp
Learner: surv.coxph
Aggr perf: cindex.test.mean=0.5854492
Runtime: 0.149925
I know that setting models=TRUE saves the individual models, but I'm not sure how to access them. I tried
summary(r$models[1])
but only got:
Length Class Mode
[1,] 8 WrappedModel list
r$models is a list, so you have to run summary(getLearnerModel(r$models[[1]])).
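The reason summary(r$models[1]) only prints a Length/Class/Mode table is that single-bracket indexing returns a one-element list rather than the element itself. A minimal base-R illustration, using a made-up list as a stand-in for r$models (the contents are hypothetical, only the class name is taken from the output above):

```r
# Hypothetical stand-in for r$models: a list whose elements carry a class.
models <- list(
  structure(list(a = 1), class = "WrappedModel"),
  structure(list(a = 2), class = "WrappedModel")
)

sub  <- models[1]    # single brackets: still a list, of length 1
elem <- models[[1]]  # double brackets: the element itself

class(sub)   # "list"
class(elem)  # "WrappedModel"
```

Because sub is a plain list, summary() dispatches to the default list method, which is why only the length, class and mode are shown.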
Or you can use lapply or a similar function:
all_summaries = lapply(r$models, function(x) summary(getLearnerModel(x)))
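For the "aggregated over the 5 iterations" part of the question, one possible approach is to stack the coefficient tables of the per-fold models and summarise them per predictor. The sketch below is only an illustration: collect_cox_coefs is a hypothetical helper, not part of mlr, and to stay runnable without the original data it fits the folds by hand on survival::lung. With mlr, the list of fits would presumably come from lapply(r$models, getLearnerModel).

```r
library(survival)

# Hypothetical helper: stack the coefficient tables of several coxph fits
# into one long data frame, one row per (fold, term).
collect_cox_coefs <- function(fits) {
  per_fold <- lapply(seq_along(fits), function(i) {
    tab <- summary(fits[[i]])$coefficients
    data.frame(
      fold = i,
      term = rownames(tab),
      coef = tab[, "coef"],
      se   = tab[, "se(coef)"],
      p    = tab[, "Pr(>|z|)"],
      row.names = NULL,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, per_fold)
}

# With mlr (assumption): fits <- lapply(r$models, getLearnerModel)
# Stand-in so the sketch runs on its own: 5 hand-rolled folds on survival::lung.
set.seed(1)
d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog")])
fold_id <- sample(rep(1:5, length.out = nrow(d)))
fits <- lapply(1:5, function(k) {
  # Fit on the training part of fold k (all rows not in fold k)
  coxph(Surv(time, status) ~ age + sex + ph.ecog, data = d[fold_id != k, ])
})

coefs <- collect_cox_coefs(fits)

# Mean coefficient and mean p-value per predictor across the folds
agg <- aggregate(cbind(coef, p) ~ term, data = coefs, FUN = mean)
agg
```

Treat the aggregated table as descriptive only: the training sets of the folds overlap, so the per-fold fits are not independent and an averaged p-value is not a valid test on its own.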