mlr: Filter Methods with Tuning
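This section of the mlr tutorial: https://mlr.mlr-org.com/articles/tutorial/nested_resampling.html#filter-methods-with-tuning explains how to combine a TuneWrapper with a FilterWrapper to tune the filter threshold. But what if the filter itself has hyperparameters that need tuning, as a random forest variable importance filter does? I can't seem to tune any parameter other than the threshold.
For example: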
library(survival)
library(mlr)
data(veteran)
set.seed(24601)
task_id = "MAS"
mas.task <- makeSurvTask(id = task_id, data = veteran, target = c("time", "status"))
mas.task <- createDummyFeatures(mas.task)
tuning = makeResampleDesc("CV", iters=5, stratify=TRUE) # Tuning: 5-fold CV, no repeats
cox.filt.rsfrc.lrn = makeTuneWrapper(
  makeFilterWrapper(
    makeLearner(cl="surv.coxph", id = "cox.filt.rfsrc", predict.type="response"),
    fw.method="randomForestSRC_importance",
    cache=TRUE,
    ntree=2000
  ),
  resampling = tuning,
  par.set = makeParamSet(
    makeIntegerParam("fw.abs", lower=2, upper=10),
    makeIntegerParam("mtry", lower = 5, upper = 15),
    makeIntegerParam("nodesize", lower=3, upper=25)
  ),
  control = makeTuneControlRandom(maxit=20),
  show.info = TRUE)
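This produces the error message:
Error in checkTunerParset(learner, par.set, measures, control) :
  Can only tune parameters for which learner parameters exist: mtry,nodesize
Is there any way to tune the hyperparameters of the random forest?
EDIT: Further attempts, following the suggestions in the comments:
Wrapping the tuner around the base learner before passing it to the filter (filter not shown) - fails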
cox.lrn = makeLearner(cl="surv.coxph", id = "cox.filt.rfsrc", predict.type="response")
cox.tune = makeTuneWrapper(cox.lrn,
  resampling = tuning,
  measures=list(cindex),
  par.set = makeParamSet(
    makeIntegerParam("mtry", lower = 5, upper = 15),
    makeIntegerParam("nodesize", lower=3, upper=25),
    makeIntegerParam("fw.abs", lower=2, upper=10)
  ),
  control = makeTuneControlRandom(maxit=20),
  show.info = TRUE)
Error in checkTunerParset(learner, par.set, measures, control) :
  Can only tune parameters for which learner parameters exist: mtry,nodesize,fw.abs
Two-level tuning - fails
cox.lrn = makeLearner(cl="surv.coxph", id = "cox.filt.rfsrc", predict.type="response")
cox.filt = makeFilterWrapper(cox.lrn,
  fw.method="randomForestSRC_importance",
  cache=TRUE,
  ntree=2000)
cox.tune = makeTuneWrapper(cox.filt,
  resampling = tuning,
  measures=list(cindex),
  par.set = makeParamSet(
    makeIntegerParam("fw.abs", lower=2, upper=10)
  ),
  control = makeTuneControlRandom(maxit=20),
  show.info = TRUE)
cox.tune2 = makeTuneWrapper(cox.tune,
  resampling = tuning,
  measures=list(cindex),
  par.set = makeParamSet(
    makeIntegerParam("mtry", lower = 5, upper = 15),
    makeIntegerParam("nodesize", lower=3, upper=25)
  ),
  control = makeTuneControlRandom(maxit=20),
  show.info = TRUE)
Error in makeBaseWrapper(id, learner$type, learner, learner.subclass = c(learner.subclass, :
  Cannot wrap a tuning wrapper around another optimization wrapper!
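It seems that you currently cannot tune the hyperparameters of a filter. You can set them manually by passing the relevant arguments to makeFilterWrapper(), but you cannot tune them.
When filtering, you can only tune one of fw.abs, fw.perc, or fw.threshold.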
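To make the three thresholds concrete, here is a minimal base R sketch of the selection rule each one stands for. It is not mlr's implementation: `select_features()` and its arguments `fw_abs`, `fw_perc`, and `fw_threshold` are hypothetical names that only mirror the mlr parameters, the scores are made up, and the rounding and the `>=` comparison are assumptions.

```r
# Made-up filter scores for five features (illustration only)
scores <- c(karno = 0.90, age = 0.40, diagtime = 0.15, prior = 0.05, trt = 0.01)

# Hypothetical helper: select features by exactly one of the three rules
select_features <- function(scores, fw_abs = NULL, fw_perc = NULL,
                            fw_threshold = NULL) {
  n_rules <- sum(!is.null(fw_abs), !is.null(fw_perc), !is.null(fw_threshold))
  if (n_rules != 1) {
    stop("Specify exactly one of fw_abs, fw_perc, fw_threshold")
  }
  ranked <- sort(scores, decreasing = TRUE)
  if (!is.null(fw_abs)) {
    keep <- head(ranked, fw_abs)                           # top-n features
  } else if (!is.null(fw_perc)) {
    keep <- head(ranked, round(fw_perc * length(ranked)))  # top fraction (rounding assumed)
  } else {
    keep <- ranked[ranked >= fw_threshold]                 # minimum score (>= assumed)
  }
  names(keep)
}

top_abs  <- select_features(scores, fw_abs = 2)
top_perc <- select_features(scores, fw_perc = 0.6)
top_thr  <- select_features(scores, fw_threshold = 0.1)
print(list(abs = top_abs, perc = top_perc, threshold = top_thr))
```

Since all three rules cut the same ranked list, only one of them can be active at a time, which is why a tuning parameter set should contain just one of them.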
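I don't know how much the ranking changes when the random forest filter uses different hyperparameters. One way to check its robustness is to fit single RF models with different settings for mtry and friends, and compare the resulting rankings with the help of getFeatureImportance(). If the rank correlation between them is very high, you can safely skip tuning the RF filter. (Maybe you want to use a different filter that does not have this issue at all?)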
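The robustness check could be sketched roughly as follows. This is an illustration under stated assumptions, not the method from the answer: `rank_agreement()` is a hypothetical helper, the toy importances are made up, and the guarded part calls randomForestSRC::rfsrc() directly (with an arbitrary ntree = 500 and two hand-picked settings) instead of going through mlr's getFeatureImportance().

```r
# Hypothetical helper: Spearman rank correlation between importance vectors.
# `imp_list` is a named list of named numeric vectors over the same features.
rank_agreement <- function(imp_list) {
  feats <- names(imp_list[[1]])
  # Align every vector to the same feature order before comparing
  imp_mat <- sapply(imp_list, function(imp) imp[feats])
  # Spearman correlation compares ranks rather than raw importance values
  cor(imp_mat, method = "spearman")
}

# Toy importances standing in for real filter output (made-up numbers)
imp_list <- list(
  mtry3 = c(age = 0.10, karno = 0.90, diagtime = 0.05, prior = 0.02),
  mtry5 = c(karno = 0.80, age = 0.20, prior = 0.01, diagtime = 0.04)
)
agreement <- rank_agreement(imp_list)
print(agreement)

# Same check on real fits; runs only if both packages are installed
if (requireNamespace("randomForestSRC", quietly = TRUE) &&
    requireNamespace("survival", quietly = TRUE)) {
  library(survival)  # provides Surv() for the model formula
  veteran <- survival::veteran
  set.seed(24601)
  # Two arbitrary hyperparameter settings to compare (assumed values)
  settings <- list(
    small = list(mtry = 2, nodesize = 3),
    large = list(mtry = 5, nodesize = 25)
  )
  imp_list_rf <- lapply(settings, function(s) {
    fit <- randomForestSRC::rfsrc(
      Surv(time, status) ~ ., data = veteran,
      ntree = 500, mtry = s$mtry, nodesize = s$nodesize,
      importance = TRUE
    )
    fit$importance  # named vector of variable importances
  })
  print(rank_agreement(imp_list_rf))
}
```

Off-diagonal values close to 1 would suggest that the ranking, and therefore the set of features the filter keeps, barely depends on the settings compared.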
If you insist on having this feature, you will probably need to open a PR for the package :)
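# Filter hyperparameters are fixed by hand here; only fw.abs is tuned.
# Assumes `tuning` (the resampling description) is defined as in the question.
lrn = makeLearner(cl = "surv.coxph", id = "cox.filt.rfsrc", predict.type = "response")
filter_wrapper = makeFilterWrapper(
  lrn,
  fw.method = "randomForestSRC_importance",
  cache = TRUE,
  ntree = 2000  # rfsrc() names this argument `ntree`, as in the question
)
cox.filt.rsfrc.lrn = makeTuneWrapper(
  filter_wrapper,
  resampling = tuning,
  par.set = makeParamSet(
    makeIntegerParam("fw.abs", lower = 2, upper = 10)
  ),
  control = makeTuneControlRandom(maxit = 20),
  show.info = TRUE)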