MLR: How can I wrap the selection of specified features around the learner?
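I want to compare simple logistic regression models, where each model considers only a specified set of features. I want to compare these regression models on resamplings of the data.
The R package mlr lets me select columns at the task level using dropFeatures. The code would be something like: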
full_task = makeClassifTask(id = "full task", data = my_data, target = "target")
reduced_task = dropFeatures(full_task, setdiff( getTaskFeatureNames(full_task), list_feat_keep))
Then I can run a benchmark experiment over a list of tasks:
lrn = makeLearner("classif.logreg", predict.type = "prob")
rdesc = makeResampleDesc(method = "Bootstrap", iters = 50, stratify = TRUE)
bmr = benchmark(lrn, list(full_task, reduced_task), rdesc, measures = auc, show.info = FALSE)
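How can I instead create a learner that considers only a specified set of features? As far as I know, the filter and feature selection methods always apply some statistical procedure and do not allow selecting features directly. Thank you!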
The first solution is a lazy one and also not optimal, because the filter calculation is still carried out:
library(mlr)
task = sonar.task
sel.feats = c("V1", "V10")
lrn = makeLearner("classif.logreg", predict.type = "prob")
lrn.reduced = makeFilterWrapper(learner = lrn, fw.method = "variance", fw.abs = 2, fw.mandatory.feat = sel.feats)
bmr = benchmark(list(lrn, lrn.reduced), task, cv3, measures = auc, show.info = FALSE)
The second one uses a preprocessing wrapper to subset the data. It should be the fastest solution and is also more flexible:
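lrn.reduced.2 = makePreprocWrapper(
  learner = lrn,
  # training: keep only the selected features plus the target column
  train = function(data, target, args) list(data = data[, c(sel.feats, target), drop = FALSE], control = list()),
  # prediction: keep only the selected features (drop = FALSE so a single feature stays a data.frame)
  predict = function(data, target, args, control) data[, sel.feats, drop = FALSE]
)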
bmr = benchmark(list(lrn, lrn.reduced.2), task, cv3, measures = auc, show.info = FALSE)
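Since the question asks about comparing several models, each with its own feature set, the preprocessing wrapper can be generalized a little. The following is only a sketch: `makeSubsetFuns`, `feature.sets`, the toy data frame and the learner ids are names made up for illustration, not part of mlr. The mlr part is guarded so the base-R part still runs if the package is not installed. Each wrapped learner gets its own id via `setLearnerId`, because `benchmark` expects learner ids to be unique.

```r
# Hypothetical helper: builds the train/predict pair for one fixed feature set.
makeSubsetFuns = function(feats) {
  force(feats)
  list(
    train = function(data, target, args) {
      # keep the chosen features plus the target column
      list(data = data[, c(feats, target), drop = FALSE], control = list())
    },
    predict = function(data, target, args, control) {
      # new data has no target column, so only the features are selected
      data[, feats, drop = FALSE]
    }
  )
}

# Base-R check on a made-up toy data frame
toy = data.frame(V1 = 1:3, V2 = 4:6, V10 = 7:9,
  Class = factor(c("M", "R", "M")))
funs = makeSubsetFuns(c("V1", "V10"))
train.out = funs$train(toy, target = "Class", args = list())
pred.out = funs$predict(toy[, c("V1", "V2", "V10")], target = "Class",
  args = list(), control = train.out$control)
print(names(train.out$data))
print(names(pred.out))

# One wrapped learner per feature set (feature sets chosen arbitrarily here)
if (requireNamespace("mlr", quietly = TRUE)) {
  library(mlr)
  feature.sets = list(small = c("V1", "V10"),
    medium = c("V1", "V10", "V11", "V12"))
  base.lrn = makeLearner("classif.logreg", predict.type = "prob")
  lrns = lapply(names(feature.sets), function(nm) {
    f = makeSubsetFuns(feature.sets[[nm]])
    wrapped = makePreprocWrapper(base.lrn, train = f$train, predict = f$predict)
    setLearnerId(wrapped, paste0("logreg.", nm))  # unique id per feature set
  })
  bmr = benchmark(c(list(base.lrn), lrns), sonar.task, cv3,
    measures = auc, show.info = FALSE)
  print(bmr)
}
```

With this layout a single task is enough, and adding another model to the comparison only means adding another entry to `feature.sets`.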