Using pre-defined train and test sets in a benchmark in mlr3
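I would like to compare several machine learning algorithms on a classification task using the benchmark_grid() function in mlr3. According to https://mlr3book.mlr-org.com/benchmarking.html, benchmark_grid() takes a resampling scheme to split the data in the task into training and test data. However, I would like to use a manual partitioning. How can I specify the training and test sets manually when using benchmark_grid()?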
Edit: code example based on the suggestion from pat-s:
# use benchmark() from mlr3 to compare different classification models on the iris data set using a manually
# pre-defined partitioning into training and test data sets (hold-out sampling)
library("mlr3verse")
# Instantiate Task
task = tsk("iris")
# Instantiate Custom Resampling
# hold-out sample with pre-defined partitioning into train and test set
custom = rsmp("custom")
train_sets = list(1:120)
test_sets = list(121:150)
custom$instantiate(task, train_sets, test_sets)
design = benchmark_grid(
  tasks = task,
  learners = lrns(c("classif.ranger", "classif.rpart", "classif.featureless"),
    predict_type = "prob", predict_sets = c("train", "test")),
  resamplings = custom
)
print(design)
# execute the benchmark
bmr = benchmark(design)
measure = msr("classif.acc")
tab = bmr$aggregate(measure)
print(tab)
You can use the "custom_cv" resampling scheme.
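As a rough sketch of what that could look like: rsmp("custom_cv") builds its folds from a factor instead of the explicit row-id lists that rsmp("custom") takes in the question's code. The fold labels "a" and "b" below are made up for illustration, and only learners that ship with mlr3 itself are used (classif.rpart additionally needs the rpart package). Note that every factor level serves as the test set once, so two levels give two iterations rather than a single hold-out split; for a single pre-defined hold-out, rsmp("custom") as in the question's edit is likely the better fit.

```r
library(mlr3)

task = tsk("iris")

# Assumption: a made-up grouping that puts rows 1-120 in fold "a"
# and rows 121-150 in fold "b"
folds = factor(rep(c("a", "b"), times = c(120, 30)))

# Instantiate the resampling from the factor; each level becomes one test set
cv = rsmp("custom_cv")
cv$instantiate(task, f = folds)

# Number of iterations equals the number of factor levels
print(cv$iters)

# Use the instantiated resampling in a benchmark design
design = benchmark_grid(
  tasks = task,
  learners = lrns(c("classif.rpart", "classif.featureless")),
  resamplings = cv
)

bmr = benchmark(design)
tab = bmr$aggregate(msr("classif.acc"))
print(tab)
```

The row ids used in each iteration can be checked with cv$train_set(i) and cv$test_set(i) before running the benchmark.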