How to apply pipeline_smote just on training set in mlr3pipelines?

Question

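I am using mlr3 to work with an imbalanced dataset that has a two-class response variable. I want to apply SMOTE to oversample the minority class. I understand that this method should be applied only to the training set, not to the test set. However, if I have not misunderstood, an mlr3 pipeline operates on the whole dataset before the task is set up, and the dataset is split into training and test sets along the way. How can I apply SMOTE (mlr_pipeops_smote) to the training set only?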

Answer 1

It is automatically applied to the training set only; see the documentation:

The output during prediction is the unchanged input.
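
A minimal sketch of how this could be checked, not taken from the original answer. It assumes the built-in `sonar` task as a stand-in for the asker's data (binary target, numeric features only), `dup_size = 1` as an illustrative setting, and that the `smotefamily` and `rpart` packages are installed:

```r
library(mlr3)
library(mlr3pipelines)  # po("smote") also needs the smotefamily package installed

# Assumption: the built-in "sonar" task stands in for the asker's data
# (binary target, numeric features only, which SMOTE requires).
task = tsk("sonar")

# dup_size = 1 is an illustrative choice: roughly one synthetic row per minority row.
smote = po("smote", dup_size = 1)

n_orig  = task$nrow
n_train = smote$train(list(task))[[1]]$nrow    # training phase: synthetic rows are appended
n_pred  = smote$predict(list(task))[[1]]$nrow  # prediction phase: input passes through unchanged

print(c(original = n_orig, after_train = n_train, after_predict = n_pred))

# Wrapped in a GraphLearner, SMOTE is re-fitted on the training rows of each fold,
# so the held-out rows are never oversampled.
learner = as_learner(po("smote", dup_size = 1) %>>% lrn("classif.rpart"))
rr = resample(task, learner, rsmp("cv", folds = 3))
print(rr$aggregate(msr("classif.ce")))
```

The row count should grow only after `$train()`, while `$predict()` should return as many rows as it was given. Putting the PipeOp inside the learner, rather than transforming the task up front, is what keeps the oversampling out of the test folds during resampling.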

r

mlr3