Is it necessary to center and scale data before predicting?

Question

In the caret package's train function, the predictors can be centered and scaled, as in the following example:

knnFit <- train(Direction ~ ., data = training, method = "knn",
                preProcess = c("center","scale"))

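Specifying this transformation inside train should give a better estimate of the algorithm's performance during resampling.

In that case, when I use the model to predict the response for new data, do I need to take care of centering and scaling myself, or is that operation already included in the final model?

Is the following sufficient?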

pred <- predict(knnFit, newdata = test)

Thanks!

Answer 1

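The preProcess specified in the train object is applied to new data automatically when you call predict on that object, so you do not need to preprocess the new data first. Your call is sufficient.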
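To make concrete what "applied to new data" means, here is a minimal base R sketch of centering and scaling, assuming the usual convention that the means and standard deviations are estimated on the training predictors and then reused for new data. The data frames train_x and test_x and their values are hypothetical, and this illustrates the idea only; it is not caret's internal code.

```r
# Toy predictors standing in for training and new data (hypothetical values).
train_x <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))
test_x  <- data.frame(a = c(2.5, 5), b = c(25, 50))

# Estimate the centering and scaling statistics on the training data only.
centers <- colMeans(train_x)
scales  <- apply(train_x, 2, sd)

# Apply the same training-set statistics to both data sets.
train_scaled <- scale(train_x, center = centers, scale = scales)
test_scaled  <- scale(test_x,  center = centers, scale = scales)
```

The new data is never used to compute its own mean or standard deviation, which is why the transformation has to travel with the fitted model.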

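Also take a look at the excerpt below from the caret website. There is a whole section devoted purely to preprocessing, and it is definitely worth your time to read through.

You can find the caret website here.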

These processing steps would be applied during any predictions generated using predict.train, extractPrediction or extractProbs (see details later in this document). The pre-processing would not be applied to predictions that directly use the object$finalModel object.
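The difference described in the excerpt can be sketched as follows. This is an illustrative example, not part of the original answer: it uses the built-in iris data as a stand-in for the question's data set, and it assumes the fitted train object stores its preprocessing recipe in the preProcess element and that the underlying knn3 model accepts type = "class".

```r
library(caret)

set.seed(1)
# iris stands in for the question's data; the split is arbitrary.
idx      <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
training <- iris[idx, ]
test     <- iris[-idx, ]

knnFit <- train(Species ~ ., data = training, method = "knn",
                preProcess = c("center", "scale"))

# predict() on the train object centers and scales the new data for you.
pred <- predict(knnFit, newdata = test)

# Using finalModel directly bypasses that step, so the stored
# preprocessing has to be applied by hand first.
test_x      <- predict(knnFit$preProcess, test[, 1:4])
pred_manual <- predict(knnFit$finalModel, newdata = test_x, type = "class")
```

Passing the raw test predictors to knnFit$finalModel would compute distances on unscaled values against a model trained on scaled ones, so in most cases calling predict on the train object is the safer choice.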

r

r-caret