Is there a way to change the threshold of a classification within a model in caret (R)?

I want to change the threshold for my model, and I came across posts such as the Cross Validated thread How to change threshold for classification in R randomForests?

If I change the threshold after creating the model, that means I will have to apply the adjustment again for test data or new data.

Is there a way in R and caret to change the threshold within the model, so that I can run the same model with the same threshold on new data or test data?

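In probabilistic classifiers such as Random Forest, no threshold is involved in fitting the model, and no threshold is associated with the fitted model; hence, there is actually nothing to change. As correctly pointed out in the CV thread Reduce Classification Probability Threshold: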

Choosing a threshold beyond which you classify a new observation as 1 vs. 0 is not part of the statistics any more. It is part of the decision component.

Quoting from my own answer elsewhere:

There is simply no threshold during model training; Random Forest is a probabilistic classifier, and it only outputs class probabilities. "Hard" classes (i.e. 0/1), which indeed require a threshold, are neither produced nor used in any stage of the model training - only during prediction, and even then only in the cases we indeed require a hard classification (not always the case). Please see for more details.

So, if you produce predictions from a fitted model, say rf, with the argument type = "prob", as shown in the CV thread you linked to:

pred <- predict(rf, mydata, type = "prob")

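these predictions will be probability values in [0, 1], and not hard classes 0/1. From here, you are free to choose your threshold as shown in the answer there, i.e.: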

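thresh <- 0.6  # any desired value in [0, 1]
# pred has one column per class; the positive class is assumed here to be the second column
prob_pos <- pred[, 2]
class_pred <- ifelse(prob_pos > thresh, 1, 0)  # 1 if above the threshold, else 0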

or of course experiment with different thresholds, without needing to change anything in the model itself.
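To apply the same threshold consistently to test data and new data, one option is to keep the chosen threshold next to the fitted model and route every prediction through a single helper. The following is only a minimal sketch: a base-R logistic regression (glm) stands in for rf or a caret model so that the example runs without extra packages, and the names apply_threshold, predict_class and classifier are hypothetical, not part of caret or randomForest.

```r
# Hypothetical helper: turn positive-class probabilities into hard 0/1 labels.
apply_threshold <- function(prob_pos, thresh = 0.5) {
  stopifnot(thresh >= 0, thresh <= 1)
  as.integer(prob_pos > thresh)
}

# Toy data; a logistic regression is used here as a stand-in probabilistic classifier.
set.seed(42)
train <- data.frame(x = rnorm(200))
train$y <- rbinom(200, 1, plogis(2 * train$x))
fit <- glm(y ~ x, data = train, family = binomial)

# Bundle the fitted model with the chosen threshold so both travel together.
classifier <- list(model = fit, thresh = 0.6)

# Hypothetical wrapper: predict probabilities, then apply the stored threshold.
predict_class <- function(clf, newdata) {
  prob_pos <- predict(clf$model, newdata, type = "response")
  apply_threshold(prob_pos, clf$thresh)
}

new_data <- data.frame(x = c(-2, 0, 2))
predict_class(classifier, new_data)
```

With a randomForest or caret model, the line inside predict_class would instead call predict(clf$model, newdata, type = "prob") and select the column of the positive class. Trying a different threshold then only means changing classifier$thresh; the fitted model itself is untouched.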