Validate Accuracy of Test Data

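I have fitted my model to my training data and tested its accuracy using R-squared.

However, I would like to test the accuracy of the model on my test data. How can I do that?

My predicted value is continuous. I am quite new to this, so suggestions are welcome.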

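# Fit a linear regression on the training data.
# lm() fits by ordinary least squares, so no family argument is needed
# (family = gaussian(link = "identity") belongs to glm(), not lm()).
LR_swim <- lm(racetime_mins ~ event_month + gender + place +
                clocktime_mins + handicap_mins +
                Wind_Speed_knots +
                Air_Temp_Celsius + Water_Temp_Celsius + Wave_Height_m,
              data = SwimmingTrain)
summary(LR_swim)
rsq(LR_swim)  # returns 0.9722331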

# Predict race time using the test data
pred_LR <- predict(LR_swim, newdata = SwimmingTest, type = "response")
# Add the predicted race times back into the test dataset
SwimmingTest$Pred_RaceTime <- pred_LR

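First, as already pointed out in the comments, the term accuracy is actually reserved for classification problems. What you are referring to is the performance of your model. For regression problems such as yours, there are several such performance measures available.

For better or worse, R^2 is still the standard measure in some implementations; nevertheless, it may be helpful to keep in mind what I have argued elsewhere: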

the whole R-squared concept comes in fact directly from the world of statistics, where the emphasis is on interpretative models, and it has little use in machine learning contexts, where the emphasis is clearly on predictive models; at least AFAIK, and beyond some very introductory courses, I have never (I mean never...) seen a predictive modeling problem where the R-squared is used for any kind of performance assessment; neither it's an accident that popular machine learning introductions, such as Andrew Ng's Machine Learning at Coursera, do not even bother to mention it. And, as noted in the Github thread above (emphasis added):

In particular when using a test set, it's a bit unclear to me what the R^2 means.

I certainly agree.
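One way to see why R^2 on a test set is ambiguous is to compute it under several plausible definitions. The sketch below uses simulated data; every name in it (`dat`, `train`, `test`, `fit`, the three `r2_*` variables) is hypothetical and not taken from the question. It is an illustration under those assumptions, not a recommended procedure.

```r
# Simulated data: all names and values here are illustrative assumptions.
set.seed(42)
n <- 200
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)
dat <- data.frame(x = x, y = y)
train <- dat[1:150, ]
test  <- dat[151:200, ]

fit  <- lm(y ~ x, data = train)
pred <- predict(fit, newdata = test)

# Sum of squared prediction errors on the test set
sse <- sum((test$y - pred)^2)

# Variant 1: baseline is the mean of the test responses
r2_test_mean <- 1 - sse / sum((test$y - mean(test$y))^2)
# Variant 2: baseline is the mean of the training responses
r2_train_mean <- 1 - sse / sum((test$y - mean(train$y))^2)
# Variant 3: squared correlation between observed and predicted values
r2_cor <- cor(test$y, pred)^2

c(test_mean = r2_test_mean, train_mean = r2_train_mean, squared_cor = r2_cor)
```

The three values generally differ, and the first two can even be negative on unseen data, so any reported "test R^2" should state which definition was used.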

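There are several other performance measures that are arguably more suitable for predictive tasks such as yours, and most of them can be implemented in a single line of R code. So, for some dummy data: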

preds <- c(1.0, 2.0, 9.5)
actuals <- c(0.9, 2.1, 10.0)

The mean squared error (MSE) is simply

mean((preds-actuals)^2)
# [1] 0.09

The mean absolute error (MAE) is

mean(abs(preds-actuals))
# [1] 0.2333333

The root mean squared error (RMSE) is simply the square root of the MSE, i.e.:

sqrt(mean((preds-actuals)^2))
# [1] 0.3

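These measures are arguably more useful for assessing performance on unseen data. The last two have the added advantage of being on the same scale as your original data (which is not the case for MSE).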
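For convenience, the three one-liners can be bundled into a small helper. This is only a sketch: `regression_metrics` is a hypothetical name, not a function from base R or any package, and it does no handling of missing values.

```r
# Hypothetical helper (not from any package): returns MSE, MAE and RMSE
# for a vector of observed values and a vector of predictions.
regression_metrics <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  err <- predicted - actual
  mse <- mean(err^2)
  c(MSE = mse, MAE = mean(abs(err)), RMSE = sqrt(mse))
}

# Same dummy data as above
preds   <- c(1.0, 2.0, 9.5)
actuals <- c(0.9, 2.1, 10.0)
regression_metrics(actuals, preds)
#       MSE       MAE      RMSE
# 0.0900000 0.2333333 0.3000000
```

Assuming the test set also contains the observed `racetime_mins` column, the same call would be `regression_metrics(SwimmingTest$racetime_mins, SwimmingTest$Pred_RaceTime)`, which gives MAE and RMSE in minutes.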