Adding metrics to default train() output from the caret package
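I want to add metrics other than RMSE and Rsquared to the output of a linear model I created with the caret package. As I understand it, the code below reports the repeated cross-validation RMSE and Rsquared: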
library(caret)
lm_reg1 <- train(log1p(mpg) ~ log1p(hp) + log1p(disp),
                 data = mtcars,
                 trControl = trainControl(method = "repeatedcv",
                                          number = 10,
                                          repeats = 10),
                 method = 'lm')
lm_reg1
Output:
Linear Regression
32 samples
10 predictors
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 30, 29, 28, 29, 29, 28, ...
Resampling results:
RMSE Rsquared
0.1134972 0.8808378
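I know I can switch the output to a custom metric by changing summaryFunction in trainControl and referring to the metric's name in the metric argument. Here is an example I wrote that computes MAPE for a log-log model: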
mape <- function(actual, predicted){
  mean(abs((actual - predicted)/actual))
}
mapeexpSummary <- function (data,
                            lev = NULL,
                            model = NULL) {
  out <- mape(expm1(data$obs), expm1(data$pred))
  names(out) <- "MAPEEXP"
  out
}
lm_reg2 <- train(log1p(mpg) ~ log1p(hp) + log1p(disp),
                 data = mtcars,
                 trControl = trainControl(method = "repeatedcv",
                                          number = 10,
                                          summaryFunction = mapeexpSummary,
                                          repeats = 10),
                 metric = 'MAPEEXP',
                 method = 'lm')
lm_reg2
Output:
Linear Regression
32 samples
10 predictors
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 28, 29, 29, 28, 28, 30, ...
Resampling results:
MAPEEXP
0.1022028
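As an aside, a summary function like this can be checked by hand before it goes into trainControl, since caret calls it with a data frame holding obs and pred columns for each held-out fold. The sketch below uses base R only; the toy data frame and its values are made up for illustration and are not part of the original question.

```r
# Same helper and summary function as in the question
mape <- function(actual, predicted) {
  mean(abs((actual - predicted) / actual))
}

mapeexpSummary <- function(data, lev = NULL, model = NULL) {
  out <- mape(expm1(data$obs), expm1(data$pred))
  names(out) <- "MAPEEXP"
  out
}

# Hypothetical held-out fold: obs and pred are on the log1p scale,
# matching what the log-log model would produce
toy <- data.frame(obs  = log1p(c(10, 20, 40)),
                  pred = log1p(c(11, 18, 40)))

# The result is a named numeric vector; the name is what `metric` refers to
res <- mapeexpSummary(toy)
print(res)
```

The name attached to the returned value is what has to match the metric argument of train().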
Is there a way to get all of them in a single output? I want to keep all of these values, but without fitting two identical models to do so.
Why not add RMSE and R-squared inside mapeexpSummary?
mapeexpSummary <- function (data,
                            lev = NULL,
                            model = NULL) {
  c(MAPEEXP=mape(expm1(data$obs), expm1(data$pred)),
    RMSE=sqrt(mean((data$obs-data$pred)^2)),
    Rsquared=summary(lm(pred ~ obs, data))$r.squared)
}
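For completeness, here is a sketch of how the combined summary function might be plugged back into train() so that one model reports all three metrics. The object name lm_reg3, the seed, and the maximize = FALSE setting are my own assumptions rather than part of the answer; the train() call is guarded so that the snippet still runs where caret is not installed.

```r
mape <- function(actual, predicted) {
  mean(abs((actual - predicted) / actual))
}

# Combined summary function from the answer: one named vector, three metrics
mapeexpSummary <- function(data, lev = NULL, model = NULL) {
  c(MAPEEXP  = mape(expm1(data$obs), expm1(data$pred)),
    RMSE     = sqrt(mean((data$obs - data$pred)^2)),
    Rsquared = summary(lm(pred ~ obs, data))$r.squared)
}

if (requireNamespace("caret", quietly = TRUE)) {
  library(caret)
  set.seed(1)  # arbitrary seed, only to make the folds reproducible
  lm_reg3 <- train(log1p(mpg) ~ log1p(hp) + log1p(disp),
                   data = mtcars,
                   trControl = trainControl(method = "repeatedcv",
                                            number = 10,
                                            repeats = 10,
                                            summaryFunction = mapeexpSummary),
                   metric = "MAPEEXP",
                   maximize = FALSE,  # assumption: lower MAPE is better
                   method = "lm")
  print(lm_reg3$results)         # metrics averaged over all resamples
  print(head(lm_reg3$resample))  # per-resample values of each metric
}
```

Since lm has no tuning parameters, metric and maximize should not change the fitted model here; they would matter for methods where caret selects among candidate tuning values.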