"Wrong model type for classification" in regression problems in R-Caret
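I am trying to use various prediction algorithms from the caret package in R on a regression problem, i.e. my target variable is continuous. caret decides that classification is the appropriate problem type, and whenever I pass it any regression model I get an error message saying "wrong model type for classification". For reproducibility, let's look at the Combined Cycle Power Plant Data Set. The data is in CCPP.zip. Let's predict power as a function of the other variables. Power is a continuous variable.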
library(readxl)
library(caret)
power_plant = read_excel("Folds5x2_pp.xlsx")
apply(power_plant,2, class) # shows all columns are numeric
control <- trainControl(method="repeatedcv", number=10, repeats=5)
my_glm <- train(power_plant[,1:4], power_plant[,5],
                method = "lm",
                preProc = c("center", "scale"),
                trControl = control)
Here is a screenshot of the error:
For some reason, caret gets confused by tibbles, the tidyverse variant of data frames that read_excel returns. Converting the tibble to a plain data frame before passing it to caret makes everything work:
library(readxl)
library(caret)
power_plant = read_excel("Folds5x2_pp.xlsx")
apply(power_plant,2, class) # shows all columns are numeric
power_plant <- data.frame(power_plant) # convert the tibble to a plain data frame
control <- trainControl(method="repeatedcv", number=10, repeats=5)
my_glm <- train(power_plant[,1:4], power_plant[,5],
                method = "lm",
                preProc = c("center", "scale"),
                trControl = control)
my_glm
Output:
Linear Regression
9568 samples
4 predictor
Pre-processing: centered (4), scaled (4)
Resampling: Cross-Validated (10 fold, repeated 5 times)
Summary of sample sizes: 8612, 8612, 8611, 8612, 8612, 8610, ...
Resampling results:
RMSE Rsquared
4.556703 0.9287933
Tuning parameter 'intercept' was held constant at a value of TRUE
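A possible explanation for why the conversion helps, offered as an assumption rather than something confirmed in this thread: single-bracket column indexing on a tibble returns a one-column tibble, whereas the same expression on a plain data frame drops to a numeric vector. The sketch below uses a small made-up table (the names tb and df and their values are hypothetical) to show the difference in what an expression like power_plant[,5] hands to train.

```r
library(tibble)

# Toy data standing in for the power plant table (hypothetical values).
tb <- tibble(x = c(1.5, 2.5, 3.5), y = c(10, 20, 30))
df <- data.frame(tb)

# Single-bracket column indexing:
# the tibble stays a one-column tibble, the data frame drops to a vector.
class(tb[, 2])
class(df[, 2])

is.numeric(tb[, 2])  # FALSE: a tibble is a list of columns, not a numeric vector
is.numeric(df[, 2])  # TRUE

# Double brackets or $ extract a plain numeric vector from the tibble itself.
is.numeric(tb[[2]])  # TRUE
is.numeric(tb$y)     # TRUE
```

If that is indeed the cause, extracting the outcome with [[5]] might be an alternative to converting the whole table, but that variant has not been checked against the data set used here.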
I got a similar error when I tried using formula = y ~ x. Simply dropping the argument name and passing y ~ x on its own worked fine.
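As a minimal sketch of the positional form described above, here is a formula-interface call on the built-in mtcars data. The data set, the choice of predictors, and the name fit are illustrative assumptions, not part of the original question. As far as I know, the first parameter of caret's formula method is called form rather than formula, which may be why the named version fails.

```r
library(caret)

set.seed(1)  # make the cross-validation folds reproducible

# 5-fold cross-validation on a small built-in data set (illustrative only).
control <- trainControl(method = "cv", number = 5)

# The formula is passed positionally, without "formula =".
fit <- train(mpg ~ wt + hp,
             data = mtcars,
             method = "lm",
             trControl = control)

fit$modelType  # "Regression", since mpg is a numeric outcome
```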