Error running gbm from caret: Error in { : task 1 failed - "inputs must be factors"
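I am new to R and am trying to learn and run ML in R.
When I run gbm through caret, I get this error: Error in { : task 1 failed - "inputs must be factors".
With the same parameters, many other algorithms, such as rf and adaboost, run without problems.
Code for reference: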
fitCtrl_2 <- trainControl(
method = "cv",
# repeats = 5,
number = 10,
savePredictions = "final",
classProbs = TRUE,
summaryFunction = twoClassSummary
)
The code below gives the error:
set.seed(123)
system.time(
model_gbm <- train(pull(y) ~ duration+nr.employed+euribor3m+pdays+emp.var.rate+poutcome.success+month.mar+cons.conf.idx+contact.telephone+contact.cellular+previous+age+cons.price.idx+month.jun+job.retired,
data = train,
method = "gbm", # Added for gbm
distribution="gaussian", # Added for gbm
metric = "ROC",
bag.fraction=0.75, # Added for gbm
# tuneLenth = 10,
trControl = fitCtrl_2)
)
The code below runs fine on the same data (SVM model):
set.seed(123)
system.time(
model_svm <- train(pull(y) ~ duration+nr.employed+euribor3m+pdays+emp.var.rate+poutcome.success+month.mar+cons.conf.idx+contact.telephone+contact.cellular+previous+age+cons.price.idx+month.jun+job.retired,
data = train,
method = "svmRadial",
tuneLenth = 10,
trControl = fitCtrl_2)
)
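I went through other SO posts about this issue, but it is not clear to me what exactly I need to do to fix it.
It looks like you are doing classification. If so, the distribution should be "bernoulli" instead of "gaussian". Here is an example: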
set.seed(111)
df = data.frame(matrix(rnorm(1600),ncol=16))
colnames(df) = c("duration", "nr.employed", "euribor3m", "pdays", "emp.var.rate",
"poutcome.success", "month.mar", "cons.conf.idx", "contact.telephone",
"contact.cellular", "previous", "age", "cons.price.idx", "month.jun",
"job.retired")
df$y = ifelse(runif(100)>0.5,"a","b")
mod = as.formula("y ~ duration+nr.employed+euribor3m+pdays+emp.var.rate+poutcome.success+month.mar+cons.conf.idx+contact.telephone+contact.cellular+previous+age+cons.price.idx+month.jun+job.retired")
model_gbm <- train(mod, data = df,
method = "gbm",
distribution="gaussian",
metric = "ROC",
bag.fraction=0.75,
trControl = fitCtrl_2)
You get an error:
Error in { : task 1 failed - "inputs must be factors"
Setting it to "bernoulli" works:
model_gbm <- train(mod, data = df,
method = "gbm",
distribution="bernoulli",
metric = "ROC",
bag.fraction=0.75,
trControl = fitCtrl_2)
model_gbm
Stochastic Gradient Boosting
100 samples
15 predictor
2 classes: 'a', 'b'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 90, 91, 90, 90, 89, 90, ...
Resampling results across tuning parameters:
  interaction.depth  n.trees  ROC        Sens       Spec
  1                   50      0.6338333  0.7233333  0.500
  1                  100      0.6093333  0.6533333  0.510
  1                  150      0.6193333  0.6500000  0.555
  2                   50      0.6445000  0.6900000  0.545
  2                  100      0.6138333  0.6166667  0.620
  2                  150      0.6085000  0.6700000  0.555
  3                   50      0.5770000  0.6466667  0.555
  3                  100      0.5756667  0.6066667  0.530
  3                  150      0.5808333  0.6300000  0.530
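As a side note, a likely reading of the error is that with "gaussian" the model returns numeric predictions, while twoClassSummary compares predicted and observed classes as factors. Separately from the distribution, it can help to confirm that the outcome itself is a two-level factor whose levels are valid R names, which is what classProbs = TRUE expects. The sketch below uses only base R; the values in y_raw are made up for illustration and stand in for the y column of the real data.

```r
# Hypothetical toy outcome; in the question this would be the y column of `train`.
y_raw <- c("yes", "no", "no", "yes", "no")

# Convert to a factor with an explicit level order.
y <- factor(y_raw, levels = c("no", "yes"))

# Checks matching what a two-class ROC setup expects:
# a factor with exactly two levels ...
is_two_class_factor <- is.factor(y) && nlevels(y) == 2
# ... whose level names are already valid R variable names.
levels_are_valid_names <- all(levels(y) == make.names(levels(y)))

is_two_class_factor
levels_are_valid_names
```

If either check is FALSE on the real data, recoding the outcome (for example, turning 0/1 into labels such as "no"/"yes") before calling train is probably the simplest fix.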