Ensembling in h2o - missing models
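I am using the h2o package to build an ensemble from GLM models with different regularization parameters (alpha, lambda). When I try to build the ensemble, following the documentation: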
ensemble <- h2o.stackedEnsemble(x = predictors,
                                y = response,
                                training_frame = train,
                                model_id = "ensemble",
                                base_models = list(glm_grid@model_ids))
where glm_grid@model_ids are the models from a grid search used to determine the best alpha and lambda regularization parameters for the GLM, I get the following error:
When creating a StackedEnsemble you must specify one or more models; 24 were specified but none of those were found: [list("glm_grid_model_6", glm_grid_model_11, glm_grid_model_7, glm_grid_model_9, glm_grid_model_2, glm_grid_model_21, glm_grid_model_15, glm_grid_model_0"]
Do you know what the problem is?
Edit: I tried following the documentation and used code similar to the example there:
gbm_grid <- h2o.grid(algorithm = "gbm",
                     grid_id = "gbm_grid_binomial",
                     x = x,
                     y = y,
                     training_frame = train,
                     ntrees = 10,
                     seed = 1,
                     nfolds = nfolds,
                     fold_assignment = "Modulo",
                     keep_cross_validation_predictions = TRUE,
                     hyper_params = hyper_params,
                     search_criteria = search_criteria)
# Train a stacked ensemble using the GBM grid
ensemble <- h2o.stackedEnsemble(x = x,
                                y = y,
                                training_frame = train,
                                model_id = "ensemble_gbm_grid_binomial",
                                base_models = gbm_grid@model_ids)
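Following @Erin LeDell's answer, I removed the extra list() and it works now. However, what I ultimately want to do is build the ensemble from grids of several different models, something like: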
ensemble <- h2o.stackedEnsemble(x = x,
                                y = y,
                                training_frame = train,
                                model_id = "my_ensemble_binomial",
                                base_models = list(my_gbm, my_rf))
Edit 2:
Solved with the following:
model_list <- as.list(c(glm_grid_1@model_ids,
                        glm_grid_2@model_ids))
ensemble <- h2o.stackedEnsemble(x = predictors,
                                y = response,
                                training_frame = train,
                                model_id = "ensemble1231",
                                base_models = model_list)
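The merging step in Edit 2 can be checked without a running H2O cluster. The sketch below is base R only; all IDs are made-up stand-ins, and it assumes that each grid's model_ids slot is a plain list of character strings and that a single model's ID (for example my_gbm@model_id) is one string. Under those assumptions c() alone already yields a flat list, so the as.list() wrapper is harmless but not required.

```r
# Hypothetical stand-ins for glm_grid_1@model_ids and glm_grid_2@model_ids
grid_1_ids <- list("glm_grid_1_model_0", "glm_grid_1_model_1")
grid_2_ids <- list("glm_grid_2_model_0")

# Hypothetical stand-in for a single model's ID, e.g. my_gbm@model_id
single_id <- "my_gbm"

# c() concatenates the lists and appends the lone string as one more element
model_list <- c(grid_1_ids, grid_2_ids, single_id)

# Every element should be a single character string, with no nested lists
is_flat <- all(vapply(model_list,
                      function(id) is.character(id) && length(id) == 1,
                      logical(1)))

print(length(model_list))
print(is_flat)
```

A list that passes this check has the same shape as the model_list passed to base_models above.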
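You have an extra list() wrapped around glm_grid@model_ids that you don't need here, and that is probably the source of the error. The glm_grid@model_ids object is already a list. Do this instead: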
ensemble <- h2o.stackedEnsemble(x = predictors,
                                y = response,
                                training_frame = train,
                                model_id = "ensemble",
                                base_models = glm_grid@model_ids)
For more details, see the R example here.
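To see what the extra list() does, here is a small base-R sketch that does not touch H2O at all. The IDs are illustrative stand-ins for glm_grid@model_ids, assumed to be a list of character strings; wrapping that list in list() produces a one-element list whose only element is the whole ID list, which would explain why the IDs in the error message appear inside a single list(...) entry.

```r
# Hypothetical stand-in for glm_grid@model_ids
ids <- list("glm_grid_model_6", "glm_grid_model_11", "glm_grid_model_7")

# What list(glm_grid@model_ids) builds: one element holding the whole list
wrapped <- list(ids)

n_ids     <- length(ids)      # one entry per model ID
n_wrapped <- length(wrapped)  # a single entry, itself a list

# Removing one level of nesting recovers the original flat list
unwrapped <- unlist(wrapped, recursive = FALSE)

print(n_ids)
print(n_wrapped)
print(identical(unwrapped, ids))
```

Passing the flat list directly, as in the corrected call above, avoids the nesting.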