Tune recipe in workflow set with custom range (or value)
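I'm trying to use the workflow_set() function in tidymodels to evaluate a batch of models.
I understand that some model specs can be modified to change the search range. For example, given this spec: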
spec_lin <- linear_reg( penalty = tune(),
mixture = tune() ) %>%
set_engine('glmnet')
I can modify the range:
rec_base <- recipe( price ~ feat_1) %>%
step_novel(feat_1) %>%
step_other(feat_1,threshold=.2 ) %>%
step_dummy(feat_1)
rec_adv_param <- rec_base %>%
parameters() %>%
update ( mixture = mixture(c(0.1,0.01)) )
My attempt is to do the same, but with the parameters in the recipe. For example:
rec_tuned <- recipe( price ~ feat_1) %>%
step_novel(feat_1) %>%
step_other(feat_1,threshold=tune() ) %>%
step_dummy(feat_1)
followed by
rec_adv_param <- rec_tuned %>%
parameters() %>%
update ( threshold = threshold(c(0.1,0.2)) )
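However, when I try to use it in the workflow_set() definition, if I use something like
wf_set <- workflow_set(recipes, models, cross = TRUE ) %>%
option_add(param_info = rec_adv_param, id = "rec_tuned_spec_lin")
the resulting wf_set loses its original tuning parameters, because the parameter set has been replaced by
threshold = threshold(c(0.1,0.2))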
Is there a way to add a parameter specification for the recipe across all the workflow_set models?
Thanks
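You can add the parameters for a recipe via option_add(), either to a single workflow by id or to all workflows if you leave id = NULL. These options are used when you tune or fit to resampled data.
For example, if we want to try 0 to 20 PCA components (instead of the default):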
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
data(Chicago)
data("chi_features_set")
time_val_split <-
sliding_period(
Chicago,
date,
"month",
lookback = 38,
assess_stop = 1
)
## notice that there are no options; defaults will be used
chi_features_set
#> # A workflow set/tibble: 3 × 4
#> wflow_id info option result
#> <chr> <list> <list> <list>
#> 1 date_lm <tibble [1 × 4]> <opts[0]> <list [0]>
#> 2 plus_holidays_lm <tibble [1 × 4]> <opts[0]> <list [0]>
#> 3 plus_pca_lm <tibble [1 × 4]> <opts[0]> <list [0]>
## make new params
pca_param <-
parameters(num_comp()) %>%
update(num_comp = num_comp(c(0, 20)))
## add new params to workflowset like this:
chi_features_set %>%
option_add(param_info = pca_param, id = "plus_pca_lm")
#> # A workflow set/tibble: 3 × 4
#> wflow_id info option result
#> <chr> <list> <list> <list>
#> 1 date_lm <tibble [1 × 4]> <opts[0]> <list [0]>
#> 2 plus_holidays_lm <tibble [1 × 4]> <opts[0]> <list [0]>
#> 3 plus_pca_lm <tibble [1 × 4]> <opts[1]> <list [0]>
## now these new parameters can be used by `workflow_map()`:
chi_features_set %>%
option_add(param_info = pca_param, id = "plus_pca_lm") %>%
workflow_map(resamples = time_val_split, grid = 21, seed = 1)
#> # A workflow set/tibble: 3 × 4
#> wflow_id info option result
#> <chr> <list> <list> <list>
#> 1 date_lm <tibble [1 × 4]> <opts[2]> <rsmp[+]>
#> 2 plus_holidays_lm <tibble [1 × 4]> <opts[2]> <rsmp[+]>
#> 3 plus_pca_lm <tibble [1 × 4]> <opts[3]> <tune[+]>
Created on 2021-07-30 by the reprex package (v2.0.0)
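Applied to the question's recipe, the same approach might look like the sketch below. It rests on assumptions that are not in the original post: the `toy` data frame is a made-up stand-in for the unseen `price`/`feat_1` data, and the parameter set is extracted from the whole workflow (via `extract_workflow()` and `extract_parameter_set_dials()`, available in recent workflowsets/workflows releases) rather than from the recipe alone. A `param_info` object replaces the full parameter set, so building it from the workflow should keep `penalty` and `mixture` alongside the updated `threshold`.

```r
library(tidymodels)

# Placeholder data (assumption): the question's data set is not shown
set.seed(1)
toy <- tibble(
  price  = rnorm(200),
  feat_1 = factor(sample(letters[1:6], 200, replace = TRUE))
)

# Recipe and model spec as in the question
rec_tuned <- recipe(price ~ feat_1, data = toy) %>%
  step_novel(feat_1) %>%
  step_other(feat_1, threshold = tune()) %>%
  step_dummy(feat_1)

spec_lin <- linear_reg(penalty = tune(), mixture = tune()) %>%
  set_engine("glmnet")

# Named lists give the workflow id "rec_tuned_spec_lin"
wf_set <- workflow_set(
  preproc = list(rec_tuned = rec_tuned),
  models  = list(spec_lin = spec_lin),
  cross   = TRUE
)

# Take the parameter set from the full workflow, then narrow only `threshold`
wf_param <- wf_set %>%
  extract_workflow(id = "rec_tuned_spec_lin") %>%
  extract_parameter_set_dials() %>%
  update(threshold = threshold(c(0.1, 0.2)))

# Attach the complete parameter set to that one workflow
wf_set <- wf_set %>%
  option_add(param_info = wf_param, id = "rec_tuned_spec_lin")
```

Printing `wf_param` is a quick way to confirm that all three tuning parameters are still listed before passing the set on to `workflow_map()`.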