Error: The first argument to [fit_resamples()] should be either a model or workflow
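Question:
I am working through Julia Silge's tutorial (link here) on using tidymodels and recipes. I can get through most of it without any trouble, but when I call fit_resamples() I get the error: Error: The first argument to [fit_resamples()] should be either a model or workflow.
I am copying the code from the tutorial character for character, and everything runs fine, including printing validation_splits. However, as soon as I call fit_resamples(), I get the error above (link to relevant part of tutorial). In case it is useful, the output of rlang::last_error() is: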
<error/rlang_error>
The first argument to [fit_resamples()] should be either a model or workflow.
Backtrace:
1. tune::fit_resamples(...)
2. tune:::fit_resamples.default(...)
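Does anyone know what is going on here, and how I can fix it? My understanding is that the first argument I am passing to fit_resamples() is a model, i.e. children ~ ., and the rest of the script runs without problems. The code (and data) that produces the error on my machine, along with my sessionInfo(), is below.
Reproducible example: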
library(tidyverse)

## Bring in data
hotels <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-11/hotels.csv')

hotel_stays <- hotels %>%
  filter(is_canceled == 0) %>%
  mutate(children = case_when(children + babies > 0 ~ 'children',
                              TRUE ~ 'none'),
         required_car_parking_spaces = case_when(required_car_parking_spaces > 0 ~ 'parking',
                                                 TRUE ~ 'none')) %>%
  select(-is_canceled, -reservation_status, -babies)

hotels_df <- hotel_stays %>%
  select(children, hotel, arrival_date_month, meal, adr, adults,
         required_car_parking_spaces, total_of_special_requests,
         stays_in_week_nights, stays_in_weekend_nights) %>%
  mutate_if(is.character, factor)

## Build models
library(tidymodels)

set.seed(1234)
hotel_split <- initial_split(hotels_df)
hotel_train <- training(hotel_split)
hotel_test <- testing(hotel_split)

hotel_rec <- recipe(children ~ ., data = hotel_train) %>%
  step_downsample(children) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%
  step_zv(all_numeric()) %>%
  step_normalize(all_numeric()) %>%
  prep()

test_proc <- bake(hotel_rec, new_data = hotel_test)

knn_spec <- nearest_neighbor() %>%
  set_engine('kknn') %>%
  set_mode('classification')

knn_fit <- knn_spec %>%
  fit(children ~ .,
      data = juice(hotel_rec))
knn_fit

## Evaluate models
set.seed(1234)
validation_splits <- mc_cv(juice(hotel_rec), prop = 0.9, strata = children)
validation_splits

## This is where I get the error
knn_res <- fit_resamples(
  children ~ .,
  knn_spec,
  validation_splits,
  control = control_resamples(save_pred = TRUE)
)
My sessionInfo():
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GGally_2.1.2.9000 skimr_2.1.3 silgelib_0.1.1 forcats_0.5.1
[5] stringr_1.4.0 readr_1.4.0 tidyverse_1.3.1 knitr_1.33
[9] yardstick_0.0.8 workflowsets_0.0.2 workflows_0.2.2 tune_0.1.5
[13] tidyr_1.1.3 tibble_3.1.2 rsample_0.1.0 recipes_0.1.16
[17] purrr_0.3.4 parsnip_0.1.6 modeldata_0.1.0 infer_0.5.4
[21] ggplot2_3.3.5 dplyr_1.0.7 dials_0.0.9 scales_1.1.1
[25] broom_0.7.6 tidymodels_0.1.3
loaded via a namespace (and not attached):
[1] colorspace_2.0-1 ellipsis_0.3.2 class_7.3-19 base64enc_0.1-3
[5] fs_1.5.0 rstudioapi_0.13 listenv_0.8.0 furrr_0.2.3
[9] farver_2.1.0 prodlim_2019.11.13 fansi_0.5.0 lubridate_1.7.10
[13] xml2_1.3.2 codetools_0.2-18 splines_4.1.0 jsonlite_1.7.2
[17] pROC_1.17.0.1 dbplyr_2.1.1 shiny_1.6.0 compiler_4.1.0
[21] httr_1.4.2 backports_1.2.1 assertthat_0.2.1 Matrix_1.3-3
[25] fastmap_1.1.0 cli_2.5.0 later_1.2.0 htmltools_0.5.1.1
[29] prettyunits_1.1.1 tools_4.1.0 igraph_1.2.6 gtable_0.3.0
[33] glue_1.4.2 Rcpp_1.0.6 cellranger_1.1.0 DiceDesign_1.9
[37] vctrs_0.3.8 iterators_1.0.13 timeDate_3043.102 gower_0.2.2
[41] xfun_0.23 globals_0.14.0 rvest_1.0.0 mime_0.10
[45] lifecycle_1.0.0 kknn_1.3.1 future_1.21.0 MASS_7.3-54
[49] ipred_0.9-11 hms_1.1.0 promises_1.2.0.1 parallel_4.1.0
[53] RColorBrewer_1.1-2 yaml_2.2.1 curl_4.3.1 rpart_4.1-15
[57] reshape_0.8.8 stringi_1.6.2 foreach_1.5.1 lhs_1.1.1
[61] lava_1.6.9 repr_1.1.3 rlang_0.4.11 pkgconfig_2.0.3
[65] evaluate_0.14 lattice_0.20-44 htmlwidgets_1.5.3 labeling_0.4.2
[69] tidyselect_1.1.1 parallelly_1.26.0 plyr_1.8.6 magrittr_2.0.1
[73] R6_2.5.0 generics_0.1.0 DBI_1.1.1 pillar_1.6.1
[77] haven_2.4.1 withr_2.4.2 survival_3.2-11 nnet_7.3-16
[81] modelr_0.1.8 crayon_1.4.1 utf8_1.2.1 rmarkdown_2.8
[85] progress_1.2.2 grid_4.1.0 readxl_1.3.1 reprex_2.0.0
[89] digest_0.6.27 xtable_1.8-4 httpuv_1.6.1 GPfit_1.0-8
[93] munsell_0.5.0
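Answer:
The blog post you are looking at is fairly old, and there was a change to tune a while back, so you should now put the workflow or model first. Hence the error message:
The first argument to [fit_resamples()] should be either a model or workflow.
The fix is to pass your model or workflow as the first argument, like this: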
knn_res <- fit_resamples(
  knn_spec,
  children ~ .,
  validation_splits,
  control = control_resamples(save_pred = TRUE)
)
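To make the argument order concrete, here is a minimal sketch of both accepted forms: a model specification first (followed by the formula and the resamples), or a workflow first (followed by the resamples). It assumes a reasonably recent tidymodels with the kknn package installed, and it uses the built-in iris data instead of the hotels data so that it runs on its own. The names iris_splits, knn_wf, res_model and res_wf are made up for this illustration.

```r
library(tidymodels)

set.seed(1234)

# Illustration only: a small built-in dataset stands in for the hotels data
iris_splits <- mc_cv(iris, prop = 0.9, times = 5, strata = Species)

knn_spec <- nearest_neighbor() %>%
  set_engine("kknn") %>%
  set_mode("classification")

# Form 1: model spec first, then the formula preprocessor, then the resamples
res_model <- fit_resamples(
  knn_spec,
  Species ~ .,
  iris_splits,
  control = control_resamples(save_pred = TRUE)
)

# Form 2: bundle the model and formula into a workflow and pass that first
knn_wf <- workflow() %>%
  add_model(knn_spec) %>%
  add_formula(Species ~ .)

res_wf <- fit_resamples(
  knn_wf,
  iris_splits,
  control = control_resamples(save_pred = TRUE)
)

# Summarise the resampled performance metrics
collect_metrics(res_wf)
```

Passing a bare formula first, as in the older blog code, matches neither form, which is presumably why the call falls through to fit_resamples.default() in the backtrace shown in the question.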