Plotting issues - Partial dependence plots
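I created the following explainer with explain_tidymodels in order to show partial dependence plots:
explainer <- explain_tidymodels(rf_vi_fit, data = Data_train, y = Data_train$Lead_week)
I am now creating the plots as follows:
model_profile(explainer, variables = c("AC", "Jaar", "Month", "Retentie")) %>% plot()
This gives me the following image:
There are two problems. First, the "Created for the workflow model" text covers the header of my AC panel. Second, I would like to change the color from blue to red. I tried %>% plot(color = "red") and %>% plot(col = "red"), but neither seems to work.
Does anyone know how to solve either of these plotting issues? Thanks in advance!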
You can use the as_tibble() function to access the data underlying these plots, and then build the plot yourself with whatever customization you like:
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
library(DALEXtra)
#> Loading required package: DALEX
#> Welcome to DALEX (version: 2.2.0).
#> Find examples and detailed introduction at: http://ema.drwhy.ai/
#> Additional features will be available after installation of: ggpubr.
#> Use 'install_dependencies()' to get all suggested dependencies
#>
#> Attaching package: 'DALEX'
#> The following object is masked from 'package:dplyr':
#>
#> explain
data(ames)
ames_train <- ames %>%
  transmute(Sale_Price = log10(Sale_Price),
            Gr_Liv_Area = as.numeric(Gr_Liv_Area),
            Year_Built, Bldg_Type)
rf_model <-
  rand_forest(trees = 1000) %>%
  set_engine("ranger") %>%
  set_mode("regression")
rf_wflow <-
  workflow() %>%
  add_formula(
    Sale_Price ~ Gr_Liv_Area + Year_Built + Bldg_Type) %>%
  add_model(rf_model)
rf_fit <- rf_wflow %>% fit(data = ames_train)
explainer_rf <- explain_tidymodels(
  rf_fit,
  data = dplyr::select(ames_train, -Sale_Price),
  y = ames_train$Sale_Price,
  label = "random forest"
)
#> Preparation of a new explainer is initiated
#> -> model label : random forest
#> -> data : 2930 rows 3 cols
#> -> data : tibble converted into a data.frame
#> -> target variable : 2930 values
#> -> predict function : yhat.workflow will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package tidymodels , ver. 0.1.3 , task regression ( default )
#> -> predicted values : numerical, min = 4.91122 , mean = 5.220561 , max = 5.520101
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -0.8113628 , mean = 7.953836e-05 , max = 0.3598514
#> A new explainer has been created!
pdp_rf <- model_profile(explainer_rf, N = NULL,
                        variables = "Gr_Liv_Area", groups = "Bldg_Type")
as_tibble(pdp_rf$agr_profiles) %>%
  mutate(`_label_` = stringr::str_remove(`_label_`, "random forest_")) %>%
  ggplot(aes(`_x_`, `_yhat_`, color = `_label_`)) +
  geom_line(size = 1.2, alpha = 0.8) +
  labs(x = "Gross living area",
       y = "Sale Price (log)",
       color = NULL,
       title = "Partial dependence profile for Ames housing sales",
       subtitle = "Predictions from a random forest model")
Created on 2021-05-27 by the reprex package (v2.0.0)
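For the original question (several variables, a single red line, and no subtitle in the way), the same idea might look roughly like the sketch below. It is not run against the asker's model: toy_profiles is a made-up stand-in for as_tibble(pdp$agr_profiles), and it assumes the aggregated profiles carry a `_vname_` column naming each variable alongside `_x_` and `_yhat_`, which is worth checking with names() on your own object.

```r
library(ggplot2)

# Hypothetical stand-in for as_tibble(pdp$agr_profiles); the numbers are invented.
toy_profiles <- data.frame(
  vname = rep(c("AC", "Month"), each = 5),
  x     = rep(1:5, times = 2),
  yhat  = c(2.0, 2.4, 2.9, 3.1, 3.2, 3.0, 2.8, 2.7, 2.9, 3.3)
)
# Rename to the underscore-wrapped column names assumed for agr_profiles.
names(toy_profiles) <- c("_vname_", "_x_", "_yhat_")

# One panel per variable, a fixed red line, and only the labels we choose,
# so no automatic subtitle can overlap the panel headers.
p <- ggplot(toy_profiles, aes(`_x_`, `_yhat_`)) +
  geom_line(color = "red") +
  facet_wrap(vars(`_vname_`), scales = "free_x") +
  labs(x = NULL,
       y = "Average prediction",
       title = "Partial dependence profiles")

p
```

Setting color = "red" inside geom_line() rather than inside aes() fixes the line color instead of mapping it to a variable, and because the plot is built from scratch, the title and subtitle are entirely under your control.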