The recipes package cannot create an interaction term in step_interact
I'm using a medical insurance dataset to sharpen my modeling skills. It looks like this:
> insur_dt
age sex bmi children smoker region charges
1: 19 female 27.900 0 yes southwest 16884.924
2: 18 male 33.770 1 no southeast 1725.552
3: 28 male 33.000 3 no southeast 4449.462
4: 33 male 22.705 0 no northwest 21984.471
5: 32 male 28.880 0 no northwest 3866.855
---
1334: 50 male 30.970 3 no northwest 10600.548
1335: 18 female 31.920 0 no northeast 2205.981
1336: 18 female 36.850 0 no southeast 1629.833
1337: 21 female 25.800 0 no southwest 2007.945
1338: 61 female 29.070 0 yes northwest 29141.360
I'm using recipes, as part of the tidymodels metapackage, to prepare my data for modeling, and I've determined that bmi, age, and smoker form an interaction term.
library(tidymodels)

insur_split <- initial_split(insur_dt)
insur_train <- training(insur_split)
insur_test <- testing(insur_split)
# we are going to do data processing and feature engineering with recipes
# below, we are going to predict charges from age, bmi, and smoker
insur_rec <- recipe(charges ~ age + bmi + smoker, data = insur_train) %>%
  step_dummy(all_nominal()) %>%
  step_zv(all_numeric()) %>%
  step_normalize(all_numeric()) %>%
  step_interact(~ bmi:smoker:age) %>%
  prep()
According to the tidymodels guide/documentation, I have to specify the interaction as a step in the recipe, using step_interact. However, when I try to do that, I get an error:
> insur_rec <- recipe(charges ~ age + bmi + smoker, data = insur_train) %>%
+ step_dummy(all_nominal()) %>%
+ step_zv(all_numeric()) %>%
+ step_normalize(all_numeric()) %>%
+ step_interact(~ bmi:smoker:age) %>%
+ prep()
Interaction specification failed for: ~bmi:smoker:age. No interactions will be created.
partial match of 'object' to 'objects'
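I'm new to modeling and not sure why this error occurs. I'm just trying to say that charges is explained by all the other predictors, and that smoker (a yes/no factor), age (numeric), and bmi (double) all interact with one another to inform the outcome. What am I doing wrong?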
From the documentation: "step_interact can create interactions between variables. It is primarily intended for numeric data; categorical variables should probably be converted to dummy variables using step_dummy() prior to being used for interactions."
step_dummy(all_nominal()) turned the variable smoker into smoker_yes. Below, you can see that I just changed the name smoker in the interaction term to smoker_yes.
insur_rec <- recipe(charges ~ bmi + age + smoker, data = insur_train) %>%
  step_dummy(all_nominal()) %>%
  step_normalize(all_numeric(), -all_outcomes()) %>%
  step_interact(terms = ~ bmi:age:smoker_yes) %>%
  prep(verbose = TRUE, log_changes = TRUE)
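If you are unsure what name step_dummy() gave a column, one way to check is to prep the recipe up to that step and look at the baked column names before writing the interaction. The following is only a minimal sketch: the `toy` data frame is a stand-in I built from six of the rows printed in the question, and `dummy_names` / `final_names` are just illustrative variable names. It also assumes a recipes version that accepts `bake(new_data = NULL)`.

```r
library(recipes)

# Stand-in data: six rows copied from the printout in the question
# (the full insur_dt is not available here, so this is an assumption).
toy <- data.frame(
  age     = c(19, 18, 28, 33, 32, 61),
  bmi     = c(27.900, 33.770, 33.000, 22.705, 28.880, 29.070),
  smoker  = factor(c("yes", "no", "no", "no", "no", "yes")),
  charges = c(16884.924, 1725.552, 4449.462, 21984.471, 3866.855, 29141.360)
)

# Step 1: stop after step_dummy() and inspect the resulting column names.
dummy_rec <- recipe(charges ~ age + bmi + smoker, data = toy) %>%
  step_dummy(all_nominal()) %>%
  prep()
dummy_names <- names(bake(dummy_rec, new_data = NULL))
print(dummy_names)  # smoker no longer exists; smoker_yes does

# Step 2: refer to the dummy column by its new name in the interaction.
final_rec <- recipe(charges ~ age + bmi + smoker, data = toy) %>%
  step_dummy(all_nominal()) %>%
  step_interact(terms = ~ bmi:age:smoker_yes) %>%
  prep()
final_names <- names(bake(final_rec, new_data = NULL))
print(final_names)  # now includes an interaction column
```

The point of the first step is simply that step_interact() sees the data as it exists at that point in the recipe, so any column renamed or replaced by an earlier step has to be referenced by its new name.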