In R tidymodels, how can I specify contrasts for specific variables?
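I'd like to use a tidymodels recipe to specify "sum to zero" contrasts for two of the predictors in an LM. Is it possible? Looking at the recipes documentation, it appears that before 1.3 there was an attempt to build variable-specific options, but the strategy has since moved to global options.
I'm trying to translate this base R code to tidymodels: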
Bikeshare <- ISLR2::Bikeshare # start with original data
contrasts(Bikeshare$hr) <- contr.sum(24)
contrasts(Bikeshare$mnth) <- contr.sum(12)
mod.lm2 <-
  lm(
    bikers ~ mnth + hr + workingday + temp + weathersit,
    data = Bikeshare
  )
summary(mod.lm2)
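For comparison, base R's lm() also accepts a contrasts argument, which names a contrast for individual factors without modifying the data frame. The sketch below uses the built-in warpbreaks data as a stand-in for Bikeshare (an assumption, made only so the example is self-contained) and checks that both routes give the same coefficients.

```r
# Toy data from base R's datasets package (stand-in for Bikeshare).
wb <- datasets::warpbreaks

# Route 1, as in the code above: attach sum-to-zero contrasts to one factor.
wb_attr <- wb
contrasts(wb_attr$tension) <- contr.sum(nlevels(wb_attr$tension))
fit_attr <- lm(breaks ~ wool + tension, data = wb_attr)

# Route 2: leave the data alone and name the contrast per variable.
fit_arg <- lm(
  breaks ~ wool + tension,
  data = wb,
  contrasts = list(tension = "contr.sum")
)

# Compare the estimates; `wool` keeps the default treatment contrast in both fits.
same_coefs <- isTRUE(all.equal(unname(coef(fit_attr)), unname(coef(fit_arg))))
same_coefs
```

Either way the contrast choice is tied to a single variable, which is the behaviour being asked for from the recipe.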
This is as far as I got:
library(tidymodels)
Bikeshare <- ISLR2::Bikeshare # start with original data
contrasts(Bikeshare$hr) <- contr.sum(24)
contrasts(Bikeshare$mnth) <- contr.sum(12)
lm_spec <- linear_reg() %>%
  set_engine("lm")
the_rec <-
  recipe(
    bikers ~ mnth + hr + workingday + temp + weathersit,
    data = Bikeshare
  ) %>%
  step_dummy(c(mnth, hr), one_hot = TRUE)
the_workflow <- workflow() %>%
  add_recipe(the_rec) %>%
  add_model(lm_spec)
the_workflow_fit_lm_fit <-
  fit(the_workflow, data = Bikeshare) %>%
  extract_fit_parsnip()
summary(the_workflow_fit_lm_fit$fit)
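Does anyone know how to get the same result from a tidymodels workflow?
I don't think I can use contr.sum as the global option. That gives me the betas I want for the two variables, but it also changes the contrasts for the other variables.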
BikeShare <- ISLR2::Bikeshare # be sure to work with original data ;
old_opt <- options()$contrast;
options(contrasts = c('contr.sum', 'contr.poly'))
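The side effect described above can be seen on a small scale. This is a minimal sketch using the built-in warpbreaks data as a stand-in for Bikeshare (an assumption for illustration): with the global option set to contr.sum, every unordered factor in the model matrix is sum-coded, not just the one of interest.

```r
# Toy data from base R's datasets package (stand-in for Bikeshare).
wb <- datasets::warpbreaks

# Switch the global option, build the design matrix, then restore the option.
old_opt <- options(contrasts = c("contr.sum", "contr.poly"))
X_global <- model.matrix(~ wool + tension, data = wb)
options(old_opt)

# Both factors are now sum-coded: `wool` is +1 / -1 instead of a 0/1 indicator.
colnames(X_global)
wool_values <- sort(unique(X_global[, "wool1"]))
wool_values
```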
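The documentation for step_dummy() says:
"To change the type of contrast being used, change the global contrast option via options."
So there is no way to change it other than through the global option.
We should have an example though :-/
Note that, for new samples, the contrast is read from the global option again. Make sure the option is set the same way at prediction time: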
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip
tidymodels_prefer()
data("penguins")
penguins <-
  penguins %>%
  distinct(species)
# R's defaults
old_opt <- options()$contrast
old_opt
#>         unordered           ordered 
#> "contr.treatment"      "contr.poly"
# default contrast
default <-
  recipe(~ species, data = penguins) %>%
  step_dummy(species) %>%
  prep()
default %>% bake(new_data = NULL)
#> # A tibble: 3 × 2
#>   species_Chinstrap species_Gentoo
#>               <dbl>          <dbl>
#> 1                 0              0
#> 2                 0              1
#> 3                 1              0
# Now set the contrast option to something else:
options(contrasts = c('contr.sum', 'contr.poly'))
with_opt <-
  recipe(~ species, data = penguins) %>%
  step_dummy(species) %>%
  prep()
with_opt %>% bake(new_data = NULL)
#> # A tibble: 3 × 2
#>   species_X1 species_X2
#>        <dbl>      <dbl>
#> 1          1          0
#> 2         -1         -1
#> 3          0          1
# reset options:
options(contrasts = old_opt)
with_opt %>% bake(new_data = penguins)
#> # A tibble: 3 × 2
#>   species_Chinstrap species_Gentoo
#>               <dbl>          <dbl>
#> 1                 0              0
#> 2                 0              1
#> 3                 1              0
Created on 2021-11-16 by the reprex package (v2.0.0)
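Because the option is consulted again whenever new data are processed, one way to keep training and prediction in sync is to scope the option change to a single call. The helper below, with_contrasts(), is hypothetical (it is not part of recipes or tidymodels), and it is demonstrated with base R's model.matrix() rather than a recipe so that it runs without extra packages. Wrapping prep(), bake(), fit(), or predict() in the same way is an assumption that has not been checked here.

```r
# Hypothetical helper: evaluate `expr` with a temporary global contrast option.
with_contrasts <- function(contrasts, expr) {
  old <- options(contrasts = contrasts)  # returns the previous setting
  on.exit(options(old), add = TRUE)      # restore it even if `expr` errors
  force(expr)                            # `expr` is lazy, so it runs here
}

df <- data.frame(species = factor(c("Adelie", "Gentoo", "Chinstrap")))

before <- getOption("contrasts")
mm_sum <- with_contrasts(
  c("contr.sum", "contr.poly"),
  model.matrix(~ species, data = df)
)
after <- getOption("contrasts")

colnames(mm_sum)          # sum-coded columns for `species`
identical(before, after)  # the global option is back to what it was
```

This still changes the contrast for every unordered factor processed inside the call, so it limits the scope of the global option but does not make it variable-specific.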