Extract Formula From RIDGE, LASSO, and Elastic Net Regression with Coefficients (R) for many variables
I am trying to adapt some code I found in one of the answers to this post:
Extract Formula From lm with Coefficients (R)
AlexB provided these excellent lines of code:
get_formula <- function(model) {
  broom::tidy(model)[, 1:2] %>%
    mutate(sign = ifelse(sign(estimate) == 1, ' + ', ' - ')) %>% #coeff signs
    mutate_if(is.numeric, ~ abs(round(., 2))) %>% #for improving formatting
    mutate(a = ifelse(term == '(Intercept)', paste0('y ~ ', estimate), paste0(sign, estimate, ' * ', term))) %>%
    summarise(formula = paste(a, collapse = '')) %>%
    as.character
}
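While this works for some of my code, I am having trouble adapting it to print the formula from glmnet models fitted with ridge, LASSO, and elastic net regression.
Here is the example I am working with: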
library(caret)
library(glmnet)
library(mlbench)
library(psych)
library(dplyr) # provides %>%, mutate(), mutate_if() and summarise()
data("BostonHousing")
data <- BostonHousing
# split into roughly 70% training and 30% test rows
set.seed(23)
ind <- sample(2, nrow(data), replace = TRUE, prob = c(0.7, 0.3))
train <- data[ind == 1, ]
test <- data[ind == 2, ]
# 10-fold cross-validation, repeated 5 times
custom <- trainControl(method = "repeatedcv", number = 10, repeats = 5, verboseIter = TRUE)
set.seed(23)
# alpha = 0 selects the ridge penalty
ridge <- train(medv ~ ., train, method = "glmnet",
               tuneGrid = expand.grid(alpha = 0, lambda = seq(0.0001, 1, length = 5)),
               trControl = custom)
ridge
coef(ridge$finalModel, ridge$bestTune$lambda) # the coefficient estimates
get_formula <- function(model) {
  broom::tidy(model)[, 1:2] %>%
    mutate(sign = ifelse(sign(estimate) == 1, ' + ', ' - ')) %>% #coeff signs
    mutate_if(is.numeric, ~ abs(round(., 2))) %>% #for improving formatting
    mutate(a = ifelse(term == '(Intercept)', paste0('y ~ ', estimate), paste0(sign, estimate, ' * ', term))) %>%
    summarise(formula = paste(a, collapse = '')) %>%
    as.character
}
get_formula(ridge$finalModel)
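However, because the glmnet output has a different format from the one in the previous post, I am having trouble modifying the function so that it prints the equation I am looking for.
It gives this error: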
Error: Problem with `mutate()` input `sign`.
x object 'estimate' not found
i Input `sign` is `ifelse(sign(estimate) == 1, " + ", " - ")`.
Run `rlang::last_error()` to see where the error occurred.
Thanks for your help.
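The broom package has a tidy variant for glmnet, so you do not need to index the tidied data with [, 1:2]. For a glmnet fit the first two columns are term and step, so that indexing drops the estimate column, which is why mutate() cannot find it. Just use tidy(model) and the rest of the pipeline will work.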
Here is the key part of the function, pulled out to demonstrate:
broom::tidy(ridge$finalModel) %>%
mutate(sign = ifelse(sign(estimate) == 1, ' + ', ' - ')) %>% #coeff signs
mutate_if(is.numeric, ~ abs(round(., 2))) %>% #for improving formatting
mutate(a = ifelse(term == '(Intercept)', paste0('y ~ ', estimate), paste0(sign, estimate, ' * ', term)))
# A tibble: 1,400 x 7
term step estimate lambda dev.ratio sign a
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 (Intercept) 1 21.7 6655. 0 " + " y ~ 21.68
2 (Intercept) 2 21.7 6064. 0.01 " + " y ~ 21.73
3 (Intercept) 3 21.7 5525. 0.01 " + " y ~ 21.73
4 (Intercept) 4 21.7 5034. 0.01 " + " y ~ 21.74
5 (Intercept) 5 21.7 4587. 0.01 " + " y ~ 21.74
6 (Intercept) 6 21.8 4180. 0.01 " + " y ~ 21.75
7 (Intercept) 7 21.8 3808. 0.01 " + " y ~ 21.75
8 (Intercept) 8 21.8 3470. 0.01 " + " y ~ 21.76
9 (Intercept) 9 21.8 3162. 0.01 " + " y ~ 21.77
10 (Intercept) 10 21.8 2881. 0.02 " + " y ~ 21.78
# … with 1,390 more rows
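The 1,400 rows above appear to be one row per term for each lambda step on the path, so a single equation needs one lambda picked out first. Here is a minimal base R sketch of that filtering step, using a small hand-made data frame as a stand-in for the tidy() output. The numbers are invented and the pick_lambda name is hypothetical:

```r
# Stand-in for broom::tidy(ridge$finalModel): one row per term per lambda step.
# All values below are invented for illustration only.
tidied <- data.frame(
  term     = rep(c("(Intercept)", "crim", "rm"), times = 3),
  step     = rep(1:3, each = 3),
  estimate = c(21.7, -0.01, 0.10,
               21.9, -0.05, 1.50,
               22.3, -0.09, 3.80),
  lambda   = rep(c(10, 1, 0.1), each = 3)
)

# Hypothetical helper: keep only the rows whose lambda is closest to `target`.
pick_lambda <- function(tidied, target) {
  nearest <- tidied$lambda[which.min(abs(tidied$lambda - target))]
  tidied[tidied$lambda == nearest, c("term", "estimate")]
}

one_fit <- pick_lambda(tidied, target = 0.9)
print(one_fit)
```

With the real model, target would be ridge$bestTune$lambda. The sketch matches the nearest lambda because the path glmnet computed need not contain that exact value.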
Side note: across() can now replace mutate_if(), e.g.
mutate(across(where(is.numeric), ~abs(round(., 2))))
With a slight update, you can get the ridge regression equation as follows:
as.matrix(coef(ridge$finalModel, ridge$bestTune$lambda)) %>%
as.data.frame() %>%
tibble::rownames_to_column('term') %>%
rename(estimate = 2) %>%
mutate(sign = ifelse(sign(estimate) == 1, ' + ', ' - ')) %>% #coeff signs
mutate(across(where(is.numeric), ~abs(round(., 2)))) %>% #for improving formatting
mutate(a = ifelse(term == '(Intercept)', paste0('y ~ ', estimate), paste0(sign, estimate, ' * ', term))) %>%
summarise(formula = paste(a, collapse = ''))
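Since the question also covers LASSO and elastic net, where some coefficients can be shrunk to exactly zero, it may help to drop those terms before building the string. With the pipeline above, a zero coefficient would be printed with a ' - ' sign, because sign(0) is 0 rather than 1, and a negative intercept would lose its sign to abs(). Here is a base R sketch that handles both cases, assuming the coefficients are available as a named numeric vector. The format_equation name and the example values are hypothetical:

```r
# Hypothetical helper: build "y ~ b0 + b1 * x1 - b2 * x2" from a named vector.
format_equation <- function(coefs, digits = 2, response = "y") {
  is_intercept <- names(coefs) == "(Intercept)"
  intercept <- if (any(is_intercept)) coefs[is_intercept][[1]] else 0
  slopes <- coefs[!is_intercept]
  slopes <- slopes[slopes != 0]  # drop terms the penalty set to exactly zero
  terms <- character(0)
  if (length(slopes) > 0) {
    signs <- ifelse(slopes < 0, " - ", " + ")
    terms <- paste0(signs, abs(round(slopes, digits)), " * ", names(slopes))
  }
  # round() keeps the sign of the intercept, unlike abs(round())
  paste0(response, " ~ ", round(intercept, digits), paste(terms, collapse = ""))
}

# Invented coefficients; `zn` is zero, as a LASSO fit might produce.
example_coefs <- c("(Intercept)" = 21.68, crim = -0.1234, zn = 0, rm = 3.456)
equation <- format_equation(example_coefs)
print(equation)
# "y ~ 21.68 - 0.12 * crim + 3.46 * rm"
```

With the model from the question, the vector could be obtained with something like cf <- as.matrix(coef(ridge$finalModel, s = ridge$bestTune$lambda)) followed by format_equation(cf[, 1]).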