How do you repeat linear regressions where only the IV changes without having to write code repeatedly?
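I have RStudio installed on macOS 11.1. I would like to use the mtcars dataset to get the summary() results of 8 linear regressions in which the DV is mpg and one IV differs in each regression model. The IVs of interest are am, cyl, disp, hp, drat, wt, qsec, and vs.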
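I could rewrite the code for each regression, but that seems like the long way to do this.
For example, this would be the code for the first regression: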
lm__am_on_mpg__mtcars <- lm(mpg ~ am, data=mtcars)
summary(lm__am_on_mpg__mtcars)
This would be the code for the second regression:
lm__cyl_on_mpg__mtcars <- lm(mpg ~ cyl, data=mtcars)
summary(lm__cyl_on_mpg__mtcars)
But I would have to do this many times, and it seems there should be a more concise way.
Here are my questions: (1) Is this possible in R? (1a) If so, how would it be done?
====================
Here is the R code I used to do this the long way:
# How do you repeat linear regressions where only the IV changes without having to write code repeatedly?
## dataset of interest
mtcars
### info about dataset
head(mtcars)
str(mtcars)
colnames(mtcars)
## variables of interest
unique(mtcars$mpg)
# ---- NOTE: DV is mpg
unique(mtcars$am)
# ---- NOTE: IV is am
unique(mtcars$cyl)
unique(mtcars$disp)
unique(mtcars$hp)
unique(mtcars$drat)
unique(mtcars$wt)
unique(mtcars$qsec)
unique(mtcars$vs)
# ---- NOTE: other IVs of interest
## first linear regression
lm__am_on_mpg__mtcars <- lm(mpg ~ am, data=mtcars)
summary(lm__am_on_mpg__mtcars)
## linear regressions for the other IVs
### IV is cyl
lm__cyl_on_mpg__mtcars <- lm(mpg ~ cyl, data=mtcars)
summary(lm__cyl_on_mpg__mtcars)
### IV is disp
lm__disp_on_mpg__mtcars <- lm(mpg ~ disp, data=mtcars)
summary(lm__disp_on_mpg__mtcars)
### IV is hp
lm__hp_on_mpg__mtcars <- lm(mpg ~ hp, data=mtcars)
summary(lm__hp_on_mpg__mtcars)
### IV is drat
lm__drat_on_mpg__mtcars <- lm(mpg ~ drat, data=mtcars)
summary(lm__drat_on_mpg__mtcars)
### IV is wt
lm__wt_on_mpg__mtcars <- lm(mpg ~ wt, data=mtcars)
summary(lm__wt_on_mpg__mtcars)
### IV is qsec
lm__qsec_on_mpg__mtcars <- lm(mpg ~ qsec, data=mtcars)
summary(lm__qsec_on_mpg__mtcars)
### IV is vs
lm__vs_on_mpg__mtcars <- lm(mpg ~ vs, data=mtcars)
summary(lm__vs_on_mpg__mtcars)
====================
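Edit 1:
Following a commenter's suggestion, I tried to install the ExhaustiveSearch package from the RStudio console on macOS, without success. The console output is below for reference.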
> install.packages("ExhaustiveSearch")
Package which is only available in source form, and may need compilation of C/C++/Fortran:
‘ExhaustiveSearch’
Do you want to attempt to install these from sources? (Yes/no/cancel) yes
installing the source package ‘ExhaustiveSearch’
trying URL 'https://cran.rstudio.com/src/contrib/ExhaustiveSearch_1.0.0.tar.gz'
Content type 'application/x-gzip' length 52611 bytes (51 KB)
==================================================
downloaded 51 KB
* installing *source* package ‘ExhaustiveSearch’ ...
** package ‘ExhaustiveSearch’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
clang++ -mmacosx-version-min=10.13 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppArmadillo/include' -I/usr/local/include -fPIC -Wall -g -O2 -c Combination.cpp -o Combination.o
clang++ -mmacosx-version-min=10.13 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppArmadillo/include' -I/usr/local/include -fPIC -Wall -g -O2 -c ExhaustiveSearchCpp.cpp -o ExhaustiveSearchCpp.o
clang++ -mmacosx-version-min=10.13 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppArmadillo/include' -I/usr/local/include -fPIC -Wall -g -O2 -c GLM.cpp -o GLM.o
clang++ -mmacosx-version-min=10.13 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppArmadillo/include' -I/usr/local/include -fPIC -Wall -g -O2 -c RcppExports.cpp -o RcppExports.o
clang++ -mmacosx-version-min=10.13 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppArmadillo/include' -I/usr/local/include -fPIC -Wall -g -O2 -c SearchTask.cpp -o SearchTask.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppArmadillo/include' -I/usr/local/include -fPIC -Wall -g -O2 -c lbfgs.c -o lbfgs.o
clang++ -mmacosx-version-min=10.13 -std=gnu++11 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o ExhaustiveSearch.so Combination.o ExhaustiveSearchCpp.o GLM.o RcppExports.o SearchTask.o lbfgs.o -L/Library/Frameworks/R.framework/Resources/lib -lRlapack -L/Library/Frameworks/R.framework/Resources/lib -lRblas -L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin18/8.2.0 -L/usr/local/gfortran/lib -lgfortran -lquadmath -lm -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: warning: directory not found for option '-L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin18/8.2.0'
ld: warning: directory not found for option '-L/usr/local/gfortran/lib'
ld: library not found for -lgfortran
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [ExhaustiveSearch.so] Error 1
ERROR: compilation failed for package ‘ExhaustiveSearch’
* removing ‘/Library/Frameworks/R.framework/Versions/4.0/Resources/library/ExhaustiveSearch’
Warning in install.packages :
installation of package ‘ExhaustiveSearch’ had non-zero exit status
The downloaded source packages are in
‘/private/var/folders/w_/fc17blzn66v_j8xbm24vpmvm0000gn/T/RtmppjIa4V/downloaded_packages’
>
> library(ExhaustiveSearch)
Error in library(ExhaustiveSearch) :
there is no package called ‘ExhaustiveSearch’
> install.packages("ExhaustiveSearch")
Package which is only available in source form, and may need compilation of C/C++/Fortran:
‘ExhaustiveSearch’
Do you want to attempt to install these from sources? (Yes/no/cancel) No
> library(ExhaustiveSearch)
Error in library(ExhaustiveSearch) :
there is no package called ‘ExhaustiveSearch’
>
> install.packages("ExhaustiveSearch")
Package which is only available in source form, and may need compilation of C/C++/Fortran:
‘ExhaustiveSearch’
Do you want to attempt to install these from sources? (Yes/no/cancel) cancel
Error in install.packages : Cancelled by user
> library(ExhaustiveSearch)
Error in library(ExhaustiveSearch) :
there is no package called ‘ExhaustiveSearch’
1) Use update to modify the first fit:
fm.am <- lm(mpg ~ am, mtcars)
fm.cyl <- update(fm.am, ~ cyl)
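The update() call above handles one predictor at a time. As a minimal sketch (not part of the original answer), the same idea could be extended to all eight IVs from the question by wrapping update() in lapply(). The names ivs and fits are chosen here for illustration only.

```r
# Base fit that every other model is derived from.
fm.am <- lm(mpg ~ am, data = mtcars)

# The eight IVs named in the question (illustrative variable name).
ivs <- c("am", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs")

# reformulate(v) builds the one-sided formula ~ v; update() keeps mpg as
# the response and swaps in the new right-hand side.
fits <- lapply(setNames(ivs, ivs), function(v) update(fm.am, reformulate(v)))

# Print the summary of each refitted model.
for (v in ivs) print(summary(fits[[v]]))
```

Each element can then be pulled out by name, for example fits$cyl.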
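2) If you want to iterate over all the columns, use reformulate to construct the appropriate formula for each iteration. Here L is set to a list of lm objects, one per run. For example, L$cyl is the lm output for mpg ~ cyl.
In the code below we could use the simpler version of lmfun shown in the comment, but if we did, the output would not look as nice.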
# lmfun <- function(x) lm(reformulate(x, "mpg"), mtcars)
lmfun <- function(x) do.call("lm", list(reformulate(x, "mpg"), quote(mtcars)))
L <- Map(lmfun, names(mtcars)[-1])
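To illustrate why the do.call version gives nicer output, the sketch below fits mpg ~ cyl both ways and compares the formula recorded in each fit's call, which is what summary() shows on its Call: line. The names lmfun_simple, fit_simple and fit_docall are hypothetical, introduced only for this comparison.

```r
# Simpler version: lm() records the unevaluated expression it was given.
lmfun_simple <- function(x) lm(reformulate(x, "mpg"), mtcars)

# do.call version: the formula is built before lm() is called,
# so the recorded call contains the actual formula.
lmfun <- function(x) do.call("lm", list(reformulate(x, "mpg"), quote(mtcars)))

fit_simple <- lmfun_simple("cyl")
fit_docall <- lmfun("cyl")

# Compare the formula stored in each call.
print(fit_simple$call$formula)
print(fit_docall$call$formula)

# Either way, the fitted coefficients agree.
all.equal(coef(fit_simple), coef(fit_docall))
```

The fitted models are the same; only the bookkeeping in the stored call differs.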
3) If you want to determine which predictor is best, try:
library(ExhaustiveSearch)
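ExhaustiveSearch(mpg ~ ., mtcars, family = "gaussian", combsUpTo = 1)
Or, if you want to find out which pair works best, use combsUpTo = 2, and so on. The output of the code above is shown below. Note that a smaller AIC is better.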
Starting the exhaustive evaluation.
Runtime | Completed | Status
--------------------------------------
--------------------------------------
Evaluation finished successfully.
+-------------------------------------------------+
| Exhaustive Search Results |
+-------------------------------------------------+
Model family: gaussian
Intercept: TRUE
Performance measure: AIC
Models fitted on: training set (n = 32)
Models evaluated on: training set (n = 32)
Models evaluated: 10
Models saved: 10
Total runtime: 00d 00h 00m 09s
Number of threads: 4
+-------------------------------------------------+
| Top Feature Sets |
+-------------------------------------------------+
AIC Combination
1 166.0294 wt
2 169.3064 cyl
3 170.2094 disp
4 181.2386 hp
5 190.7999 drat
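If the ExhaustiveSearch package cannot be installed, a similar single-predictor ranking can be sketched in base R. This is an alternative technique, not what the package does internally: it fits one lm per column and sorts the models by AIC(). The name aic_by_iv is hypothetical.

```r
# Candidate predictors: every column except the response mpg.
ivs <- setdiff(names(mtcars), "mpg")

# Fit mpg ~ iv for each candidate and record the model's AIC.
aic_by_iv <- sapply(ivs, function(v) AIC(lm(reformulate(v, "mpg"), data = mtcars)))

# Smaller AIC is better, so sort in increasing order.
sort(aic_by_iv)
```

This only covers the one-predictor case; ranking pairs would need a loop over combn(ivs, 2) instead.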
Use lapply and as.formula to create a list with the summary of each lm object:
library(tidyverse)
data_df <- mtcars %>% as_tibble()
##Target variables
target_vars <- c("cyl","disp","hp",
"drat","wt",
"qsec","vs")
##Write lapply and save summary objects
list_with_lm_objects <- lapply(target_vars, function(i) {
  lm_object <- lm(as.formula(paste0("mpg ~", i)), data = data_df)
  summary_of_lm <- summary(lm_object)
  return(summary_of_lm)
})
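As a possible extension of the lapply approach (a sketch, not the answerer's code), naming the input vector makes each summary retrievable by IV name, and the slope rows can be collected into one matrix for side-by-side comparison. Here am is added so that all eight IVs from the question are covered, and mtcars is used directly so that no extra package is needed. The names summaries and slope_table are illustrative.

```r
target_vars <- c("am", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs")

# setNames() gives the resulting list the IV names, e.g. summaries$wt.
summaries <- lapply(setNames(target_vars, target_vars), function(v) {
  summary(lm(as.formula(paste("mpg ~", v)), data = mtcars))
})

# Row 2 of each coefficient matrix is the slope of the single predictor.
slope_table <- t(sapply(summaries, function(s) coef(s)[2, ]))
slope_table
```

The full summary for any one model is still available, for example with print(summaries$wt).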