如何在一个循环中拟合多个交互模型?
How to fit multiple interaction models in a loop?
假设我有 3 个响应变量 A、C 和 M,我想为所有可能的模型拟合一个模型,即拟合 Y ~ A、Y ~ C、Y ~ M、Y ~ A * C、Y ~ A * M, Y ~ C * M, etc. 有没有一种快速的方法可以不用每次都手动指定交互?
不想写
M1 = glm(Y ~ A , data = subs, family = "poisson")
M2 = glm(Y ~ C , data = subs, family = "poisson")
M3 = glm(Y ~ M , data = subs, family = "poisson")
M4 = glm(Y ~ A*C , data = subs, family = "poisson")
...
实际上我有 3 个以上的变量并且想要某种循环,这是否可能。
谢谢
将所有 RHS 变量放入一个 v
向量中,并使用 combn
获得一和二的组合(使用 lapply
)。我们用 reformulate
和 paste
得到的公式用 *
折叠。在 combn
之后,我们需要另一个 lapply
来循环组合。
v <- c("X1", "X2", "X3")
res <- lapply(1:2, function(n) {
.env <- environment()
cb <- combn(c("X1", "X2", "X3"), n, function(x) paste(x, collapse=" * "))
lapply(cb, function(cb) lm(reformulate(cb, "y", env=.env), dat))
})
结果
res
# [[1]]
# [[1]][[1]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X1
# -0.3433 0.3382
#
#
# [[1]][[2]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X2
# 0.008104 1.017076
#
#
# [[1]][[3]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X3
# 0.02774 1.04382
#
#
#
# [[2]]
# [[2]][[1]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X1 X2 X1:X2
# -0.577 1.408 1.157 0.296
#
#
# [[2]][[2]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X1 X3 X1:X3
# 0.7378 -0.6141 1.3707 -1.1076
#
#
# [[2]][[3]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X2 X3 X2:X3
# 0.257141 1.148571 1.290523 -0.009836
数据:
dat <- structure(list(X1 = c(1.37095844714667, -0.564698171396089, 0.363128411337339,
0.63286260496104, 0.404268323140999, -0.106124516091484, 1.51152199743894,
-0.0946590384130976, 2.01842371387704, -0.062714099052421), X2 = c(1.30486965422349,
2.28664539270111, -1.38886070111234, -0.278788766817371, -0.133321336393658,
0.635950398070074, -0.284252921416072, -2.65645542090478, -2.44046692857552,
1.32011334573019), X3 = c(-0.306638594078475, -1.78130843398,
-0.171917355759621, 1.2146746991726, 1.89519346126497, -0.4304691316062,
-0.25726938276893, -1.76316308519478, 0.460097354831271, -0.639994875960119
), y = c(2.8246396305329, 0.645476124553837, -0.162546123564699,
0.959822161909057, 2.67109557131028, -1.61765192870095, 0.185540684874441,
-5.36518513868917, -2.37615350981384, 0.653526977609908)), row.names = c(NA,
-10L), class = "data.frame")
这是一种函数式编程方法。
您创建数据,只要您的 Y
是第一列,此代码就会采用所有其余变量(无论有多少)并根据它们的组合构建模型。
最后,既然你已经在这个框架中完成了,你可以调用 broom 的 tidy
和 confint_tidy
将结果提取到一个易于过滤的数据集中。
DF <- data_frame(Y = rpois(100, 5),
A = rnorm(100),
C = rnorm(100),
M = rnorm(100))
formula_frame <- bind_rows(data_frame(V1 = names(DF[,-1])),
as_data_frame(t(combn(names(DF[,-1]),2)))) %>%
rowwise() %>%
mutate(formula_text = paste0("Y ~", if_else(is.na(V2),
V1,
paste(V1,V2, sep = "*"))),
formula_obj = list(as.formula(formula_text))) %>%
ungroup()
formula_frame %>%
mutate(fits = map(formula_obj, ~glm(.x, family = "poisson", data = DF) %>%
(function(X)bind_cols(broom::tidy(X),broom::confint_tidy((X)))))) %>%
unnest(fits) %>%
select(-formula_obj)
# A tibble: 18 x 10
V1 V2 formula_text term estimate std.error statistic p.value conf.low conf.high
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A NA Y ~A (Intercept) 1.63 0.0443 36.8 6.92e-297 1.54 1.72
2 A NA Y ~A A 0.0268 0.0444 0.602 5.47e- 1 -0.0603 0.114
3 C NA Y ~C (Intercept) 1.63 0.0443 36.8 5.52e-296 1.54 1.72
4 C NA Y ~C C 0.0326 0.0466 0.699 4.84e- 1 -0.0587 0.124
5 M NA Y ~M (Intercept) 1.63 0.0454 35.8 1.21e-280 1.53 1.71
6 M NA Y ~M M -0.0291 0.0460 -0.634 5.26e- 1 -0.119 0.0615
7 A C Y ~A*C (Intercept) 1.62 0.0446 36.4 5.64e-290 1.54 1.71
8 A C Y ~A*C A 0.00814 0.0459 0.178 8.59e- 1 -0.0816 0.0982
9 A C Y ~A*C C 0.0410 0.0482 0.850 3.96e- 1 -0.0532 0.136
10 A C Y ~A*C A:C 0.0650 0.0474 1.37 1.70e- 1 -0.0270 0.158
11 A M Y ~A*M (Intercept) 1.62 0.0458 35.5 1.21e-275 1.53 1.71
12 A M Y ~A*M A 0.0232 0.0451 0.514 6.07e- 1 -0.0653 0.112
13 A M Y ~A*M M -0.0260 0.0464 -0.561 5.75e- 1 -0.116 0.0655
14 A M Y ~A*M A:M -0.00498 0.0480 -0.104 9.17e- 1 -0.0992 0.0887
15 C M Y ~C*M (Intercept) 1.60 0.0472 34.0 1.09e-253 1.51 1.70
16 C M Y ~C*M C 0.0702 0.0506 1.39 1.65e- 1 -0.0291 0.169
17 C M Y ~C*M M -0.0333 0.0479 -0.695 4.87e- 1 -0.127 0.0611
18 C M Y ~C*M C:M 0.0652 0.0377 1.73 8.39e- 2 -0.0102 0.138
这应该有效:
glmulti::glmulti(
Y = "Y",
xr = c("A", "C", "M"),
data = subs,
filename = "my_results",
family = "poisson"
)
它将创建一个文本文件 my_results.txt
,其中包含有关每个模型的信息。
您也可以将它作为 one-liner 与其他包一起使用,leaps
、bestglm
,可能还有其他包。
假设我有 3 个响应变量 A、C 和 M,我想为所有可能的模型拟合一个模型,即拟合 Y ~ A、Y ~ C、Y ~ M、Y ~ A * C、Y ~ A * M, Y ~ C * M, etc. 有没有一种快速的方法可以不用每次都手动指定交互?
不想写
M1 = glm(Y ~ A , data = subs, family = "poisson")
M2 = glm(Y ~ C , data = subs, family = "poisson")
M3 = glm(Y ~ M , data = subs, family = "poisson")
M4 = glm(Y ~ A*C , data = subs, family = "poisson")
...
实际上我有 3 个以上的变量并且想要某种循环,这是否可能。 谢谢
将所有 RHS 变量放入一个 v
向量中,并使用 combn
获得一和二的组合(使用 lapply
)。我们用 reformulate
和 paste
得到的公式用 *
折叠。在 combn
之后,我们需要另一个 lapply
来循环组合。
v <- c("X1", "X2", "X3")
res <- lapply(1:2, function(n) {
.env <- environment()
cb <- combn(c("X1", "X2", "X3"), n, function(x) paste(x, collapse=" * "))
lapply(cb, function(cb) lm(reformulate(cb, "y", env=.env), dat))
})
结果
res
# [[1]]
# [[1]][[1]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X1
# -0.3433 0.3382
#
#
# [[1]][[2]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X2
# 0.008104 1.017076
#
#
# [[1]][[3]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X3
# 0.02774 1.04382
#
#
#
# [[2]]
# [[2]][[1]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X1 X2 X1:X2
# -0.577 1.408 1.157 0.296
#
#
# [[2]][[2]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X1 X3 X1:X3
# 0.7378 -0.6141 1.3707 -1.1076
#
#
# [[2]][[3]]
#
# Call:
# lm(formula = reformulate(cb, "y", env = .env), data = dat)
#
# Coefficients:
# (Intercept) X2 X3 X2:X3
# 0.257141 1.148571 1.290523 -0.009836
数据:
dat <- structure(list(X1 = c(1.37095844714667, -0.564698171396089, 0.363128411337339,
0.63286260496104, 0.404268323140999, -0.106124516091484, 1.51152199743894,
-0.0946590384130976, 2.01842371387704, -0.062714099052421), X2 = c(1.30486965422349,
2.28664539270111, -1.38886070111234, -0.278788766817371, -0.133321336393658,
0.635950398070074, -0.284252921416072, -2.65645542090478, -2.44046692857552,
1.32011334573019), X3 = c(-0.306638594078475, -1.78130843398,
-0.171917355759621, 1.2146746991726, 1.89519346126497, -0.4304691316062,
-0.25726938276893, -1.76316308519478, 0.460097354831271, -0.639994875960119
), y = c(2.8246396305329, 0.645476124553837, -0.162546123564699,
0.959822161909057, 2.67109557131028, -1.61765192870095, 0.185540684874441,
-5.36518513868917, -2.37615350981384, 0.653526977609908)), row.names = c(NA,
-10L), class = "data.frame")
这是一种函数式编程方法。
您创建数据,只要您的 Y
是第一列,此代码就会采用所有其余变量(无论有多少)并根据它们的组合构建模型。
最后,既然你已经在这个框架中完成了,你可以调用 broom 的 tidy
和 confint_tidy
将结果提取到一个易于过滤的数据集中。
DF <- data_frame(Y = rpois(100, 5),
A = rnorm(100),
C = rnorm(100),
M = rnorm(100))
formula_frame <- bind_rows(data_frame(V1 = names(DF[,-1])),
as_data_frame(t(combn(names(DF[,-1]),2)))) %>%
rowwise() %>%
mutate(formula_text = paste0("Y ~", if_else(is.na(V2),
V1,
paste(V1,V2, sep = "*"))),
formula_obj = list(as.formula(formula_text))) %>%
ungroup()
formula_frame %>%
mutate(fits = map(formula_obj, ~glm(.x, family = "poisson", data = DF) %>%
(function(X)bind_cols(broom::tidy(X),broom::confint_tidy((X)))))) %>%
unnest(fits) %>%
select(-formula_obj)
# A tibble: 18 x 10
V1 V2 formula_text term estimate std.error statistic p.value conf.low conf.high
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A NA Y ~A (Intercept) 1.63 0.0443 36.8 6.92e-297 1.54 1.72
2 A NA Y ~A A 0.0268 0.0444 0.602 5.47e- 1 -0.0603 0.114
3 C NA Y ~C (Intercept) 1.63 0.0443 36.8 5.52e-296 1.54 1.72
4 C NA Y ~C C 0.0326 0.0466 0.699 4.84e- 1 -0.0587 0.124
5 M NA Y ~M (Intercept) 1.63 0.0454 35.8 1.21e-280 1.53 1.71
6 M NA Y ~M M -0.0291 0.0460 -0.634 5.26e- 1 -0.119 0.0615
7 A C Y ~A*C (Intercept) 1.62 0.0446 36.4 5.64e-290 1.54 1.71
8 A C Y ~A*C A 0.00814 0.0459 0.178 8.59e- 1 -0.0816 0.0982
9 A C Y ~A*C C 0.0410 0.0482 0.850 3.96e- 1 -0.0532 0.136
10 A C Y ~A*C A:C 0.0650 0.0474 1.37 1.70e- 1 -0.0270 0.158
11 A M Y ~A*M (Intercept) 1.62 0.0458 35.5 1.21e-275 1.53 1.71
12 A M Y ~A*M A 0.0232 0.0451 0.514 6.07e- 1 -0.0653 0.112
13 A M Y ~A*M M -0.0260 0.0464 -0.561 5.75e- 1 -0.116 0.0655
14 A M Y ~A*M A:M -0.00498 0.0480 -0.104 9.17e- 1 -0.0992 0.0887
15 C M Y ~C*M (Intercept) 1.60 0.0472 34.0 1.09e-253 1.51 1.70
16 C M Y ~C*M C 0.0702 0.0506 1.39 1.65e- 1 -0.0291 0.169
17 C M Y ~C*M M -0.0333 0.0479 -0.695 4.87e- 1 -0.127 0.0611
18 C M Y ~C*M C:M 0.0652 0.0377 1.73 8.39e- 2 -0.0102 0.138
这应该有效:
glmulti::glmulti(
Y = "Y",
xr = c("A", "C", "M"),
data = subs,
filename = "my_results",
family = "poisson"
)
它将创建一个文本文件 my_results.txt
,其中包含有关每个模型的信息。
您也可以将它作为 one-liner 与其他包一起使用,leaps
、bestglm
,可能还有其他包。