Using nest and tidy to create formatted html regression output in RMarkdown
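I used tidyr::nest to run a series of logistic regression models with different dependent variables. I would like to output the results in RMarkdown as a single html table, with each model as a column and the rows showing the exponentiated coefficients and 99% CIs. I cannot figure out the workflow between nest, tidy, and a table-rendering package such as stargazer that makes this work. If I unnest my tidy output and pass it to stargazer, or if I pass the nested output (the variable called "model" in the nested data frame below) directly to stargazer, I get no output. Because of the exponentiated coefficients and 99% CIs, I would prefer to use the tidy output. I basically need this vignette to go one step farther and explain how to use the output of nest and tidy to create formatted regression tables. I have also looked elsewhere, but I find it hard to believe there isn't a simpler way to do this that I'm just missing.
Sample data, and my general approach to building the models: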
library(dplyr)   # %>%, group_by(), mutate()
library(tidyr)   # gather(), nest()
library(purrr)   # map()
library(broom)   # tidy()
id <- 1:2000
gender <- sample(0:1, 2000, replace = TRUE)
age <- sample(17:64, 2000, replace = TRUE)
race <- sample(0:1, 2000, replace = TRUE)
cond_a <- sample(0:1, 2000, replace = TRUE)
cond_b <- sample(0:1, 2000, replace = TRUE)
cond_c <- sample(0:1, 2000, replace = TRUE)
cond_d <- sample(0:1, 2000, replace = TRUE)
df <- data.frame(id, gender, age, race, cond_a, cond_b, cond_c, cond_d)
# One logistic regression per condition, tidied with exponentiated coefficients and 99% CIs
df %>%
  gather(c(cond_a, cond_b, cond_c, cond_d), key = "condition", value = "case") %>%
  group_by(condition) %>%
  nest() %>%
  mutate(model = map(data, ~glm(case ~ gender + age + race, family = "binomial", data = .)),
         tidy = map(model, tidy, exponentiate = TRUE, conf.int = TRUE, conf.level = 0.99))
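As a side note, gather() is superseded in current tidyr releases by pivot_longer(). The sketch below shows what the same nest-and-fit pipeline might look like with it. The seed, the reduced set of two condition columns, and the explicit broom::tidy reference are assumptions made only to keep the example short and reproducible.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
n <- 2000
df <- data.frame(
  id     = 1:n,
  gender = sample(0:1, n, replace = TRUE),
  age    = sample(17:64, n, replace = TRUE),
  race   = sample(0:1, n, replace = TRUE),
  cond_a = sample(0:1, n, replace = TRUE),
  cond_b = sample(0:1, n, replace = TRUE)
)

result <- df %>%
  # long format: one row per person and condition
  pivot_longer(starts_with("cond_"), names_to = "condition", values_to = "case") %>%
  group_by(condition) %>%
  nest() %>%
  mutate(
    # one logistic regression per condition
    model = map(data, ~ glm(case ~ gender + age + race, family = "binomial", data = .x)),
    # exponentiated coefficients with 99% confidence intervals
    tidy  = map(model, broom::tidy, exponentiate = TRUE, conf.int = TRUE, conf.level = 0.99)
  )
```

Each element of result$tidy is then a small data frame with one row per term, including the estimate, conf.low and conf.high columns.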
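I hope I understood you correctly. Essentially, you pass the fitted model objects and the confidence intervals to stargazer separately.
You can read the stargazer help page to see what input each argument expects; for example, custom CIs require: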
ci.custom: a list of two-column numeric matrices that will replace the
default confidence intervals for each model. The first and
second columns represent the lower and the upper bounds,
respectively. Matched by element names.
So in your case a little more work is needed. First, store the results:
result = df %>%
  gather(c(cond_a, cond_b, cond_c, cond_d), key = "condition", value = "case") %>%
  group_by(condition) %>%
  nest() %>%
  mutate(model = map(data, ~glm(case ~ gender + age + race, family = "binomial", data = .)),
         tidy = map(model, tidy, exponentiate = TRUE, conf.int = TRUE, conf.level = 0.99))
tidy_model = result$tidy   # list of tidy data frames, one per condition
fit = result$model         # list of fitted glm objects, one per condition
Then pull out the CIs and coefficients:
CI = lapply(tidy_model, function(i) as.matrix(i[, c("conf.low", "conf.high")]))
Coef = lapply(tidy_model, "[[", "estimate")
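The help text quoted above says the replacement values are matched by element names, while the vectors and matrices built here carry no names and so rely on position. A slightly more defensive variant, sketched below under the assumption that the term column of the tidy output equals the model's coefficient names, attaches those names explicitly. The single toy model and the seed are illustrative only.

```r
library(broom)

set.seed(1)  # assumption: seed and toy data are only for this sketch
d <- data.frame(
  case   = sample(0:1, 500, replace = TRUE),
  gender = sample(0:1, 500, replace = TRUE),
  age    = sample(17:64, 500, replace = TRUE)
)
fit <- list(glm(case ~ gender + age, family = "binomial", data = d))
tidy_model <- lapply(fit, tidy, exponentiate = TRUE, conf.int = TRUE, conf.level = 0.99)

# Named numeric vector of exponentiated coefficients per model
Coef <- lapply(tidy_model, function(i) setNames(i$estimate, i$term))

# Two-column numeric matrix (lower, upper) per model, rows named by term
CI <- lapply(tidy_model, function(i) {
  m <- as.matrix(i[, c("conf.low", "conf.high")])
  rownames(m) <- i$term
  m
})
```

With names attached, a quick check such as all.equal(Coef[[1]], exp(coef(fit[[1]]))) confirms that the values line up with the fitted model before they are handed to a table function.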
Then call stargazer:
library(stargazer)
stargazer(fit, type = "text",
          coef = Coef,
          ci.custom = CI)
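The question asks for an html table in RMarkdown, whereas the call above prints plain text. A minimal sketch of the html variant follows. It assumes the call sits in a chunk whose header sets results = 'asis' so that knitr passes the markup through untouched. The toy data, the seed, and the conds vector used for column labels are illustrative and not part of the original answer.

```r
library(broom)
library(stargazer)

set.seed(1)  # assumption: seed and smaller sample are only for this sketch
n <- 500
d <- data.frame(
  gender = sample(0:1, n, replace = TRUE),
  age    = sample(17:64, n, replace = TRUE),
  cond_a = sample(0:1, n, replace = TRUE),
  cond_b = sample(0:1, n, replace = TRUE)
)
conds <- c("cond_a", "cond_b")

# One logistic regression per condition, each with the outcome named "case"
fit <- lapply(conds, function(v) {
  d$case <- d[[v]]
  glm(case ~ gender + age, family = "binomial", data = d)
})
tidy_model <- lapply(fit, tidy, exponentiate = TRUE, conf.int = TRUE, conf.level = 0.99)
Coef <- lapply(tidy_model, "[[", "estimate")
CI   <- lapply(tidy_model, function(i) as.matrix(i[, c("conf.low", "conf.high")]))

# In the .Rmd file, place this call in a chunk with results = 'asis'.
# stargazer prints the table and also returns the lines invisibly.
html_lines <- stargazer(fit, type = "html",
                        coef = Coef,
                        ci.custom = CI,
                        column.labels = conds)
```

Because every model shares the outcome name case, column.labels is used here to tell the columns apart; with the nested data frame from the answer, result$condition would play the same role.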
=============================================================================
                                      Dependent variable:
                  -----------------------------------------------------------
                                             case
                       (1)            (2)            (3)            (4)
-----------------------------------------------------------------------------
gender               0.996***       1.182***       1.196***       0.921***
                  (0.790, 1.254) (0.938, 1.489) (0.950, 1.508) (0.731, 1.161)
age                  1.004***       1.001***       0.999***       1.005***
                  (0.995, 1.012) (0.993, 1.009) (0.990, 1.007) (0.996, 1.013)
race                 0.911***       0.895***       0.944***       1.213***
                  (0.724, 1.148) (0.711, 1.128) (0.749, 1.189) (0.963, 1.529)
Constant             0.919***       0.997***       0.959***       0.761***
                  (0.623, 1.356) (0.676, 1.472) (0.649, 1.415) (0.515, 1.123)
-----------------------------------------------------------------------------
Observations          2,000          2,000          2,000          2,000
Log Likelihood      -1,385.107     -1,382.664     -1,383.411     -1,382.272
Akaike Inf. Crit.    2,778.215      2,773.329      2,774.821      2,772.544
=============================================================================
Note:                                             *p<0.1; **p<0.05; ***p<0.01
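One caveat about the table above: every estimate carries ***, although all of the 99% intervals include 1. The stargazer documentation describes the t.auto and p.auto arguments as recomputing test statistics and p-values whenever coefficients are supplied by the user, so the exponentiated estimates (all close to 1) are presumably being divided by the original log-odds standard errors. A minimal sketch of switching that off, so the stars come from the fitted models themselves, is shown below. The toy data and seed are assumptions for illustration.

```r
library(broom)
library(stargazer)

set.seed(1)  # assumption: seed and toy data are only for this sketch
d <- data.frame(
  case   = sample(0:1, 500, replace = TRUE),
  gender = sample(0:1, 500, replace = TRUE),
  age    = sample(17:64, 500, replace = TRUE)
)
fit <- list(glm(case ~ gender + age, family = "binomial", data = d))
tidy_model <- lapply(fit, tidy, exponentiate = TRUE, conf.int = TRUE, conf.level = 0.99)
Coef <- lapply(tidy_model, "[[", "estimate")
CI   <- lapply(tidy_model, function(i) as.matrix(i[, c("conf.low", "conf.high")]))

# t.auto = FALSE and p.auto = FALSE ask stargazer to keep the models' own
# z statistics and p-values instead of recomputing them from Coef
txt <- stargazer(fit, type = "text",
                 coef = Coef,
                 ci.custom = CI,
                 t.auto = FALSE,
                 p.auto = FALSE)
```

An alternative, under the same assumptions, would be to pass the p.value column of each tidy data frame through stargazer's p argument.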