比较多个模型的回归模型系数的森林图分面网格

Question

我目前正在处理 30 个列名相同但数值数据不同的数据集。我需要对数据集的每个实例应用线性混合模型和广义线性模型，并将得到的固定效应系数绘制在森林图上。

数据的当前结构如下（为简单起见，对每个列表元素使用相同的数据集）：

library(lme4)

data_list <- list()

# There's definitely a better way of doing this through lapply(), I just can't figure out how
for (i in 1:30){
     data_list[[i]] <- tibble::as_tibble(mtcars) # this would originally load different data at every instance
}

compute_model_lmm <- function(data){
     lmer("mpg ~ hp + disp + drat + (1|cyl)", data = data)
}

result_list_lmm <- lapply(data_list, compute_model_lmm)

我目前正在做的是

library(modelsummary)

modelplot(result_list_lmm)+
     facet_wrap(~model) #modelplot() takes arguments/functions from ggplot2

这会花费大量时间，但它确实有效。

现在，我想比较同一地块上的另一个模型，如

compute_model_glm <- function(data){
     glm("mpg ~ hp + disp + drat + cyl", data = data)
}

result_list_glm <- lapply(data_list, compute_model_glm)

modelplot(list(result_list_lmm[[1]], result_list_glm[[1]]))

但是对于情节的每个实例。

如何将它指定为 modelplot()？

提前致谢！

Answer 1

modelplot 函数为您提供了一些绘制系数和区间的基本方法（例如，检查 facet 参数）。

然而，函数的真正威力来自使用 draw=FALSE 参数。在这种情况下，modelplot 会在一个方便的数据框架中为您提供估计值，并使用 modelplot 函数的所有重命名、稳健的标准误差和其他实用程序。然后，您可以使用该数据框自己绘制 ggplot2 以进行无限自定义。

library(modelsummary)
library(ggplot2)

results_lm <- lapply(1:10, function(x) lm(hp ~ mpg, data = mtcars)) |>
    modelplot(draw = FALSE) |>
    transform("Function" = "lm()")
results_glm <- lapply(1:10, function(x) glm(hp ~ mpg, data = mtcars)) |>
    modelplot(draw = FALSE) |>
    transform("Function" = "glm()")
results <- rbind(results_lm, results_glm) 

head(results)
          term   model estimate std.error conf.low conf.high Function
1  (Intercept) Model 1 324.0823   27.4333  268.056  380.1086     lm()
3  (Intercept) Model 2 324.0823   27.4333  268.056  380.1086     lm()
5  (Intercept) Model 3 324.0823   27.4333  268.056  380.1086     lm()
7  (Intercept) Model 4 324.0823   27.4333  268.056  380.1086     lm()
9  (Intercept) Model 5 324.0823   27.4333  268.056  380.1086     lm()
11 (Intercept) Model 6 324.0823   27.4333  268.056  380.1086     lm()

ggplot(results, aes(y = term, x = estimate, xmin = conf.low, xmax = conf.high)) +
    geom_pointrange(aes(color = Function), position = position_dodge(width = .5)) +
    facet_wrap(~model)

比较多个模型的回归模型系数的森林图分面网格

Forest plot facet grid comparing regression model coefficients from multiple models

r

ggplot2

facet-wrap

forestplot

modelsummary