Plot generalized linear mixed models (GLMMs): mixture of categorical and numeric variables

Question

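I would like to plot the relationship between the count response variable ladenant and the predictors Bioma (categorical) and temp (numeric), using a negative binomial generalized linear mixed model (GLMM), but without success. I tried: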

#Packages
library(lme4)
library(ggplot2)
library(ggeffects)

#Open my dataset
myds<-read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_glmm_dataset.csv")
myds <- myds[,-c(3)] # remove bad character variable

# Negative binomial GLMM
m.laden.1  <- glmer.nb(ladenant ~ Bioma +  poly(temp,2) + scale(UR) + (1 | formigueiro), data = DataBase)

# Plot the results
mydf <- ggpredict(m.laden.1, terms = c("temp","Bioma"))
ggplot(mydf, aes(x, predicted), group = Bioma) +
  geom_point(DataBase, aes(temp, ladenant), alpha = 0.5) + # Observed ladenant response variable
  geom_line() +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .1)

The resulting plot is not good: there is no separate line for each Bioma level, and the temp curve is drawn incorrectly.

However, the mydf object does contain the Bioma variable:

mydf 

# # Predicted counts of ladenant

# # Bioma = Atlantic Forest

# temp | Predicted |         95% CI
# ---------------------------------
#   10 |      1.88 | [ 0.81,  4.35]
#   15 |     12.95 | [ 9.11, 18.40]
#   20 |     32.61 | [26.42, 40.25]
#   25 |     30.00 | [23.51, 38.28]
#   30 |     10.08 | [ 4.79, 21.24]
#   35 |      1.24 | [ 0.24,  6.43]

# # Bioma = Transition

# temp | Predicted |          95% CI
# ----------------------------------
#   10 |      6.84 | [ 3.04,  15.42]
#   15 |     47.17 | [34.05,  65.34]
#   20 |    118.79 | [92.27, 152.94]
#   25 |    109.29 | [76.84, 155.43]
#   30 |     36.73 | [16.17,  83.44]
#   35 |      4.51 | [ 0.82,  24.71]

# # Bioma = Pampa

# temp | Predicted |         95% CI
# ---------------------------------
#   10 |      1.42 | [ 0.70,  2.90]
#   15 |      9.80 | [ 7.47, 12.86]
#   20 |     24.69 | [18.74, 32.52]
#   25 |     22.71 | [16.46, 31.35]
#   30 |      7.63 | [ 3.65, 15.96]
#   35 |      0.94 | [ 0.19,  4.67]

# Adjusted for:
# *          UR = 82.78
# * formigueiro = 0 (population-level)

Please, is there any way to improve this plot?

Answer 1

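I think you just need to pay attention to the different names of the variables in the two objects myds and mydf, and to where you place them in the calls to the various geoms: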

library(lme4)
#> Loading required package: Matrix
library(ggplot2)
library(ggeffects)

#Open my dataset
myds<-read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_glmm_dataset.csv")
myds <- myds[,-c(3)] # remove bad character variable

# Negative binomial GLMM
m.laden.1 <- glmer.nb(ladenant ~ Bioma +  poly(temp,2) + scale(UR) + (1 | formigueiro),
                      data = myds)

# Plot the results
mydf <- ggpredict(m.laden.1, terms = c("temp [all]", "Bioma"))

ggplot(mydf, aes(x, predicted)) +
  geom_point(data=myds, aes(temp, ladenant, color = Bioma), alpha = 0.5) + 
  geom_line(aes(color = group)) +
  labs(x = "temp", y = "ladenant")

Note that I did not include your geom_ribbon, because conf.low and conf.high are both NA in the upper part of the curve, which makes the plot look messy.
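If the confidence band is still wanted, one possible approach is to draw the ribbon only from rows where both bounds are present, and to map fill to group so each Bioma gets its own band. This is a minimal sketch, not part of the code above: pred is a made-up stand-in that only mimics the column names of the ggpredict() output (x, predicted, conf.low, conf.high, group), and ribbon_df is a hypothetical helper name.

```r
library(ggplot2)

# Hypothetical stand-in for the ggpredict() result: same column names,
# but the values are invented for illustration only.
pred <- data.frame(
  x         = rep(c(10, 15, 20, 25, 30, 35), times = 2),
  predicted = c(2, 13, 33, 30, 10, 1, 7, 47, 119, 109, 37, 5),
  group     = rep(c("Atlantic Forest", "Transition"), each = 6)
)
pred$conf.low  <- pred$predicted * 0.7
pred$conf.high <- pred$predicted * 1.4

# Mimic missing interval bounds near the top of each curve
pred$conf.low[c(3, 9)]  <- NA
pred$conf.high[c(3, 9)] <- NA

# Keep only rows where both bounds are available for the ribbon layer
ribbon_df <- pred[complete.cases(pred[, c("conf.low", "conf.high")]), ]

p <- ggplot(pred, aes(x, predicted, color = group)) +
  geom_ribbon(data = ribbon_df,
              aes(ymin = conf.low, ymax = conf.high, fill = group),
              alpha = 0.1, color = NA) +   # color = NA hides the ribbon outline
  geom_line() +
  labs(x = "temp", y = "ladenant")
```

Dropping the NA rows makes the ribbon bridge the gap with a straight segment between the neighbouring points, so whether this is appropriate depends on why the bounds are missing in the first place.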

Incidentally, the plot may be more informative with a log scale on the y axis:

ggplot(mydf, aes(x, predicted)) +
  geom_point(data=myds, aes(temp, ladenant, color = Bioma), alpha = 0.5) + 
  geom_line(aes(color = group)) +
  scale_y_log10() +
  labs(x = "temp", y = "ladenant")
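One caveat with scale_y_log10(): a count of zero maps to -Inf and the point is dropped from the plot. Whether myds contains zero counts is not shown here, so the following is only a sketch with made-up data (obs is hypothetical). It uses the log1p transformation, which stays finite at zero, as a possible alternative.

```r
library(ggplot2)

# Made-up observations (not taken from myds); note the zero count
obs <- data.frame(temp = c(10, 20, 30, 35), ladenant = c(0, 40, 12, 1))

# log1p(y) = log(1 + y) is finite at y = 0, unlike log10(y).
# Newer ggplot2 releases name this argument `transform` instead of `trans`.
p_log1p <- ggplot(obs, aes(temp, ladenant)) +
  geom_point() +
  scale_y_continuous(trans = "log1p") +
  labs(x = "temp", y = "ladenant")
```

The same scale could be added to the model plot in place of scale_y_log10(); the axis is then no longer a pure log scale, which matters when reading values off it.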

Created on 2021-11-12 by the reprex package (v2.0.0)

Answer 2

You can also use plot(), which returns a ggplot object, and add extra layers as needed.

library(lme4)
#> Loading required package: Matrix
library(ggplot2)
library(ggeffects)

#Open my dataset
myds<-read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_glmm_dataset.csv")
myds <- myds[,-c(3)] # remove bad character variable

# Negative binomial GLMM
m.laden.1  <- glmer.nb(ladenant ~ Bioma +  poly(temp,2) + scale(UR) + (1 | formigueiro), data = myds)

mydf <- ggpredict(m.laden.1, terms = c("temp [all]","Bioma"))

plot(mydf, add.data = TRUE, ci = FALSE)

plot(mydf, add.data = TRUE, ci = FALSE) + ggplot2::scale_y_log10()
#> Scale for 'y' is already present. Adding another scale for 'y', which will
#> replace the existing scale.

Created on 2021-11-17 by the reprex package (v2.0.1)
