Model seems to be over-predicting on the x axis (ggeffects)

My model seems to have a problem with over-prediction on the x axis, which makes the trend line start from an odd place. Why does this happen?

data = read.csv('TotMaxSize.csv')

Data:

    structure(list(Column1 = 1:6, yrblock15 = c(2004L, 2004L, 2004L, 
2004L, 2004L, 2004L), circleID = 1:6, ThreeYearRain = c(748.9863518, 
744.4805429, 748.6081666, 747.5941999, 746.3382951, 740.9514718
), time = c(5.270172597, 4.270172617, 3.348596103, 3.019112219, 
2.905252281, 2.773856447), claylake = c(0, 0, 0, 0, 0.01, 0), 
    spinsandplain = c(99.53, 90.39, 50.7, 63.8, 73.65, 82.73), 
    TotMaxSize = c(2058.592458, 936.2305886, 1652.692998, 2162.200459, 
    1062.143104, 1863.051545)), row.names = c(NA, 6L), class = "data.frame")

Packages loaded:

library(ggplot2);library(lme4);library(ggeffects);library(dplyr) 

Model:

m3 <- lmer(TotMaxSize~log(time)+spinsandplain+ThreeYearRain+claylake+ThreeYearRain*log(time)+(1|circleID),na.action=na.fail,data=data,REML=FALSE)

Plot:

d <-ggpredict(m3, terms = "time[exp]")
d <- rename(d, "time" = x, "TotMaxSize" = predicted)
ggplot(d, aes(time, TotMaxSize)) + 
  geom_point(data = data, colour = "orangered3") + 
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .1) +
  geom_line(size = 2, colour = "black") +
  theme_bw()

It produces this:

If I limit the axis, it produces this:

But the trend line still seems to start from an odd place?

You are using ggpredict(m3, terms = "time[exp]"), i.e. you exponentiate time. I assume that time cannot be negative, so the minimum value is exp(0), which is 1 (just a guess).
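The arithmetic behind that guess can be checked in base R. This is only a sketch of why an exponentiated grid cannot reach 0; `log_values` is a made-up vector for illustration, not what ggeffects computes internally.

```r
# Hypothetical non-negative inputs standing in for whatever gets exponentiated.
log_values <- seq(0, 3, by = 0.5)

# exp() is strictly increasing, so the smallest x value is exp(min(log_values)).
x_grid <- exp(log_values)
min(x_grid)
```

Whatever the inputs are, the resulting grid starts at exp() of the smallest one, so the line cannot extend back to x = 0.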

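Furthermore, if you are interested in the change over time, I would probably not use log(time) in the model, but rather log(TotMaxSize), or a log-transformed y axis (scale_y_log10()). Is log(time) intentional?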
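A minimal sketch of the log(TotMaxSize) idea, under loud assumptions: it swaps lmer for a plain lm, drops the other predictors and the random effect, and uses only the six rows posted in the question, so the fitted numbers mean nothing. It only shows that with untransformed time, a prediction at time = 0 is well defined, and that exp() brings the predictions back to the original scale.

```r
# Six rows copied from the question's data; only the two columns used here.
dat <- data.frame(
  time = c(5.270172597, 4.270172617, 3.348596103,
           3.019112219, 2.905252281, 2.773856447),
  TotMaxSize = c(2058.592458, 936.2305886, 1652.692998,
                 2162.200459, 1062.143104, 1863.051545)
)

# Log-transform the response instead of the time predictor (simplified model).
fit <- lm(log(TotMaxSize) ~ time, data = dat)

# Predict on a grid that starts at 0, then back-transform to the original scale.
new_time <- data.frame(time = seq(0, 6, by = 0.5))
new_time$TotMaxSize <- exp(predict(fit, newdata = new_time))
head(new_time)
```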

Another approach is to create a vector of exponentiated values and manually prepend the first few values:

# start just above 0: the model uses log(time), and log(0) is -Inf
v <- c(0.1, 0.5, exp(seq(0, 10, .5)))
ggpredict(m3, terms = "time[v]")
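One thing worth checking with a hand-built vector: because the model contains log(time), every grid value has to be strictly positive. The sketch below uses a hypothetical `grid` vector, not output from the model, to show the check in base R.

```r
# log(0) is -Inf, so a grid value of exactly 0 has no finite log.
log(0)

# Hypothetical grid that starts just above 0 instead.
grid <- c(0.1, 0.5, exp(seq(0, 10, by = 0.5)))

# TRUE means every value can safely go through log(time).
all(is.finite(log(grid)))
```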

Here is a toy example:

library(lme4)
#> Loading required package: Matrix
library(ggeffects)
library(ggplot2)
data("sleepstudy")
sleepstudy$Days <- sleepstudy$Days + 1
m <- lmer(Reaction ~ log(Days) + (1 + Days | Subject), data = sleepstudy)

## not starting at "0"

plot(ggpredict(m, "Days [exp]")) + 
  xlim(c(0, 25)) +
  coord_cartesian()
#> Warning: Removed 7 row(s) containing missing values (geom_path).

## starting at "0"

v <- c(0.1, 0.5, exp(seq(0, 5, .5)))

plot(ggpredict(m, "Days [v]")) + 
  xlim(c(0, 25)) +
  coord_cartesian()
#> Warning: Removed 4 row(s) containing missing values (geom_path).
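The two warning counts can be reproduced by hand, which also shows where each line starts. This sketch assumes that `[exp]` exponentiates the unique observed values of Days (1 to 10 after the shift). That is my reading of the ggeffects documentation, not something the output above states.

```r
# Assumed grid for "Days [exp]": exp() of the unique Days values 1..10.
exp_days <- exp(1:10)
min(exp_days)        # first x value of the first plot, about 2.72
sum(exp_days > 25)   # values cut off by xlim(c(0, 25))

# Grid for "Days [v]", copied from the example above.
v <- c(0.1, 0.5, exp(seq(0, 5, .5)))
min(v)               # first x value of the second plot
sum(v > 25)          # values cut off by xlim(c(0, 25))
```

Under that assumption the counts come out as 7 and 4, matching the two warnings, and the first line begins near 2.72 rather than at 0.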

Created on 2021-04-25 by the reprex package (v2.0.0)