Model seems to be over-predicting on the x axis (ggeffects)
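My model seems to have a problem with over-predicting on the x axis. As a result, the trend line starts from an odd place. I am looking for the reason why this happens.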
data = read.csv('TotMaxSize.csv')
Data:
structure(list(Column1 = 1:6, yrblock15 = c(2004L, 2004L, 2004L,
2004L, 2004L, 2004L), circleID = 1:6, ThreeYearRain = c(748.9863518,
744.4805429, 748.6081666, 747.5941999, 746.3382951, 740.9514718
), time = c(5.270172597, 4.270172617, 3.348596103, 3.019112219,
2.905252281, 2.773856447), claylake = c(0, 0, 0, 0, 0.01, 0),
spinsandplain = c(99.53, 90.39, 50.7, 63.8, 73.65, 82.73),
TotMaxSize = c(2058.592458, 936.2305886, 1652.692998, 2162.200459,
1062.143104, 1863.051545)), row.names = c(NA, 6L), class = "data.frame")
Packages loaded:
library(ggplot2);library(lme4);library(ggeffects);library(dplyr)
Model:
m3 <- lmer(TotMaxSize~log(time)+spinsandplain+ThreeYearRain+claylake+ThreeYearRain*log(time)+(1|circleID),na.action=na.fail,data=data,REML=FALSE)
Plot:
d <- ggpredict(m3, terms = "time[exp]")
d <- rename(d, "time" = x, "TotMaxSize" = predicted)
ggplot(d, aes(time, TotMaxSize)) +
  geom_point(data = data, colour = "orangered3") +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .1) +
  geom_line(size = 2, colour = "black") +
  theme_bw()
It produces this:
If I limit the axes, it produces this:
But the trend line seems to start from an odd place?
You are using ggpredict(m3, terms = "time[exp]"), i.e. you exponentiate time. I assume time cannot be negative, so the minimum value is exp(0), which is 1 (just a guess).
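The arithmetic behind that guess can be checked in base R. This is only a sketch: log_grid is a made-up grid of non-negative values, not the values ggpredict() uses internally. It merely shows that exponentiating non-negative numbers never yields anything below 1.

```r
# Hypothetical grid of non-negative values on the log scale
# (an assumption for illustration, not what ggeffects computes).
log_grid <- seq(0, 2, by = 0.5)

# Back-transforming with exp() gives values that start at exp(0) = 1.
time_grid <- exp(log_grid)

min(time_grid)
range(time_grid)
```

If the values being exponentiated are all non-negative, the smallest x value on the plot would be 1, so the line would not reach down to 0.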
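Furthermore, if you are interested in change over time, I would probably not use log(time) in the model, but rather log(TotMaxSize), or plot with a log-transformed y-axis (scale_y_log10()). Is log(time) intentional?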
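A rough sketch of that alternative, using only the six rows posted in the question. Note the assumptions: m_log is a plain lm() stand-in for m3 (six rows with six circleID levels cannot support the random intercept or the other covariates), and dat, m_log and new_dat are names made up for this illustration.

```r
library(ggplot2)

# The six rows posted in the question (only the two columns needed here).
dat <- data.frame(
  time = c(5.270172597, 4.270172617, 3.348596103,
           3.019112219, 2.905252281, 2.773856447),
  TotMaxSize = c(2058.592458, 936.2305886, 1652.692998,
                 2162.200459, 1062.143104, 1863.051545)
)

# Simplified stand-in for m3 (an assumption): log-transform the response
# instead of time, and drop the random effect and other covariates.
m_log <- lm(log(TotMaxSize) ~ time, data = dat)

# Predict over the observed range of time, then back-transform the
# predictions from the log scale to the original scale.
new_dat <- data.frame(time = seq(min(dat$time), max(dat$time), length.out = 50))
new_dat$TotMaxSize <- exp(predict(m_log, newdata = new_dat))

# Observed points plus the fitted line, drawn on a log10 y-axis.
p <- ggplot(dat, aes(time, TotMaxSize)) +
  geom_point(colour = "orangered3") +
  geom_line(data = new_dat) +
  scale_y_log10() +
  theme_bw()
p
```

Because time stays on its original scale here, no [exp] back-transformation of the x values is needed, and the line covers exactly the range passed in new_dat.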
Another approach would be to create a vector of exponentiated values and add the first values manually:
# log(0) is -Inf, so start slightly above 0 rather than at 0 itself
v <- c(0.1, 0.5, exp(seq(0, 10, .5)))
ggpredict(m3, terms = "time[v]")
Here is a toy example:
library(lme4)
#> Loading required package: Matrix
library(ggeffects)
library(ggplot2)
data("sleepstudy")
sleepstudy$Days <- sleepstudy$Days + 1
m <- lmer(Reaction ~ log(Days) + (1 + Days | Subject), data = sleepstudy)
## not starting at "0"
plot(ggpredict(m, "Days [exp]")) +
  xlim(c(0, 25)) +
  coord_cartesian()
#> Warning: Removed 7 row(s) containing missing values (geom_path).
## starting at "0"
v <- c(0.1, 0.5, exp(seq(0, 5, .5)))
plot(ggpredict(m, "Days [v]")) +
  xlim(c(0, 25)) +
  coord_cartesian()
#> Warning: Removed 4 row(s) containing missing values (geom_path).
Created on 2021-04-25 by the reprex package (v2.0.0)