How to specify the non-linear interaction of two factor variables in generalised additive models [R]
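I have a time series dataset with a continuous outcome variable and two factor predictors (one with 6 levels and one with 2 levels).
I would like to model the non-linear interaction of the two factor variables on the continuous outcome.
This is the model I have so far: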
library(mgcv)
model <- bam(
  outcome ~
    factor_1 + factor_2 +
    s(time, k = 9) +
    s(time, by = factor_1, k = 9) +
    s(time, by = factor_2, k = 9),
  data = df
)
summary(model)
Family: gaussian
Link function: identity

Formula:
outcome ~ factor_1 + factor_2 + s(time, k = 9) + s(time, by = factor_1,
    k = 9) + s(time, by = factor_2, k = 9)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2612.72      23.03 113.465   <2e-16 ***
factor_1b      33.19      27.00   1.229     0.22
factor_2z    -488.52      27.00 -18.093   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                    edf Ref.df      F  p-value
s(time)           2.564  3.184  6.408 0.000274 ***
s(time):factor_1b 1.000  1.001  0.295 0.587839
s(time):factor_2z 2.246  2.792 34.281  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.679   Deviance explained = 69.1%
fREML = 1359.6  Scale est. = 37580     n = 207
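Now I would like to add a non-linear interaction of factor_1 and factor_2 with time on outcome, so that the smooth over time can differ for each combination of levels (for example, factor_2 has a stronger non-linear effect for some levels of factor_1). Something like s(time, factor_1, factor_2) or s(time, factor_1, by = factor_2) does not work.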
Using interaction() to include the interaction of the two factors seems to do the job.
library(mgcv)
# The following assumes factors are ordered with treatment contrast.
model <- bam(
  outcome ~
    interaction(factor_1, factor_2) +
    s(time, k = 9) +
    s(time, by = interaction(factor_1, factor_2), k = 9),
  data = df
)
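Since the comment in the code above relies on the by-factor being ordered with treatment contrasts, one way to make that explicit is to build the combined factor as its own column before fitting. The following is only a sketch under assumptions: the data are simulated so that the example runs on its own (with 3 levels in factor_1 rather than 6), and the names f12 and model_int are made up for illustration. With an ordered by-factor, mgcv fits no smooth for the first level, so s(time) acts as the smooth for the reference combination and each by-smooth is a difference from it.

```r
library(mgcv)

# Simulated stand-in for `df` (hypothetical data, only so the sketch runs).
set.seed(1)
df <- expand.grid(
  time     = seq(0, 1, length.out = 30),
  factor_1 = factor(c("a", "b", "c")),
  factor_2 = factor(c("y", "z"))
)
df$outcome <- sin(2 * pi * df$time) * (df$factor_2 == "z") +
  as.integer(df$factor_1) + rnorm(nrow(df), sd = 0.3)

# Build the combined factor once, as a column (`f12` is a hypothetical name).
df$f12 <- interaction(df$factor_1, df$factor_2)

# Make it ordered with treatment contrasts, so that s(time) is the smooth for
# the reference combination and each by-smooth is a difference from it.
df$f12 <- as.ordered(df$f12)
contrasts(df$f12) <- "contr.treatment"

model_int <- bam(
  outcome ~ f12 + s(time, k = 9) + s(time, by = f12, k = 9),
  data = df
)

# One s(time) row plus one difference smooth per non-reference combination.
summary(model_int)$s.table
```

Keeping the combined factor as a column also makes it easier to supply the same variable in newdata when calling predict().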