Avoid repeatedly writing the model formula when fitting multiple linear regression models

Question

I want to fit several similar linear regression models in R, for example:

lm(y ~ x1 + x2 + x3 + x4 + x5, data = df)
lm(y ~ x1 + x2 + x3 + x4 + x5 + x6, data = df)
lm(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7, data = df)

How can I store the shared part as a "base" formula so that I do not have to repeat it every time? The base would be:

y ~ x1 + x2 + x3 + x4 + x5

How can I then do something like the following (which obviously does not work)?

lm(base + x6, data = df)

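Searching Stack Overflow, I found that I could build a data frame containing only the variables of interest and use . to shorten the model formula, but I would like to know whether that can be avoided.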

Answer 1

You can use update.formula to update a model formula. For example:

base <- y ~ x1 + x2 + x3 + x4 + x5
update.formula(base, . ~ . + x6)
#y ~ x1 + x2 + x3 + x4 + x5 + x6
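
As a minimal sketch of how this covers all three models from the question, the same call can be chained, and terms can be removed with a minus sign. The names f6, f7 and f_drop are only illustrative.

```r
# Base formula shared by all models
base <- y ~ x1 + x2 + x3 + x4 + x5

# In the second argument, `.` stands for the existing left- or right-hand side
f6 <- update.formula(base, . ~ . + x6)
f7 <- update.formula(f6, . ~ . + x7)

# A term can be dropped in the same way
f_drop <- update.formula(base, . ~ . - x5)

print(f6)
print(f7)
print(f_drop)
```

Each resulting formula can be passed straight to lm(), e.g. lm(f6, data = df).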

If you want to supply the new variable name as a character string, here is a string-based version:

## `deparse` converts a model formula to a string
formula(paste(deparse(base), "x6", sep = " + "))
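
If several names arrive as a character vector, the string approach might be extended as sketched below. The vector extra is a hypothetical input. Note that deparse() can split a long formula across several strings, so the pieces are collapsed first; reformulate() is shown as an alternative that avoids pasting a formula string by hand.

```r
base  <- y ~ x1 + x2 + x3 + x4 + x5
extra <- c("x6", "x7")  # hypothetical: new variable names given as strings

# deparse() may return more than one string for a long formula, so collapse it
base_chr <- paste(deparse(base), collapse = " ")
f_str <- as.formula(paste(base_chr, paste(extra, collapse = " + "), sep = " + "))

# Alternative: rebuild the formula from its term labels with reformulate()
base_terms <- attr(terms(base), "term.labels")
f_ref <- reformulate(c(base_terms, extra), response = "y")

print(f_str)
print(f_ref)
```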

In fact, you can even update the fitted model directly:

fit <- lm(base, data = df); update.default(fit, . ~ . + x6)

Updating the whole model worked best for me. In my case, calling update() was all I needed.

I wrote update.default and update.formula explicitly so that you know which functions to look up when you use ? to read the documentation.
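
For completeness, here is a small end-to-end sketch of the model-updating approach. The data frame is simulated purely for illustration (it is not the asker's data), and the names fit5, fit6 and fit7 are hypothetical.

```r
set.seed(1)
n <- 50

# Simulated predictors x1..x7 and a response y (illustrative data only)
df <- as.data.frame(matrix(rnorm(n * 7), ncol = 7,
                           dimnames = list(NULL, paste0("x", 1:7))))
df$y <- df$x1 + 0.5 * df$x6 + rnorm(n)

base <- y ~ x1 + x2 + x3 + x4 + x5

# Fit the base model once, then refit with one extra term each time;
# update() dispatches to update.default for lm objects
fit5 <- lm(base, data = df)
fit6 <- update(fit5, . ~ . + x6)
fit7 <- update(fit6, . ~ . + x7)

# Compare the nested fits
print(anova(fit5, fit6, fit7))
```

Because update() re-evaluates the original call, df must still be available in the calling environment when the model is refitted.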
