在 R 中拟合通用次数的多元多项式而无需编写显式公式
Fitting a multivariate polynomial of generic degree in R without having to write the explicit formula
我想对一些数据拟合任意次数和任意数量变量的多元多项式。变量的数量可能很多(例如 40)并且代码应该适用于不同数量的变量(例如 10、20、40 等),因此不可能明确地写出公式。对于 1 次多项式(即经典线性模型),解决方案很简单:假设我的数据在数据帧 df 中,那么
mymodel <- lm(y ~ ., data = df)
不幸的是,当多项式是任意次数时,我不知道类似的紧凑公式。你能帮帮我吗?
您可以使用此函数 makepoly
根据公式和数据框生成包含多项式项的公式。
makepoly <- function(form, data, degree = 1) {
mt <- terms(form, data = data)
tl <- attr(mt, "term.labels")
resp <- tl[attr(mt, "response")]
reformulate(paste0("poly(", tl, ", ", degree, ")"),
response = form[[2]])
}
一个测试数据集:
set.seed(1)
df <- data.frame(y = rnorm(10),
x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))
创建公式并运行回归:
form <- makepoly(y ~ ., df, degree = 2)
# y ~ poly(x1, 2) + poly(x2, 2) + poly(x3, 2)
lm(form, df)
#
# Call:
# lm(formula = form, data = df)
#
# Coefficients:
# (Intercept) poly(x1, 2)1 poly(x1, 2)2 poly(x2, 2)1
# 0.1322 0.1445 -5.5757 -5.2132
# poly(x2, 2)2 poly(x3, 2)1 poly(x3, 2)2
# 4.2297 0.7895 3.9796
这结合了我之前发布的两个选项(交互项和多项式项),在假设情况下列名看起来像 "X1"、"X2"、....、"X30".您将取出其中的 terms() 调用以证明它已成功:
terms( as.formula(
paste(" ~ (", paste0("X", 1:30 , collapse="+"), ")^2", "+",
paste( "poly(", paste0("X", 1:30), ", degree=2)",
collapse="+"),
collapse="")
) )
您可以使用 names(dfrm)[!names(dfrm) %in% "y"]
之类的表达式来代替内部 paste0
调用。
请注意,交互项是通过 R 公式过程在 (...)^2 机制中构建的,该机制不会创建平方项,而是所有两种方式的交互:
as.formula(
paste(" ~ (", paste0("X", 1:30 , collapse="+"), ")^2", "+", paste( "poly(", paste0("X", 1:30), ", degree=2)", collapse="+"), collapse="")
)
#----output----
~(X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + X11 + X12 +
X13 + X14 + X15 + X16 + X17 + X18 + X19 + X20 + X21 + X22 +
X23 + X24 + X25 + X26 + X27 + X28 + X29 + X30)^2 + poly(X1,
degree = 2) + poly(X2, degree = 2) +
poly(X3, degree = 2) +
poly(X4, degree = 2) + poly(X5, degree = 2) + poly(X6, degree = 2) +
poly(X7, degree = 2) + poly(X8, degree = 2) + poly(X9, degree = 2) +
poly(X10, degree = 2) + poly(X11, degree = 2) + poly(X12,
degree = 2) + poly(X13, degree = 2) + poly(X14, degree = 2) +
poly(X15, degree = 2) + poly(X16, degree = 2) + poly(X17,
degree = 2) + poly(X18, degree = 2) + poly(X19, degree = 2) +
poly(X20, degree = 2) + poly(X21, degree = 2) + poly(X22,
degree = 2) + poly(X23, degree = 2) + poly(X24, degree = 2) +
poly(X25, degree = 2) + poly(X26, degree = 2) + poly(X27,
degree = 2) + poly(X28, degree = 2) + poly(X29, degree = 2) +
poly(X30, degree = 2)
我想对一些数据拟合任意次数和任意数量变量的多元多项式。变量的数量可能很多(例如 40)并且代码应该适用于不同数量的变量(例如 10、20、40 等),因此不可能明确地写出公式。对于 1 次多项式(即经典线性模型),解决方案很简单:假设我的数据在数据帧 df 中,那么
mymodel <- lm(y ~ ., data = df)
不幸的是,当多项式是任意次数时,我不知道类似的紧凑公式。你能帮帮我吗?
您可以使用此函数 makepoly
根据公式和数据框生成包含多项式项的公式。
makepoly <- function(form, data, degree = 1) {
mt <- terms(form, data = data)
tl <- attr(mt, "term.labels")
resp <- tl[attr(mt, "response")]
reformulate(paste0("poly(", tl, ", ", degree, ")"),
response = form[[2]])
}
一个测试数据集:
set.seed(1)
df <- data.frame(y = rnorm(10),
x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))
创建公式并运行回归:
form <- makepoly(y ~ ., df, degree = 2)
# y ~ poly(x1, 2) + poly(x2, 2) + poly(x3, 2)
lm(form, df)
#
# Call:
# lm(formula = form, data = df)
#
# Coefficients:
# (Intercept) poly(x1, 2)1 poly(x1, 2)2 poly(x2, 2)1
# 0.1322 0.1445 -5.5757 -5.2132
# poly(x2, 2)2 poly(x3, 2)1 poly(x3, 2)2
# 4.2297 0.7895 3.9796
这结合了我之前发布的两个选项(交互项和多项式项),在假设情况下列名看起来像 "X1"、"X2"、....、"X30".您将取出其中的 terms() 调用以证明它已成功:
terms( as.formula(
paste(" ~ (", paste0("X", 1:30 , collapse="+"), ")^2", "+",
paste( "poly(", paste0("X", 1:30), ", degree=2)",
collapse="+"),
collapse="")
) )
您可以使用 names(dfrm)[!names(dfrm) %in% "y"]
之类的表达式来代替内部 paste0
调用。
请注意,交互项是通过 R 公式过程在 (...)^2 机制中构建的,该机制不会创建平方项,而是所有两种方式的交互:
as.formula(
paste(" ~ (", paste0("X", 1:30 , collapse="+"), ")^2", "+", paste( "poly(", paste0("X", 1:30), ", degree=2)", collapse="+"), collapse="")
)
#----output----
~(X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + X11 + X12 +
X13 + X14 + X15 + X16 + X17 + X18 + X19 + X20 + X21 + X22 +
X23 + X24 + X25 + X26 + X27 + X28 + X29 + X30)^2 + poly(X1,
degree = 2) + poly(X2, degree = 2) +
poly(X3, degree = 2) +
poly(X4, degree = 2) + poly(X5, degree = 2) + poly(X6, degree = 2) +
poly(X7, degree = 2) + poly(X8, degree = 2) + poly(X9, degree = 2) +
poly(X10, degree = 2) + poly(X11, degree = 2) + poly(X12,
degree = 2) + poly(X13, degree = 2) + poly(X14, degree = 2) +
poly(X15, degree = 2) + poly(X16, degree = 2) + poly(X17,
degree = 2) + poly(X18, degree = 2) + poly(X19, degree = 2) +
poly(X20, degree = 2) + poly(X21, degree = 2) + poly(X22,
degree = 2) + poly(X23, degree = 2) + poly(X24, degree = 2) +
poly(X25, degree = 2) + poly(X26, degree = 2) + poly(X27,
degree = 2) + poly(X28, degree = 2) + poly(X29, degree = 2) +
poly(X30, degree = 2)