在 R 中拟合通用次数的多元多项式而无需编写显式公式

Fitting a multivariate polynomial of generic degree in R without having to write the explicit formula

我想对一些数据拟合任意次数和任意数量变量的多元多项式。变量的数量可能很多(例如 40)并且代码应该适用于不同数量的变量(例如 10、20、40 等),因此不可能明确地写出公式。对于 1 次多项式(即经典线性模型),解决方案很简单:假设我的数据在数据帧 df 中,那么

mymodel <- lm(y ~ ., data = df)

不幸的是,当多项式是任意次数时,我不知道类似的紧凑公式。你能帮帮我吗?

您可以使用此函数 makepoly 根据公式和数据框生成包含多项式项的公式。

makepoly <- function(form, data, degree = 1) {
  mt <- terms(form, data = data)
  tl <- attr(mt, "term.labels")
  resp <- tl[attr(mt, "response")]
  reformulate(paste0("poly(", tl, ", ", degree, ")"), 
              response = form[[2]])
}

一个测试数据集:

set.seed(1)
df <- data.frame(y = rnorm(10), 
                 x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))

创建公式并运行回归:

form <- makepoly(y ~ ., df, degree = 2)
# y ~ poly(x1, 2) + poly(x2, 2) + poly(x3, 2)


lm(form, df)
# 
# Call:
#   lm(formula = form, data = df)
# 
# Coefficients:
#   (Intercept)  poly(x1, 2)1  poly(x1, 2)2  poly(x2, 2)1  
# 0.1322        0.1445       -5.5757       -5.2132  
# poly(x2, 2)2  poly(x3, 2)1  poly(x3, 2)2  
# 4.2297        0.7895        3.9796  

这结合了我之前发布的两个选项(交互项和多项式项),在假设情况下列名看起来像 "X1"、"X2"、....、"X30".您将取出其中的 terms() 调用以证明它已成功:

terms( as.formula( 
     paste(" ~ (", paste0("X", 1:30 , collapse="+"), ")^2", "+", 
            paste( "poly(", paste0("X", 1:30), ", degree=2)", 
                    collapse="+"), 
          collapse="")
      )         )

您可以使用 names(dfrm)[!names(dfrm) %in% "y"] 之类的表达式来代替内部 paste0 调用。

请注意,交互项是通过 R 公式过程在 (...)^2 机制中构建的,该机制不会创建平方项,而是所有两种方式的交互:

as.formula( 
        paste(" ~ (", paste0("X", 1:30 , collapse="+"), ")^2", "+", paste( "poly(", paste0("X", 1:30), ", degree=2)", collapse="+"), collapse="")
        ) 
#----output----
~(X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + X11 + X12 + 
    X13 + X14 + X15 + X16 + X17 + X18 + X19 + X20 + X21 + X22 + 
    X23 + X24 + X25 + X26 + X27 + X28 + X29 + X30)^2 + poly(X1, 
    degree = 2) + poly(X2, degree = 2) + 
    poly(X3, degree = 2) + 
    poly(X4, degree = 2) + poly(X5, degree = 2) + poly(X6, degree = 2) + 
    poly(X7, degree = 2) + poly(X8, degree = 2) + poly(X9, degree = 2) + 
    poly(X10, degree = 2) + poly(X11, degree = 2) + poly(X12, 
     degree = 2) + poly(X13, degree = 2) + poly(X14, degree = 2) + 
    poly(X15, degree = 2) + poly(X16, degree = 2) + poly(X17, 
     degree = 2) + poly(X18, degree = 2) + poly(X19, degree = 2) + 
    poly(X20, degree = 2) + poly(X21, degree = 2) + poly(X22, 
     degree = 2) + poly(X23, degree = 2) + poly(X24, degree = 2) + 
    poly(X25, degree = 2) + poly(X26, degree = 2) + poly(X27, 
     degree = 2) + poly(X28, degree = 2) + poly(X29, degree = 2) + 
    poly(X30, degree = 2)