Why is the lm function giving absurdly high results?

First, here is some reproducible code:

library(ggplot2)

y <- c(0, 0, 1, 2, 0, 0, 1, 3, 0, 0,
       3, 0, 6, 2, 8, 16, 21, 39, 48, 113,
       92, 93, 127, 159, 137, 46, 238, 132, 124, 185,
       171, 250, 250, 187, 119, 151, 292, 94, 281, 146,
       163, 104, 156, 272, 273, 212, 210, 135, 187, 208,
       310, 276, 235, 246, 190, 232, 254, 446, 314, 402,
       276, 279, 386, 402, 238, 581, 434, 159, 261, 356,
       440, 498, 495, 462, 306, 233, 396, 331, 418, 293,
       431, 300, 222, 222, 479, 501, 702, 790, 681)
x <- seq_along(y)

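Now I am trying to fit a degree-3 polynomial regression curve to this dataset. I want to know the model's coefficients, so I ran summary(lm(formula = y ~ poly(x, 3))). The result I got looks absurd.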

Call:
lm(formula = y ~ poly(x, 3))

Residuals:
     Min       1Q   Median       3Q      Max 
-253.696  -47.582   -9.709   44.314  271.183 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  223.978      9.703  23.083   <2e-16 ***
poly(x, 3)1 1420.644     91.538  15.520   <2e-16 ***
poly(x, 3)2   62.375     91.538   0.681    0.497    
poly(x, 3)3  130.161     91.538   1.422    0.159    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 91.54 on 85 degrees of freedom
Multiple R-squared:  0.7411,    Adjusted R-squared:  0.732 
F-statistic: 81.12 on 3 and 85 DF,  p-value: < 2.2e-16

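These coefficients are absurdly high for my model, and I am confused about why this output is returned.

Why does this happen? Where did I go wrong?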

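I think what you want is the raw polynomial coefficients. By default, poly() builds an orthogonal polynomial basis, so the estimates in your summary are coefficients on those orthogonal columns, not on x, x^2, and x^3. The fitted curve is the same either way; only the parameterization differs. To get coefficients on the raw powers of x, use: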

lm(y ~ poly(x, 3, raw = TRUE))
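
To illustrate the point, here is a minimal sketch comparing the two parameterizations. It uses hypothetical synthetic data (a noisy cubic trend generated with set.seed and rnorm) rather than the y from the question, and the names fit_orth, fit_raw, and same_fit are only illustrative. The coefficients differ between the two fits, but the fitted values should agree up to numerical tolerance.

```r
# Hypothetical data for illustration only (not the question's y):
# a noisy cubic trend over 89 points.
set.seed(1)
x <- 1:89
y <- 0.001 * x^3 + 2 * x + rnorm(length(x), sd = 50)

# Default: orthogonal polynomial basis
fit_orth <- lm(y ~ poly(x, 3))
# Raw powers: x, x^2, x^3
fit_raw <- lm(y ~ poly(x, 3, raw = TRUE))

# The coefficients differ because the basis columns differ
print(coef(fit_orth))
print(coef(fit_raw))

# Both fits describe the same curve, so the fitted values should match
same_fit <- isTRUE(all.equal(unname(fitted(fit_orth)),
                             unname(fitted(fit_raw))))
print(same_fit)
```

If you only need predictions or a plotted curve, either form works. The raw form is the one to use when you want to read off the coefficients of the cubic in x directly.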

Hope this helps!