Prediction using a linear model and the importance of data.frame

Question

I am writing to ask why we need to wrap the new values in data.frame() when predicting with lm.

The first chunk of code is supposed to be wrong and the second chunk is supposed to be correct.

dim(iris)
model_1<-lm(Sepal.Length~Sepal.Width, data=iris)
summary(model_1)
print(predict(model_1, Sepal.Width=c(1,3,4,5)))


dim(iris)
model_1<-lm(Sepal.Length~Sepal.Width, data=iris)
summary(model_1)
print(predict(model_1,data.frame(Sepal.Width=c(1,3,4,5))))

Answer 1

When you call predict on an lm object, the function that is actually called is predict.lm. When you run it like this:

predict(model_1, Sepal.Width=c(1,3,4,5))

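What you are doing is passing c(1,3,4,5) as an argument named Sepal.Width. predict.lm ignores it because the function has no argument of that name; it is silently swallowed by ... instead.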
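One way to check this for yourself is to look at the formal arguments of the method. This is only a small sketch; the variable names pred_fun and arg_names are made up for the illustration.

```r
# Fetch the S3 method that predict() dispatches to for lm objects
pred_fun <- getS3method("predict", "lm")
arg_names <- names(formals(pred_fun))

"newdata" %in% arg_names      # the argument that carries new predictor values
"Sepal.Width" %in% arg_names  # not a formal argument of predict.lm
"..." %in% arg_names          # unmatched named arguments end up here
```

Because Sepal.Width matches no formal argument, it lands in ... and newdata stays missing.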

With no new input data supplied, you are effectively running predict.lm(model_1) and getting back the fitted values:

table(predict(model_1) == predict(model_1, Sepal.Width=c(1,3,4,5)))

TRUE 
 150
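The same check can be written a little more directly. The sketch below refits the model from the question; no_newdata and ignored_arg are illustrative names.

```r
model_1 <- lm(Sepal.Length ~ Sepal.Width, data = iris)

no_newdata  <- predict(model_1)
ignored_arg <- predict(model_1, Sepal.Width = c(1, 3, 4, 5))

# One value per training row comes back, not one per supplied value
length(ignored_arg)

# Both calls give the same result, which matches the fitted values
isTRUE(all.equal(no_newdata, ignored_arg))
isTRUE(all.equal(no_newdata, fitted(model_1)))
```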

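In this case you fitted the model with a formula, so predict.lm needs a data frame from which to rebuild the matrix of independent (exogenous) variables. That matrix is multiplied by the coefficients, and the result is returned as the predicted values.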

Here is a brief illustration of what predict.lm does:

newdata = data.frame(Sepal.Width=c(1,3,4,5))
Terms = delete.response(terms(model_1))
X = model.matrix(Terms,newdata)

X
  (Intercept) Sepal.Width
1           1           1
2           1           3
3           1           4
4           1           5

X %*% coefficients(model_1)
      [,1]
1 6.302861
2 5.856139
3 5.632778
4 5.409417

predict(model_1,newdata)

       1        2        3        4 
6.302861 5.856139 5.632778 5.409417
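Since the design matrix is rebuilt from the formula, the column name in the new data frame has to match the predictor name. A rough sketch follows; the column name width is a deliberately wrong, made-up name, and the example assumes no object called Sepal.Width exists in your workspace.

```r
model_1 <- lm(Sepal.Length ~ Sepal.Width, data = iris)

# Column name matches the predictor in the formula: four predictions come back
good <- predict(model_1, data.frame(Sepal.Width = c(1, 3, 4, 5)))
length(good)

# Column name does not match: Sepal.Width cannot be found when the
# model frame is rebuilt, so predict() stops with an error
bad <- try(predict(model_1, data.frame(width = c(1, 3, 4, 5))), silent = TRUE)
inherits(bad, "try-error")
```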
