The difference between two lm() functions and their outputs
Can anyone tell me the difference between the following two lm() calls and their outputs?
lm(cars$dist ~ cars$speed)
lm(dist ~ speed, data = cars)
There is no real difference as far as the fitted model is concerned:
> data(cars)
> m1 <- lm(cars$dist ~ cars$speed)
> m2 <- lm(dist ~ speed, data = cars)
> all.equal(m1, m2)
[1] "Component “coefficients”: Names: 1 string mismatch"
[2] "Component “effects”: Names: 1 string mismatch"
[3] "Component “qr”: Component “qr”: Attributes: < Component “dimnames”: Component 2: 1 string mismatch >"
[4] "Component “call”: target, current do not match when deparsed"
[5] "Component “terms”: formulas differ in contents"
[6] "Component “model”: Names: 2 string mismatches"
[7] "Component “model”: Attributes: < Component “terms”: formulas differ in contents >"
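These differences all come from the names derived for the variables in the model: the first call labels them cars$dist and cars$speed, the second dist and speed.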
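To check that only the labels differ, one can drop the names and compare the estimates directly. This is a minimal sketch in base R that refits m1 and m2 as above so it runs on its own:

```r
# Fit the same model with both calling styles
m1 <- lm(cars$dist ~ cars$speed)
m2 <- lm(dist ~ speed, data = cars)

# The coefficient labels are taken from the formula text as written
names(coef(m1))  # "(Intercept)" "cars$speed"
names(coef(m2))  # "(Intercept)" "speed"

# With the names stripped, the estimates agree
all.equal(unname(coef(m1)), unname(coef(m2)))  # TRUE
```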
However, the second form is far more useful. For example, predicting from the first model is a real headache:
df <- with(cars, data.frame(speed = c(30, 40)))
predict(m1, newdata = df)
predict(m2, newdata = df)
> predict(m1, newdata = df)
1 2 3 4 5 6 7 8
-1.849460 -1.849460 9.947766 9.947766 13.880175 17.812584 21.744993 21.744993
9 10 11 12 13 14 15 16
21.744993 25.677401 25.677401 29.609810 29.609810 29.609810 29.609810 33.542219
17 18 19 20 21 22 23 24
33.542219 33.542219 33.542219 37.474628 37.474628 37.474628 37.474628 41.407036
25 26 27 28 29 30 31 32
41.407036 41.407036 45.339445 45.339445 49.271854 49.271854 49.271854 53.204263
33 34 35 36 37 38 39 40
53.204263 53.204263 53.204263 57.136672 57.136672 57.136672 61.069080 61.069080
41 42 43 44 45 46 47 48
61.069080 61.069080 61.069080 68.933898 72.866307 76.798715 76.798715 76.798715
49 50
76.798715 80.731124
Warning message:
'newdata' had 2 rows but variables found have 50 rows
> predict(m2, newdata = df)
1 2
100.3932 139.7173
The second version is correct, and it isn't easy to get a suitable data frame to predict from the model fitted as per m1.
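As far as I understand it, the reason is that the formula in m1 contains the expression cars$speed, so when predict() rebuilds the model frame it finds the original cars data through the formula's environment and never looks at the speed column in newdata. The sketch below, base R only, checks that the 50 values are simply the fitted values, while m2 really does use the new speeds. The names new_speeds, p1, p2 and manual are just for illustration:

```r
m1 <- lm(cars$dist ~ cars$speed)
m2 <- lm(dist ~ speed, data = cars)
new_speeds <- data.frame(speed = c(30, 40))

# m1 ignores newdata (hence the warning) and returns one value per original row
p1 <- suppressWarnings(predict(m1, newdata = new_speeds))
length(p1)
all.equal(unname(p1), unname(fitted(m1)))

# m2 picks up the new speeds; compare with intercept + slope * speed by hand
p2 <- predict(m2, newdata = new_speeds)
manual <- coef(m2)[["(Intercept)"]] + coef(m2)[["speed"]] * new_speeds$speed
all.equal(unname(p2), manual)
```

Both all.equal() comparisons should come back TRUE, which confirms that m1 is silently returning in-sample fits rather than predictions at 30 and 40.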
Do yourself a favour and use the second form, with the data argument.