How to create a prediction line for a quadratic model
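I am trying to create a quadratic prediction line for a quadratic model. I am using the Auto dataset that comes with R. I can create the prediction line for the linear model without any trouble. However, the quadratic model produces strange-looking lines. Here is my code.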
# Linear Model
plot(Auto$horsepower, Auto$mpg,
main = "MPG versus Horsepower",
pch = 20)
lin_mod = lm(mpg ~ horsepower,
data = Auto)
lin_pred = predict(lin_mod)
lines(
Auto$horsepower, lin_pred,
col = "blue", lwd = 2
)
# The Quadratic model
Auto$horsepower2 = Auto$horsepower^2
quad_model = lm(mpg ~ horsepower2,
data = Auto)
quad_pred = predict(quad_model)
lines(
Auto$horsepower,
quad_pred,
col = "red", lwd = 2
)
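I am 99% sure the problem is with the predict function. Why can't I produce a neat-looking quadratic prediction curve? The following code, which I also tried, does not work. Could that be related?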
quad_pred = predict(quad_model, data.frame(horsepower = Auto$horsepower))
Thanks!
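The problem is that the x-axis values are not sorted. This does not matter for a linear model, but it becomes obvious for a polynomial one. I created a new, sorted dataset and it works fine: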
library(ISLR) # To load data Auto
# Linear Model
plot(Auto$horsepower, Auto$mpg,
main = "MPG versus Horsepower",
pch = 20)
lin_mod = lm(mpg ~ horsepower,
data = Auto)
lin_pred = predict(lin_mod)
lines(
Auto$horsepower, lin_pred,
col = "blue", lwd = 2
)
# The Quadratic model
Auto$horsepower2 = Auto$horsepower^2
# Sorting Auto by horsepower2
Auto2 <- Auto[order(Auto$horsepower2), ]
quad_model = lm(mpg ~ horsepower2,
data = Auto2)
quad_pred = predict(quad_model)
lines(
Auto2$horsepower,
quad_pred,
col = "red", lwd = 2
)
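To see why the ordering matters: lines() joins the points in the order they are given, so an unsorted x vector makes the curve zigzag back and forth across the plot. The following is a minimal sketch of the same fix without refitting on a sorted copy of the data. It uses synthetic data in place of Auto, and the variable names (x, y, fit, ord) are illustrative assumptions, not part of the original code.

```r
# Synthetic, unsorted data standing in for horsepower / mpg (assumption)
set.seed(1)
x <- runif(50, min = 50, max = 200)
y <- 60 - 0.4 * x + 0.001 * x^2 + rnorm(50)

# Quadratic fit; predict() returns fitted values in the same row order as x
fit <- lm(y ~ x + I(x^2))
pred <- predict(fit)

plot(x, y, pch = 20)

# Sort both vectors by x before drawing, so the segments run left to right
ord <- order(x)
lines(x[ord], pred[ord], col = "red", lwd = 2)
```

Reordering only the plotted vectors leaves the fitted model unchanged, since the order of the rows does not affect the least-squares fit.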
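One option is to create the sequence of x values for which you want to plot the fitted lines. This is useful if your data has a "gap", or if you want to plot the fitted lines beyond the range of the x variable.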
# load dataset; if necessary run install.packages("ISLR")
data(Auto, package = "ISLR")
# since only 2 variables at issue, use short names
mpg <- Auto$mpg
hp <- Auto$horsepower
# fit linear and quadratic models
lmod <- lm(mpg ~ hp)
qmod <- lm(mpg ~ hp + I(hp^2))
# plot the data
plot(x=hp, y=mpg, pch=20)
# use predict() to find coordinates of points to plot
x_coords <- seq(from=floor(min(hp)), to=ceiling(max(hp)), by=1)
y_coords_lmod <- predict(lmod, newdata=data.frame(hp=x_coords))
y_coords_qmod <- predict(qmod, newdata=data.frame(hp=x_coords))
# alternatively, calculate this manually using the fitted coefficients
y_coords_lmod <- coef(lmod)[1] + coef(lmod)[2]*x_coords
y_coords_qmod <- coef(qmod)[1] + coef(qmod)[2]*x_coords + coef(qmod)[3]*x_coords^2
# add the fitted lines to the plot
points(x=x_coords, y=y_coords_lmod, type="l", col="blue")
points(x=x_coords, y=y_coords_qmod, type="l", col="red")
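One detail worth spelling out when using newdata: its column names have to match the variable names in the model formula. That is probably why the predict() call in the question failed, since that model was fitted on horsepower2 but the new data frame only contained horsepower. The sketch below illustrates this on synthetic data; dat, hp2, mod_sq, mod_I and grid are hypothetical names chosen for the example.

```r
# Synthetic data standing in for Auto (assumption)
set.seed(42)
dat <- data.frame(hp = runif(40, min = 50, max = 200))
dat$mpg <- 55 - 0.35 * dat$hp + 0.001 * dat$hp^2 + rnorm(40)
dat$hp2 <- dat$hp^2

grid <- seq(50, 200, by = 10)

# Model fitted on a precomputed squared column, as in the question
mod_sq <- lm(mpg ~ hp2, data = dat)

# newdata without an hp2 column: predict() cannot find the variable
res <- try(predict(mod_sq, newdata = data.frame(hp = grid)), silent = TRUE)
failed <- inherits(res, "try-error")

# newdata with the column the formula actually uses
pred_sq <- predict(mod_sq, newdata = data.frame(hp2 = grid^2))

# With I(hp^2) in the formula, the square is computed inside the model,
# so newdata only needs hp
mod_I <- lm(mpg ~ hp + I(hp^2), data = dat)
pred_I <- predict(mod_I, newdata = data.frame(hp = grid))
```

Writing the square as I(hp^2) inside the formula avoids having to keep a separate squared column in sync with the new data.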
Alternatively, using ggplot2:
library(ggplot2)
data(Auto, package = "ISLR")
ggplot(Auto, aes(x = horsepower, y = mpg)) + geom_point() +
stat_smooth(aes(x = horsepower, y = mpg), method = "lm", formula = y ~ x, colour = "red") +
stat_smooth(aes(x = horsepower, y = mpg), method = "lm", formula = y ~ poly(x, 2), colour = "blue")
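Note that poly(x, 2) uses orthogonal polynomials by default, so its coefficients differ from those of y ~ x + I(x^2), but the fitted curve should be the same. Here is a small check of that on synthetic data (the data and variable names are assumptions made for illustration):

```r
# Synthetic data standing in for horsepower / mpg (assumption)
set.seed(7)
x <- runif(60, min = 50, max = 200)
y <- 55 - 0.35 * x + 0.001 * x^2 + rnorm(60)

# Same quadratic model, two parameterisations
fit_poly <- lm(y ~ poly(x, 2))   # orthogonal polynomial basis
fit_raw <- lm(y ~ x + I(x^2))    # raw powers of x

# Compare fitted values and coefficients between the two fits
same_fit <- isTRUE(all.equal(unname(fitted(fit_poly)), unname(fitted(fit_raw))))
same_coef <- isTRUE(all.equal(unname(coef(fit_poly)), unname(coef(fit_raw))))
```

So either formula can be passed to stat_smooth() if only the plotted curve matters; the choice affects how the coefficients are interpreted, not the line that is drawn.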