生成漂亮的线性回归图(拟合线、置信度/预测带等)
Produce nice linear regression plot (fitted line, confidence / prediction bands, etc)
我有这个样本未来 10 年的回归。
date<-as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value<-c(16348, 14136, 12733, 10737)
#fit linear regression
model<-lm(value~date)
#build predict dataframe
dfuture<-data.frame(date=seq(as.Date("2016-12-31"), by="1 year", length.out = 10))
#predict the futurne
predict(model, dfuture, interval = "prediction")
我如何为此添加置信区间?
下面的代码将为您生成好看的回归图。我对代码的评论应该清楚地解释一切。该代码将使用 value
、model
作为您的问题。
## all date you are interested in, 4 years with observations, 10 years for prediction
all_date <- seq(as.Date("2012-12-31"), by="1 year", length.out = 14)
## compute confidence bands (for all data)
pred.c <- predict(model, data.frame(date=all_date), interval="confidence")
## compute prediction bands (for new data only)
pred.p <- predict(model, data.frame(date=all_date[5:14]), interval="prediction")
## set up regression plot (plot nothing here; only set up range, axis)
ylim <- range(range(pred.c[,-1]), range(pred.p[,-1]))
plot(1:nrow(pred.c), numeric(nrow(pred.c)), col = "white", ylim = ylim,
xaxt = "n", xlab = "Date", ylab = "prediction",
main = "Regression Plot")
axis(1, at = 1:nrow(pred.c), labels = all_date)
## shade 95%-level confidence region
polygon(c(1:nrow(pred.c),nrow(pred.c):1), c(pred.c[, 2], rev(pred.c[, 3])),
col = "grey", border = NA)
## plot fitted values / lines
lines(1:nrow(pred.c), pred.c[, 1], lwd = 2, col = 4)
## add 95%-level confidence bands
lines(1:nrow(pred.c), pred.c[, 2], col = 2, lty = 2, lwd = 2)
lines(1:nrow(pred.c), pred.c[, 3], col = 2, lty = 2, lwd = 2)
## add 95%-level prediction bands
lines(4 + 1:nrow(pred.p), pred.p[, 2], col = 3, lty = 3, lwd = 2)
lines(4 + 1:nrow(pred.p), pred.p[, 3], col = 3, lty = 3, lwd = 2)
## add original observations on the plot
points(1:4, rev(value), pch = 20)
## finally, we add legend
legend(x = "topleft", legend = c("Obs", "Fitted", "95%-CI", "95%-PI"),
pch = c(20, NA, NA, NA), lty = c(NA, 1, 2, 3), col = c(1, 4, 2, 3),
text.col = c(1, 4, 2, 3), bty = "n")
JPEG 由代码生成:
jpeg("regression.jpeg", height = 500, width = 600, quality = 100)
## the above code
dev.off()
## check your working directory for this JPEG
## use code getwd() to see this director if you don't know
从剧情可以看出,
- 当您尝试根据观察到的数据进行预测时,置信区间会变宽;
- 预测区间比置信区间宽。
如果您想详细了解 predict.lm()
如何在内部计算置信区间/预测区间,请阅读 ,以及我的回答。
感谢Alex对visreg
包简单使用的演示;但我还是更喜欢使用 R base.
您可以简单地使用 visreg::visreg
library(visreg)
visreg(model)
如果您对这些值感兴趣:
> head(visreg(model)$fit)
date value visregFit visregLwr visregUpr
1 2012-12-31 13434.5 10753.10 9909.073 11597.13
2 2013-01-10 13434.5 10807.81 9974.593 11641.02
3 2013-01-21 13434.5 10862.52 10040.033 11685.00
4 2013-02-01 13434.5 10917.22 10105.389 11729.06
5 2013-02-12 13434.5 10971.93 10170.658 11773.21
6 2013-02-23 13434.5 11026.64 10235.837 11817.44
我有这个样本未来 10 年的回归。
date<-as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value<-c(16348, 14136, 12733, 10737)
#fit linear regression
model<-lm(value~date)
#build predict dataframe
dfuture<-data.frame(date=seq(as.Date("2016-12-31"), by="1 year", length.out = 10))
#predict the futurne
predict(model, dfuture, interval = "prediction")
我如何为此添加置信区间?
下面的代码将为您生成好看的回归图。我对代码的评论应该清楚地解释一切。该代码将使用 value
、model
作为您的问题。
## all date you are interested in, 4 years with observations, 10 years for prediction
all_date <- seq(as.Date("2012-12-31"), by="1 year", length.out = 14)
## compute confidence bands (for all data)
pred.c <- predict(model, data.frame(date=all_date), interval="confidence")
## compute prediction bands (for new data only)
pred.p <- predict(model, data.frame(date=all_date[5:14]), interval="prediction")
## set up regression plot (plot nothing here; only set up range, axis)
ylim <- range(range(pred.c[,-1]), range(pred.p[,-1]))
plot(1:nrow(pred.c), numeric(nrow(pred.c)), col = "white", ylim = ylim,
xaxt = "n", xlab = "Date", ylab = "prediction",
main = "Regression Plot")
axis(1, at = 1:nrow(pred.c), labels = all_date)
## shade 95%-level confidence region
polygon(c(1:nrow(pred.c),nrow(pred.c):1), c(pred.c[, 2], rev(pred.c[, 3])),
col = "grey", border = NA)
## plot fitted values / lines
lines(1:nrow(pred.c), pred.c[, 1], lwd = 2, col = 4)
## add 95%-level confidence bands
lines(1:nrow(pred.c), pred.c[, 2], col = 2, lty = 2, lwd = 2)
lines(1:nrow(pred.c), pred.c[, 3], col = 2, lty = 2, lwd = 2)
## add 95%-level prediction bands
lines(4 + 1:nrow(pred.p), pred.p[, 2], col = 3, lty = 3, lwd = 2)
lines(4 + 1:nrow(pred.p), pred.p[, 3], col = 3, lty = 3, lwd = 2)
## add original observations on the plot
points(1:4, rev(value), pch = 20)
## finally, we add legend
legend(x = "topleft", legend = c("Obs", "Fitted", "95%-CI", "95%-PI"),
pch = c(20, NA, NA, NA), lty = c(NA, 1, 2, 3), col = c(1, 4, 2, 3),
text.col = c(1, 4, 2, 3), bty = "n")
JPEG 由代码生成:
jpeg("regression.jpeg", height = 500, width = 600, quality = 100)
## the above code
dev.off()
## check your working directory for this JPEG
## use code getwd() to see this director if you don't know
从剧情可以看出,
- 当您尝试根据观察到的数据进行预测时,置信区间会变宽;
- 预测区间比置信区间宽。
如果您想详细了解 predict.lm()
如何在内部计算置信区间/预测区间,请阅读
感谢Alex对visreg
包简单使用的演示;但我还是更喜欢使用 R base.
您可以简单地使用 visreg::visreg
library(visreg)
visreg(model)
如果您对这些值感兴趣:
> head(visreg(model)$fit)
date value visregFit visregLwr visregUpr
1 2012-12-31 13434.5 10753.10 9909.073 11597.13
2 2013-01-10 13434.5 10807.81 9974.593 11641.02
3 2013-01-21 13434.5 10862.52 10040.033 11685.00
4 2013-02-01 13434.5 10917.22 10105.389 11729.06
5 2013-02-12 13434.5 10971.93 10170.658 11773.21
6 2013-02-23 13434.5 11026.64 10235.837 11817.44