ggplot2: Plotting regression lines with different intercepts but with the same slope
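I want to plot regression lines with different intercepts but the same slope.
With the following ggplot2 code, I can plot regression lines with different intercepts and different slopes, but I can't figure out how to draw regression lines with different intercepts but the same slope.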
library(ggplot2)
# Example data (defined first so that the plot call below can find df3)
Consumption <- c(51, 52, 53, 54, 56, 57, 55, 56, 58, 59, 62, 63)
Gender <- gl(n = 2, k = 6, length = 2*6, labels = c("Male", "Female"), ordered = FALSE)
Income <- rep(x=c(80, 90, 100), each=2)
df3 <- data.frame(Consumption, Gender, Income)
df3
# Separate lm fit per Gender: different intercepts AND different slopes
ggplot(data=df3, mapping=aes(x=Income, y=Consumption, color=Gender)) + geom_point() +
  geom_smooth(method = "lm", se=FALSE)
# Regression with same slope but different intercepts for each Gender
fm1 <- lm(formula=Consumption~Gender+Income, data=df3)
summary(fm1)
Call:
lm(formula = Consumption ~ Gender + Income, data = df3)

Residuals:
    Min      1Q  Median      3Q     Max
-0.8333 -0.8333  0.1667  0.1667  1.1667

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  26.83333    2.54557   10.54 2.30e-06 ***
GenderFemale  5.00000    0.45812   10.91 1.72e-06 ***
Income        0.30000    0.02805   10.69 2.04e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7935 on 9 degrees of freedom
Multiple R-squared:  0.9629,  Adjusted R-squared:  0.9546
F-statistic: 116.7 on 2 and 9 DF,  p-value: 3.657e-07
Why not compute the regression outside of ggplot, using the results of lm:
# Regression with same slope but different intercepts for each Gender
fm1 <- lm(formula=Consumption~Gender+Income, data=df3)
df3 = cbind(df3, pred = predict(fm1))
ggplot(data=df3, mapping=aes(x=Income, y=Consumption, color=Gender)) + geom_point() +
geom_line(mapping=aes(y=pred))
This yields lines with the same slope and different intercepts.
Technically, as you can see in the model, there are not two different intercepts, but rather an additional offset for the dummy variable GenderFemale.
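To make the offset concrete, here is a minimal sketch that derives the implied intercept for each group from the fitted coefficients and draws the two parallel lines with geom_abline. The names b, lines_df and p are only illustrative and do not appear in the original post; the data is repeated so that the snippet runs on its own.

```r
library(ggplot2)

# Data from the question, repeated so the example is self-contained
Consumption <- c(51, 52, 53, 54, 56, 57, 55, 56, 58, 59, 62, 63)
Gender <- gl(n = 2, k = 6, labels = c("Male", "Female"))
Income <- rep(c(80, 90, 100), each = 2)
df3 <- data.frame(Consumption, Gender, Income)

# Common-slope model: one Income slope, plus an offset for Female
fm1 <- lm(Consumption ~ Gender + Income, data = df3)
b <- coef(fm1)

# Implied line per group (lines_df is an illustrative helper):
#   Male:   (Intercept)
#   Female: (Intercept) + GenderFemale
lines_df <- data.frame(
  Gender    = factor(c("Male", "Female"), levels = levels(df3$Gender)),
  intercept = unname(c(b["(Intercept)"], b["(Intercept)"] + b["GenderFemale"])),
  slope     = unname(b["Income"])
)

# geom_abline does not inherit the plot aesthetics, so color is mapped explicitly
p <- ggplot(df3, aes(x = Income, y = Consumption, color = Gender)) +
  geom_point() +
  geom_abline(data = lines_df,
              mapping = aes(intercept = intercept, slope = slope, color = Gender))
p
```

Unlike geom_line on the predicted values, geom_abline extends the lines across the whole panel rather than only over the observed range of Income, which may or may not be what you want.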
Edit: now uses predict to simplify, thanks to @aosmith's suggestion.