Linear Regression on Specific Points

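How can I draw 4 linear regression lines, one for each group of points that share a color, with each line drawn in the same color as its points?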

Current plot:

Note:

y_tilde = number of cigarettes smoked per day

x_1 = father's age

x_2 = years of education

Code:

print(ggplot(data=df, mapping=aes(y=y_tilde, x=x_1, col=x_2)) 
  + xlab("Father's Age")
  + ylab("Cigarettes Smoked Per Day")
  + labs(color="Years of Education") 
  + geom_point(aes(colour = cut(x_2, breaks=c(-Inf, 10, 12, 14, 16), labels=c(10, 12, 14, 16)))))

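Start by reading the ?aes help page. The ggplot(data=df, mapping=aes(y=y_tilde, x=x_1, col=x_2)) call does not need the col argument in aes, because you are already setting it separately for each geom.

The examples use R's built-in airquality data set, with Month converted to a factor that serves as the grouping variable:
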
aa <- airquality
aa$cut <- factor(aa$Month)

Example with the sample data and a single global aes:

print(ggplot(data = aa, mapping = aes(y = Ozone, x = Temp, col = cut)) +
      xlab("Father's Age") +
      ylab("Cigarettes Smoked Per Day") +
      labs(color = "Years of Education") +
      geom_point() +
      geom_smooth(method = "lm"))

Example with the sample data and one aes per geom:

print(ggplot(data = aa, mapping = aes(y = Ozone, x = Temp)) +
        xlab("Father's Age") +
        ylab("Cigarettes Smoked Per Day") +
        labs(color = "Years of Education") +
        geom_point(aes(col = cut)) +
        geom_smooth(method = "lm", aes(col = cut)))

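The second point, raised by @BenBolker, is to follow the DRY rule (https://pl.wikipedia.org/wiki/DRY): define the cut variable once, before the plot definition, instead of repeating the cut() call inside each geom. The other solution is to map the cut variable in the global aes; even then I would still define it as a separate variable first, because that makes the code clearer.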
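Applied to the variables from the question, the advice above might look like the following sketch. The question does not show the real df, so the data frame here is simulated purely for illustration, and the column name edu is a hypothetical choice, not something from the original post.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's df (the real data is not shown).
set.seed(1)
df <- data.frame(
  x_1 = runif(200, 40, 80),                              # father's age
  x_2 = sample(c(10, 12, 14, 16), 200, replace = TRUE)   # years of education
)
df$y_tilde <- 30 - 1.2 * df$x_2 + 0.05 * df$x_1 + rnorm(200, sd = 3)

# Define the grouping factor once, before the plot (DRY).
# "edu" is an assumed name chosen for this sketch.
df$edu <- cut(df$x_2, breaks = c(-Inf, 10, 12, 14, 16),
              labels = c(10, 12, 14, 16))

# The global colour mapping is inherited by both geoms, so each
# education group gets its own points and its own regression line.
p <- ggplot(df, aes(x = x_1, y = y_tilde, colour = edu)) +
  xlab("Father's Age") +
  ylab("Cigarettes Smoked Per Day") +
  labs(colour = "Years of Education") +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)
print(p)
```

Because the factor has four levels, geom_smooth fits four separate lines, one per color.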

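If you also want the regression interval (the confidence band that geom_smooth draws around each line) to follow the group colors, add fill = cut to the aes of geom_smooth. I also updated labs here so that the color and fill legends share one title.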

print(ggplot(data = aa, mapping = aes(y = Ozone, x = Temp)) +
        xlab("Father's Age") +
        ylab("Cigarettes Smoked Per Day") +
        labs(color = "Years of Education", fill = "Years of Education") +
        geom_point(aes(col = cut)) +
        geom_smooth(method = "lm", aes(col = cut, fill = cut)))
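To check the plotted lines numerically, one option is to fit the same per-group regressions by hand. This is only a sketch on the airquality example data: it assumes that geom_smooth(method = "lm") with the default formula y ~ x fits one simple linear regression per color group, and the names fits and coefs are illustrative.

```r
aa <- airquality
aa$cut <- factor(aa$Month)

# Fit Ozone ~ Temp separately for each level of cut.
# lm() drops rows with missing Ozone by default.
fits <- lapply(split(aa, aa$cut),
               function(d) coef(lm(Ozone ~ Temp, data = d)))

# One row per group, with columns for the intercept and the Temp slope.
coefs <- do.call(rbind, fits)
print(coefs)
```

Each row gives the intercept and slope for one group, which can be compared against the corresponding line in the plot.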