Plotting OLS fits for categorical variables in R

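I am trying to produce a plot with age on the x-axis and expected serum urate on the y-axis, with separate lines for male/white, female/white, male/black, and female/black, using the estimates from the lm() function.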

library(ggplot2)

goutdata <- read.table("gout.txt", header = TRUE)
goutdata$sex <- factor(goutdata$sex, levels = c("M", "F"))
goutdata$race <- as.factor(goutdata$race)

fm <- lm(su ~ sex + race + age, data = goutdata)
summary(fm)
ggplot(goutdata, aes(x = age, y = su)) +
  xlim(30, 70) +
  geom_jitter(aes(colour = age)) +
  facet_grid(sex ~ race)

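I have tried using facet_wrap() with ggplot to handle the categorical variables, but I want to create just one plot. I am experimenting with a combination of geom_jitter and geom_smooth, but I am not sure how to use geom_smooth() with categorical variables. Any help would be greatly appreciated.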

Data: https://github.com/gdlc/STT465/blob/master/gout.txt

Perhaps you could use geom_smooth() to show the regression lines?

library(tidyverse)

dat <- read.table("https://raw.githubusercontent.com/gdlc/STT465/master/gout.txt",
                  header = TRUE, stringsAsFactors = FALSE)

dat %>%
  # Relabel the codes and build one combined race/sex grouping variable
  dplyr::mutate(sex = ifelse(sex == "M", "Male", "Female"),
                race = ifelse(race == "W", "Caucasian", "African-American"),
                group = paste(race, sex, sep = ", ")) %>%
  ggplot(aes(x = age, y = su, colour = group)) +
  # One OLS line per group, fitted separately within each panel
  geom_smooth(method = "lm", se = FALSE, show.legend = FALSE) +
  geom_point(show.legend = FALSE, position = "jitter", alpha = .5, pch = 16) +
  facet_wrap(~group) +
  ggthemes::theme_few() +
  labs(x = "Age", y = "Expected serum urate level")

We can use interaction() to create the groups on the fly and run the OLS fit inside geom_smooth(). Here, the groups are drawn together on a single plot:

ggplot(goutdata, aes(age, su, color = interaction(sex, race))) +
  geom_smooth(formula = y~x, method="lm") +
  geom_point() +
  hrbrthemes::theme_ipsum_rc(grid="XY")

Then, split out into facets:

ggplot(goutdata, aes(age, su, color = interaction(sex, race))) +
  geom_smooth(formula = y~x, method="lm") +
  geom_point() +
  facet_wrap(sex~race) +
  hrbrthemes::theme_ipsum_rc(grid="XY")
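
One caveat: geom_smooth(method = "lm") fits a separate regression within each group, so the lines above are not the estimates from the additive model fm <- lm(su ~ sex + race + age) in the question. A minimal sketch of one way to draw the lines from that model itself is to predict over a grid built with expand.grid() and plot the result with geom_line(). The data frame below is simulated stand-in data with the same column names as the question (sex, race, age, su), used only so the snippet runs on its own; the names grid and su_hat are illustrative, and the real data would be read with read.table() as shown earlier.

```r
library(ggplot2)

# Stand-in data with the columns used in the question (sex, race, age, su).
# These values are simulated, NOT the real gout data; swap in read.table().
set.seed(1)
n <- 200
goutdata <- data.frame(
  sex  = factor(sample(c("M", "F"), n, replace = TRUE), levels = c("M", "F")),
  race = factor(sample(c("B", "W"), n, replace = TRUE)),
  age  = sample(30:70, n, replace = TRUE)
)
goutdata$su <- 5 + 0.02 * goutdata$age - (goutdata$sex == "F") + rnorm(n)

# Additive model from the question: common age slope, shifted by sex and race
fm <- lm(su ~ sex + race + age, data = goutdata)

# One row per sex/race/age combination, then the model's expected value
grid <- expand.grid(
  sex  = levels(goutdata$sex),
  race = levels(goutdata$race),
  age  = seq(30, 70, by = 1)
)
grid$su_hat <- predict(fm, newdata = grid)

# Four lines on a single plot, one per sex/race combination
p <- ggplot(grid, aes(age, su_hat, colour = interaction(sex, race, sep = "/"))) +
  geom_line() +
  labs(x = "Age", y = "Expected serum urate", colour = "Sex/Race")
print(p)
```

Because the model has no interaction terms, the four lines share one slope and differ only in intercept. If the jittered observations are wanted on the same plot, a geom_jitter() layer with data = goutdata and aes(y = su) could be added to p.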

You have now partially answered question 1 of https://github.com/gdlc/STT465/blob/master/HW_4_OLS.md :-)