How to add legends manually for geom_point in ggplot in R? - two scales for the same aesthetic

I am trying to add legends for several geom_point layers. However, although I have three geom_point layers, a legend appears for only one variable ("Outcome").

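In addition to the "Outcome" variable, I would also like to show legend entries for the two diamonds: the blue diamond ("TStartTime") and the green one ("indicator").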

library(ggplot2)

# Code to reproduce the data
df <- data.frame(subjectID = factor(1:10, 10:1),
                   stage = rep(c("treated"), times = c(10)),
                   endTime = c(6, 8, 3, 5, 10, 14, 2, 12, 6, 6),
                   Outcome = rep(c("healthy", "disability", "healthy", "disability", NA, NA, NA, NA, "healthy", "disability"), 1),
                   TStartTime=c(1.0, 1.5, 0.3, 0.9, NA, NA, NA, NA, NA, NA),
                   TEndTime=c(6.0, 7.0, 1.2, 1.4, NA, NA, NA, NA, NA, NA),
                   TimeZero=c(0,0,0,0,0,0,0,0,0,0),
                   ind=rep(c(!0, !0, !0, !0, !0), times = c(2, 2, 2, 2, 2)),
                   Garea=c(1.0, 1.5, 0.3, 0.9, 2, 2, NA, NA, NA, NA),
                   indicator=c(NA, NA, NA, NA, 4, 1, 5, 2, NA, NA))
# Code for the plot
gg <- ggplot(df, aes(subjectID, endTime)) + 
  scale_fill_manual(values = c("khaki", "orange"))  + 
  geom_col(aes(fill = factor(stage))) + 
  
  geom_point(data=df, aes(subjectID, TStartTime), colour = c("blue"), fill =alpha(c("#FAFAFA"), 0.2), shape=18, size=4) +
  coord_flip() + # blue diamond
  
  geom_point(data=df, aes(subjectID, indicator), colour = c("green"), shape=18, size=4) +
  coord_flip() + # green diamond for indicator
  
  
  geom_point(aes(colour = Outcome, shape = Outcome),  size = 4) +
  coord_flip() +
  scale_colour_manual(values = c('purple','gray'), na.translate=FALSE) + 
  scale_y_continuous(limits = c(-0.2, 15), breaks = 0:15) + 
  labs(labels= "",
       x       = "ID ", 
       fill    = "Status",
       y       = "Days",
       title   = "Plot") +
  theme_classic() +
  theme(plot.title   = element_text(hjust = 0.5),
        plot.caption = element_text(size = 7, hjust = 0))

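You are essentially looking for a second scale for the same aesthetic. ggnewscale is your friend. There are many other comments in the code. In particular, you have called coord_flip several times, which is unnecessary and may even be dangerous. I would avoid coord_flip altogether (see my comments in the code on how to do that).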
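To show the second-scale pattern in isolation, here is a minimal sketch on the built-in mtcars data (not the data from the question). The choice of variables and the `wt * 5` rescaling are assumptions made purely for illustration; the only point is that layers added after `new_scale_colour()` get their own colour scale and legend. Putting the discrete variable on `y` also gives a horizontal layout without coord_flip.

```r
library(ggplot2)
library(ggnewscale)

# Minimal sketch on built-in data; variable choices are illustrative only.
p <- ggplot(mtcars, aes(x = mpg, y = factor(cyl))) +
  # first layer: colour mapped to one variable, with its own manual scale
  geom_point(aes(colour = factor(am)), shape = 18, size = 4) +
  scale_colour_manual(name = "am", values = c("blue", "green")) +
  # layers added after this call use a separate, independent colour scale
  new_scale_colour() +
  # second layer: colour mapped to a different variable
  geom_point(aes(x = wt * 5, colour = factor(vs)), size = 2) +
  scale_colour_manual(name = "vs", values = c("purple", "gray"))

p
```

The order matters: each `scale_colour_manual()` call applies to the colour mappings on its side of `new_scale_colour()`.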

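Leaving all these technical aspects aside, your visualization seems less than ideal to me, and rather confusing. I wonder whether there might be a more intuitive way to present your various variables; perhaps consider facets. Below is one suggestion.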

library(tidyverse)
library(ggnewscale)

df <- data.frame(
  subjectID = factor(1:10, 10:1),
  stage = rep(c("treated"), times = c(10)),
  endTime = c(6, 8, 3, 5, 10, 14, 2, 12, 6, 6),
  Outcome = rep(c("healthy", "disability", "healthy", "disability", NA, NA, NA, NA, "healthy", "disability"), 1),
  TStartTime = c(1.0, 1.5, 0.3, 0.9, NA, NA, NA, NA, NA, NA),
  TEndTime = c(6.0, 7.0, 1.2, 1.4, NA, NA, NA, NA, NA, NA),
  TimeZero = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  ind = rep(c(!0, !0, !0, !0, !0), times = c(2, 2, 2, 2, 2)),
  Garea = c(1.0, 1.5, 0.3, 0.9, 2, 2, NA, NA, NA, NA),
  indicator = c(NA, NA, NA, NA, 4, 1, 5, 2, NA, NA)
)

# pivot longer so you can combine tstarttime and indicator into one legend easily
df %>%
  pivot_longer(cols = c(TStartTime, indicator)) %>%
  # remove all the coord_flip calls (you only need one, if not none!)
  ggplot() +
  scale_fill_manual(values = c("khaki", "orange")) +
  # just change the x/y aesthetic in geom_col
  # geom_col would add all values together, so you need to use the un-pivoted data
  geom_col(data = df, mapping = aes(y = subjectID, x = endTime, fill = factor(stage))) +
  # now you only need one geom_point for the new scale, but use the variable in aes()
  geom_point(aes(y = subjectID, x = value, colour = name), shape = 18, size = 4) +
  scale_color_manual(values = c("blue", "green")) +
  # now add a new scale for the same aesthetic (color)
  new_scale_color() +
  geom_point(aes(y = subjectID, x = endTime, colour = Outcome, shape = Outcome), size = 4) +
  ## removing na.translate = FALSE avoids the duplicate legend for outcome
  scale_colour_manual(values = c("purple", "gray"))
#> Warning: Removed 12 rows containing missing values (geom_point).
#> Warning: Removed 8 rows containing missing values (geom_point).
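To see why mapping `colour = name` works, and where the first warning comes from, it can help to inspect the pivoted data on its own. The sketch below copies only the two diamond columns from the question into a hypothetical `df_small`; `name` and `value` are the default output column names of `pivot_longer()`.

```r
library(tidyr)

# Only the columns relevant to the diamonds, copied from the question's data.
# `df_small` is a name introduced here for illustration.
df_small <- data.frame(
  subjectID  = factor(1:10, 10:1),
  TStartTime = c(1.0, 1.5, 0.3, 0.9, NA, NA, NA, NA, NA, NA),
  indicator  = c(NA, NA, NA, NA, 4, 1, 5, 2, NA, NA)
)

# The two columns are stacked: `name` holds the former column name,
# `value` holds the number.
long <- pivot_longer(df_small, cols = c(TStartTime, indicator))

nrow(long)              # 10 subjects x 2 variables = 20 rows
table(long$name)        # 10 rows for each former column
sum(is.na(long$value))  # 12 missing values, matching the first warning
```

Because `name` is now an ordinary column, a single geom_point with `colour = name` produces one legend with an entry per former column.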

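Visualizing fewer dimensions/variables is sometimes better. Here is a suggestion for how to avoid a double scale for the same aesthetic, and perhaps use your colours more convincingly. I feel the use of bars might not be ideal either, but that really depends on what the variable "indicator/TStartTime" is and how it relates to the end time. A good aim would be to show the relationship between those two variables.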

df %>%
  pivot_longer(cols = c(TStartTime, indicator)) %>%
  ggplot() +
  ## all of them are treated, so I am using Outcome as fill variable
  # this removes the need for second geom-point and second scale
  geom_col(data = df, mapping = aes(y = subjectID, x = endTime, fill = Outcome)) +
  scale_fill_manual(values = c("purple", "gray")) +
  geom_point(aes(y = subjectID, x = value, colour = name), shape = 18, size = 4) +
  scale_color_manual(values = c("blue", "green")) +
## if you have untreated people, show them in a new facet, e.g., add 
  facet_grid(~stage)
#> Warning: Removed 12 rows containing missing values (geom_point).

Created on 2022-05-05 by the reprex package (v2.0.1)