How to merge legends for color and shape when geom_hline has a separate (additional) entry in the color legend?

I have the following code, which produces the following plot:

cols <- brewer.pal(n = 3, name = 'Dark2')

p4 <- ggplot(all.m, aes(x=xval, y=yval, colour = Approach, ymax = 0.95)) + theme_bw() + 
  geom_errorbar(aes(ymin= yval - se, ymax = yval + se), width=5, position=pd) + 
  geom_line(position=pd) + 
  geom_point(aes(shape=Approach, colour = Approach), size = 4) + 
  geom_hline(aes(yintercept = cp.best$slope, colour = "C2P"), show_guide = FALSE) + 
  scale_color_manual(name="Approach", breaks=c("C2P", "P2P", "CP2P"), values =  cols[c(1,3,2)]) + 
  scale_y_continuous(breaks = seq(0.4, 0.95, 0.05), "Test AUROC") +
  scale_x_continuous(breaks = seq(10, 150, by = 20), "# Number of Patient Samples in Training")
p4 <- p4 + theme(legend.direction = 'horizontal', 
      legend.position = 'top', 
      plot.margin = unit(c(5.1, 7, 4.5, 3.5)/2, "lines"), 
      text = element_text(size=15), axis.title.x=element_text(vjust=-1.5), axis.title.y=element_text(vjust=2))   
p4 <- p4 + guides(colour=guide_legend(override.aes=list(shape=c(NA,17,16))))

p4

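When I try show_guide = FALSE in geom_point, the shapes of the points in the upper legend are all reset to the default solid circle.

How can I make the lower legend disappear without affecting the upper legend?

Here is a solution, with reproducible data: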

library("ggplot2")
library("grid")
library("RColorBrewer")

cp2p <- data.frame(xval = 10 * 2:15, yval = cumsum(c(0.55, rnorm(13, 0.01, 0.005))), Approach = "CP2P", stringsAsFactors = FALSE)
p2p <- data.frame(xval = 10 * 1:15, yval = cumsum(c(0.7, rnorm(14, 0.01, 0.005))), Approach = "P2P", stringsAsFactors = FALSE)

pd <- position_dodge(0.1)
cp.best <- list(slope = 0.65)

all.m <- rbind(p2p, cp2p)
all.m$Approach <- factor(all.m$Approach, levels = c("C2P", "P2P", "CP2P"))
all.m$se <- rnorm(29, 0.1, 0.02)
all.m[nrow(all.m) + 1, ] <- all.m[nrow(all.m) + 1, ] # Creates a new row filled with NAs
all.m$Approach[nrow(all.m)] <- "C2P"
cols <- brewer.pal(n = 3, name = 'Dark2')

p4 <- ggplot(all.m, aes(x=xval, y=yval, colour = Approach, ymax = 0.95)) + theme_bw() + 
  geom_errorbar(aes(ymin= yval - se, ymax = yval + se), width=5, position=pd) + 
  geom_line(position=pd) + 
  geom_point(aes(shape=Approach, colour = Approach), size = 4, na.rm = TRUE) + 
  geom_hline(aes(yintercept = cp.best$slope, colour = "C2P")) + 
  scale_color_manual(values = c(C2P = cols[1], P2P = cols[2], CP2P = cols[3])) + 
  scale_shape_manual(values = c(C2P = NA, P2P = 16, CP2P = 17)) +
  scale_y_continuous(breaks = seq(0.4, 0.95, 0.05), "Test AUROC") +
  scale_x_continuous(breaks = seq(10, 150, by = 20), "# Number of Patient Samples in Training")
p4 <- p4 + theme(legend.direction = 'horizontal', 
                 legend.position = 'top', 
                 plot.margin = unit(c(5.1, 7, 4.5, 3.5)/2, "lines"), 
                 text = element_text(size=15), axis.title.x=element_text(vjust=-1.5), axis.title.y=element_text(vjust=2))   
p4

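The trick is to make sure that all the desired levels of all.m$Approach appear in all.m, even if one of them is dropped from the plot.

Adding the argument na.rm = TRUE to geom_point suppresses the warning about the omitted point.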
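To see why the extra row matters, here is a minimal base-R sketch. The data frame `toy` is a made-up stand-in, not the data from the question; only the factor handling mirrors the code above. It appends an all-NA row and assigns the otherwise unused level to it, so the level is present in the data while contributing no drawable point.

```r
# Hypothetical toy data; only the factor handling mirrors the answer above.
toy <- data.frame(
  xval = c(10, 20),
  yval = c(0.70, 0.71),
  Approach = factor(c("P2P", "CP2P"), levels = c("C2P", "P2P", "CP2P"))
)

# Indexing one past the last row yields a row of NAs, which is then appended.
toy[nrow(toy) + 1, ] <- toy[nrow(toy) + 1, ]
# Give the new row the level that has no real observations.
toy$Approach[nrow(toy)] <- "C2P"

table(toy$Approach)          # every level now occurs exactly once
is.na(toy$yval[nrow(toy)])   # TRUE: there is nothing to draw for this row
```

Because the "C2P" row has no coordinates, geom_point has to drop it, which is where the warning silenced by na.rm = TRUE comes from.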

Short answer:
Just add a dummy geom_point layer (with transparent points) in which shape is mapped to the same level as in geom_hline:

geom_point(aes(shape = "int"), alpha = 0) 

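Longer answer:
Whenever possible, ggplot merges the legends of different aesthetics. For example, if colour and shape are mapped to the same variable, the two legends are combined into one.

I illustrate this with a simple data set containing 'x', 'y' and a grouping variable 'grp' that has two levels: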

df <- data.frame(x = rep(1:2, 2), y = 1:4, grp = rep(c("a", "b"), each = 2))

First we map both color and shape to 'grp':

ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
  geom_line() +
  geom_point(size = 4)

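Fine: the legends for the two aesthetics, color and shape, are merged into one.

Then we add a geom_hline. We want it to have a different color from the geom_lines and to appear in the legend. Thus, we map color to a variable, i.e. we put color inside the aes of geom_hline. In this case we do not map color to a variable in the data set but to a constant. We can give the constant whatever name we want, so that we don't need to rename the legend entry afterwards.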

ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
  geom_line() +
  geom_point(size = 4) +
  geom_hline(aes(yintercept = 2.5, color = "int"))

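Now two legends appear: one for the color aesthetic of geom_line and geom_hline, and one for the shape of the geom_points. The reason is that the "variable" which color is mapped to now contains three levels: the two levels of 'grp' in the original data, plus the level 'int' introduced in the geom_hline aes. Thus, the levels in the color scale differ from those in the shape scale, and by default ggplot can't merge the two scales into one legend.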
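The mismatch can be made concrete with a small base-R sketch. This only imitates the reasoning; it does not call ggplot2 internals, and the names `shape_levels` and `colour_levels` are introduced here purely for illustration.

```r
df <- data.frame(x = rep(1:2, 2), y = 1:4, grp = rep(c("a", "b"), each = 2))

# Values that reach the shape scale: only the levels found in the data.
shape_levels <- sort(unique(as.character(df$grp)))
# Values that reach the colour scale: the data levels plus the constant "int".
colour_levels <- sort(union(as.character(df$grp), "int"))

setdiff(colour_levels, shape_levels)    # "int"
identical(colour_levels, shape_levels)  # FALSE, so the legends stay separate
```

Both fixes described next work by making these two sets identical.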

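How can the two legends be combined?

One possibility is to introduce the same additional level for shape as for color, by adding a dummy geom_point layer with transparent points (alpha = 0), so that the two aesthetics contain the same levels: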

ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
  geom_line() +
  geom_point(size = 4) +
  geom_hline(aes(yintercept = 2.5, color = "int")) +
  geom_point(aes(shape = "int"), alpha = 0) # <~~~~ a blank geom_point

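Another possibility is to convert the original grouping variable to a factor and add the "geom_hline level" to the original levels. Then use drop = FALSE in scale_shape_discrete to keep "unused factor levels" in the scale:

# as.character() keeps the level names intact even if grp is already a factor
df$grp <- factor(df$grp, levels = c(as.character(unique(df$grp)), "int"))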

ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
  geom_line() +
  geom_point(size = 4) +
  geom_hline(aes(yintercept = 2.5, color = "int")) +
  scale_shape_discrete(drop = FALSE)

Then, as you already know, you can use the guides function to "override" the shape aesthetic in the legend, and remove the shape from the geom_hline entry by setting it to NA:

guides(colour = guide_legend(override.aes = list(shape = c(16, 17, NA))))
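For completeness, here is one way the pieces might be assembled into a single script. It uses the factor plus drop = FALSE variant; the shape codes 16 and 17 are taken from the guides call above, and the object name `p` is just a label chosen for this sketch.

```r
library(ggplot2)

df <- data.frame(x = rep(1:2, 2), y = 1:4, grp = rep(c("a", "b"), each = 2))
# Add the extra level "int" so that shape and colour share the same levels.
df$grp <- factor(df$grp, levels = c(as.character(unique(df$grp)), "int"))

p <- ggplot(data = df, aes(x = x, y = y, color = grp, shape = grp)) +
  geom_line() +
  geom_point(size = 4) +
  geom_hline(aes(yintercept = 2.5, color = "int")) +
  # Keep the unused level "int" in the shape scale.
  scale_shape_discrete(drop = FALSE) +
  # Blank out the point symbol on the "int" legend key only.
  guides(colour = guide_legend(override.aes = list(shape = c(16, 17, NA))))

p
```

The order of the values in override.aes follows the order of the legend keys, here "a", "b", "int", so NA has to come last.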