How to use geom_bar to connect stacked-bar proportions when the bar categories are character strings

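This is an extension of the answer to a question I found earlier.

Briefly, @Jon Spring used the following example code to generate a stacked bar plot in which lines connect each bar proportion between the two groups: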

library(ggplot2)
set.seed(0)
data_bar <- data.frame(
  stringsAsFactors = FALSE,
  Sample = rep(c("A", "B"), each = 10),
  Percentage = runif(20),
  Taxon = rep(1:10, times = 2)  # taxa 1..10 repeated for each sample
)
library(tidyr)
ggplot() +
  geom_bar(data = data_bar,
           aes(x = Sample, y =Percentage, fill = Taxon),
           colour = 'white', width = 0.3, stat="identity") +
  geom_segment(data = tidyr::spread(data_bar, Sample, Percentage),
               colour = "white",
               aes(x = 1 + 0.3/2,
                   xend = 2 - 0.3/2,
                   y = cumsum(A),
                   yend = cumsum(B))) +
  theme(panel.background = element_rect(fill = "black"), # to make connecting points          
        panel.grid = element_blank())   

[Figure: geom_seg example]

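Although this is an elegant piece of code for connecting the bar proportions, I cannot reproduce it once the names of the bar proportions are character strings rather than integers as above. Here is my code: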

test.matrix<-matrix(c(70,120,65,140,13,68,46,294,52,410),ncol=2,byrow=TRUE)
rownames(test.matrix)<-c("BC.1","BC.2","GC","MO","EB")
colnames(test.matrix)<-c("12m","3m")
# Convert to long format so the Var1, Var2 and Freq columns used below exist
test.matrix <- as.data.frame(as.table(test.matrix))

ggplot() +
  geom_bar(data = test.matrix,
           aes(x = Var2, y =Freq, fill = Var1),
           colour = 'black', width = 0.3, stat="identity") +
  geom_segment(data = tidyr::spread(test.matrix, Var2, Freq),
               colour = "black",
               aes(x = 1 + 0.3/2,
                   xend = 2 - 0.3/2,
                   y = cumsum(`12m`),
                   yend = cumsum(`3m`))) +
  scale_fill_manual(values=c('BC.1'="gold",'BC.2'="yellowgreen",'GC'="navy",'MO'="royalblue",'EB'="orangered")) +
  theme(panel.background = element_rect(fill = "white"), panel.grid = element_blank())

[Figure: geom_seg char]

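The resulting geom_segment lines do not match the bar proportions. Perhaps it has to do with cumsum() following the alphabetical order of the strings, but I don't know how to fix that, or whether the cause is something else entirely.

So I have two questions:

  1. How can I connect the bar proportions if the order of the proportions has to be fixed? (a character vector or factor of 'names' for each value group or row)

  2. How can I add an extra geom_segment at the very bottom of the bars, connecting the two lower ends of the bars with each other?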

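  1. The problem is that you cumsum in the wrong "direction" or order: you start the cumsum with BC.1, which sits at the top of the bar. This is easily fixed by rearranging the dataset before computing the cumulative sums. I also think it is best to do this outside the plotting code so that you can easily inspect the data.

  2. To add one more geom_segment at the bottom, you only need to add a row to the data.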
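
To see why the direction matters, the segment endpoints can be checked by hand. The following is a minimal base R sketch, assuming (as the description above implies) that ggplot2 orders a character fill variable alphabetically and stacks the first category at the top. The names `m`, `wrong`, `bottom_up` and `endpoints` are only illustrative.

```r
# Same counts as in the question, kept as a matrix with row and column names.
m <- matrix(c(70, 120, 65, 140, 13, 68, 46, 294, 52, 410),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("BC.1", "BC.2", "GC", "MO", "EB"),
                            c("12m", "3m")))

# Accumulating in row order starts at BC.1, i.e. from the top of the stack,
# so these heights do not line up with the segment boundaries in the bars.
wrong <- apply(m, 2, cumsum)

# Assumption: reverse alphabetical order lists the categories from the
# bottom of the stack upwards.
bottom_up <- sort(rownames(m), decreasing = TRUE)

# Cumulative heights from the bottom, plus a 0 row for the base of the bars.
endpoints <- rbind(base = 0, apply(m[bottom_up, ], 2, cumsum))
print(endpoints)
```

Each row of `endpoints` gives the `y` (12m) and `yend` (3m) of one connecting line, with the `base` row supplying the extra line along the bottom.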

library(tidyverse)

test.matrix<-matrix(c(70,120,65,140,13,68,46,294,52,410),ncol=2,byrow=TRUE)
rownames(test.matrix)<-c("BC.1","BC.2","GC","MO","EB")
colnames(test.matrix)<-c("12m","3m")
test.matrix <- data.frame(test.matrix)

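# Reshape to long format: one row per (Var1, Var2) pair
test.matrix <- test.matrix %>% 
  setNames(c("12m", "3m")) %>%              # data.frame() changed the names to X12m, X3m
  rownames_to_column(var = "Var1") %>% 
  pivot_longer(-Var1, names_to = "Var2", values_to = "Freq")

# Segment endpoints: accumulate from the bottom of the stack upwards
test.matrix.wide <- tidyr::spread(test.matrix, Var2, Freq) %>% 
  arrange(desc(Var1)) %>%                   # reverse alphabetical order, so BC.1 comes last
  mutate(y = cumsum(`12m`),
         yend = cumsum(`3m`)) %>% 
  add_row(y = 0, yend = 0)                  # extra segment along the base of the bars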

ggplot() +
  geom_bar(data = test.matrix,
           aes(x = Var2, y =Freq, fill = Var1),
           colour = 'black', width = 0.3, stat="identity") +
  geom_segment(data = test.matrix.wide,
               colour = "black",
               aes(x = 1 + 0.3/2,
                   xend = 2 - 0.3/2,
                   y = y,
                   yend = yend)) +
  scale_fill_manual(values=c('BC.1'="gold",'BC.2'="yellowgreen",'GC'="navy",'MO'="royalblue",'EB'="orangered")) +
  theme(panel.background = element_rect(fill = "white"), panel.grid = element_blank())
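
If the proportions have to follow a fixed, non-alphabetical order (question 1), one option is to store that order as factor levels, since `arrange(desc())` then sorts by level rather than alphabetically. The following is a sketch under that assumption. `lvls`, `long`, `wide` and `p` are illustrative names, and the order chosen here is simply the row order of the original matrix, read from the top of the bar to the bottom.

```r
library(tidyverse)

# Desired stacking order, top of the bar first (an assumption for illustration).
lvls <- c("BC.1", "BC.2", "GC", "MO", "EB")

long <- data.frame(
  Var1 = factor(rep(lvls, each = 2), levels = lvls),
  Var2 = rep(c("12m", "3m"), times = 5),
  Freq = c(70, 120, 65, 140, 13, 68, 46, 294, 52, 410)
)

wide <- long %>%
  pivot_wider(names_from = Var2, values_from = Freq) %>%
  arrange(desc(Var1)) %>%                  # last level first: bottom of the stack upwards
  mutate(y = cumsum(`12m`),
         yend = cumsum(`3m`)) %>%
  add_row(y = 0, yend = 0)                 # line along the base of the bars

p <- ggplot() +
  geom_col(data = long,
           aes(x = Var2, y = Freq, fill = Var1),
           colour = "black", width = 0.3) +
  geom_segment(data = wide,
               colour = "black",
               aes(x = 1 + 0.3/2, xend = 2 - 0.3/2, y = y, yend = yend))
print(p)
```

Changing `lvls` changes both the stacking order and the cumulative sums together, so the lines should stay attached to the segment boundaries.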