Put the total observation count (n) on top of a stacked percentage bar plot in ggplot

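I have a stacked percentage bar plot in ggplot, and I want to put the total number of observations on top of each stacked bar (while keeping the bars as percentages). However, I keep running into trouble.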

Here is the code that generates my percentage bar plot:

    library(ggplot2)

    # sample dataset
    set.seed(123)
    cat1 <- sample(letters[1:3], 500, replace = TRUE, prob = c(0.1, 0.2, 0.65))
    cat2 <- sample(letters[4:8], 500, replace = TRUE, prob = c(0.3, 0.4, 0.75, 0.5, 0.1))
    df <- data.frame(cat1, cat2)

    # the barplot
    ggplot(df, aes(x = cat1)) +
      geom_bar(aes(fill = cat2), position = "fill", color = "black") +
      scale_y_continuous(labels = scales::percent) +
      labs(y = "Percentage") +
      # this final line is me trying to add the label
      geom_text(aes(label = cat1))

    # this is the observation number I want to display
    table(df$cat1)

    # but I get this error:
    # Error: geom_text requires the following missing aesthetics: y

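So I have two questions:

  1. How do I put the total observation count for each cat1 level, as an "N=" label, on top of each stacked bar?
  2. What exactly is y for the bar plot in my code (aes(x=...))? I supply x but no y, yet the plot seems to work.

Thanks!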

You can try this:

    temp <- data.frame(x = c("a", "b", "c"), y = c(1.02, 1.02, 1.02), z = c(51, 101, 348))

    ggplot(df, aes(x = cat1)) +
      geom_bar(aes(fill = cat2), position = "fill", color = "black") +
      scale_y_continuous(labels = scales::percent) +
      labs(y = "Percentage") +
      # labels come from temp: one row per bar, placed just above 100%
      geom_text(data = temp, aes(x = x, y = y, label = as.factor(z)))
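
The counts in `temp` above are typed in by hand. A minimal sketch of building the same data frame from the data instead, assuming the sample `df` from the question (the `label` column is my own addition, just for illustration):

```r
# Recreate the sample data from the question
set.seed(123)
cat1 <- sample(letters[1:3], 500, replace = TRUE, prob = c(0.1, 0.2, 0.65))
cat2 <- sample(letters[4:8], 500, replace = TRUE, prob = c(0.3, 0.4, 0.75, 0.5, 0.1))
df <- data.frame(cat1, cat2)

# Count observations per cat1 level instead of typing the numbers by hand
counts <- table(df$cat1)

temp <- data.frame(
  x = names(counts),      # bar positions, one per cat1 level
  y = 1.02,               # just above the 100% line
  z = as.vector(counts)   # total n per bar
)

# Hypothetical extra column holding the "N=" text asked for in the question
temp$label <- paste0("N=", temp$z)
```

With this `temp`, the `geom_text()` call should work unchanged, or you could map `label = label` to get the "N=" prefix.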

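If you don't want to hard-code the summary labels, here is a slightly different approach (still a bit of a hack) that uses dplyr to compute the percentages and format the labels.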

I also reversed your legend to match the order in the chart :)

    library(dplyr)
    library(ggplot2)

    df2 <- df %>%
      group_by(cat1, cat2) %>%
      # keep the result grouped by cat1 (.groups needs dplyr >= 1.0)
      summarise(n = n(), .groups = "drop_last") %>%
      mutate(percent = n / sum(n),
             cumsum = cumsum(percent),
             # only the last cat2 level ("h") gets the N= label, where cumsum reaches 1
             label = ifelse(cat2 == "h", paste0("N=", sum(n)), ""))

    ggplot(df2, aes(x = cat1, y = percent, fill = cat2)) +
      scale_y_continuous(labels = scales::percent) +
      labs(y = "Percentage") +
      geom_bar(position = "fill", color = "black", stat = "identity") +
      geom_text(aes(y = cumsum, label = label), vjust = -1) +
      guides(fill = guide_legend(reverse = TRUE))
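
On the second question (what y is): as far as I understand it, `geom_bar()` uses `stat = "count"` by default, so y is the number of rows counted for each x (split by fill group), and `position = 'fill'` then rescales each stack so it sums to 1. That is also why `geom_text()` complains: it does no counting of its own and needs an explicit y. A rough base-R sketch of the same arithmetic on the question's sample data (the names `counts` and `props` are mine, just for illustration):

```r
# Recreate the sample data from the question
set.seed(123)
cat1 <- sample(letters[1:3], 500, replace = TRUE, prob = c(0.1, 0.2, 0.65))
cat2 <- sample(letters[4:8], 500, replace = TRUE, prob = c(0.3, 0.4, 0.75, 0.5, 0.1))
df <- data.frame(cat1, cat2)

# Rows tallied per cat1 / cat2 combination: the counts behind each bar segment
counts <- table(df$cat1, df$cat2)

# Each row divided by its own total: the heights shown after position = "fill"
props <- prop.table(counts, margin = 1)

# Every stacked bar adds up to 1 (100%), whatever its total n is
print(rowSums(props))

# The per-bar totals that the "N=" labels report
print(rowSums(counts))
```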