Adding labels to percentage stacked barplot ggplot2

I'm new to ggplot and would appreciate some help with a dataset I'm building a visualization for.

Here is my current code:

#create plot
plot <- ggplot(newDoto, aes(y = pid3lean, weight = weight, fill = factor(Q29_1String, levels = c("Strongly disagree","Somewhat disagree", "Neither agree nor disagree", "Somewhat agree", "Strongly agree")))) + geom_bar(position = "fill", width = .732) 
#fix colors
plot <- plot + scale_fill_manual(values = c("Strongly disagree" = "#7D0000", "Somewhat disagree" = "#D70000","Neither agree nor disagree" = "#C0BEB8", "Somewhat agree" = "#008DCA", "Strongly agree" = "#00405B")) 
#fix grid
plot <- plot + guides(fill=guide_legend(title="29")) + theme_bw() + theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) + theme(panel.border = element_blank()) + theme(axis.ticks = element_blank()) + theme(axis.title.y=element_blank()) + theme(axis.title.x=element_blank()) + theme(axis.text.x=element_blank()) + theme(text=element_text(size=19,  family="serif")) + theme(axis.text.y = element_text(color="black")) + theme(legend.position = "top") + theme(legend.text=element_text(size=12)) 
#plot graph
plot

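This creates the following bar chart:

The problem I'm running into is adding percentage labels to these bars. I want to add text, centered and in white, showing the percentage for each segment.

Unfortunately, I've had trouble adding geom_text: it keeps giving me errors because I don't have an x variable, and I'm not sure how to fix that. The way I use fill is a bit unusual compared with other approaches I've seen, which use both x and y variables. Since fill is the percentage for each response type (the response types listed in levels), I don't even know what I would use as the x variable.

Any help would be greatly appreciated! I'm happy to answer any questions about the dataset if that matters.

Here is a sample of the two relevant columns (I didn't use head because the dataset has too many variables). They show which party each respondent belongs to and whether they strongly agree, somewhat agree, and so on.

Here is the dput output for the two variables: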

structure(list(pid3lean = structure(c("Democrats", "Democrats", 
"Democrats", "Democrats", "Independents", "Democrats", "Republicans", 
"Independents", "Republicans", "Democrats", "Democrats", "Independents", 
"Democrats", "Republicans", "Democrats", "Democrats", "Democrats", 
"Democrats", "Democrats", "Republicans"), label = "pid3lean", format.spss = "A13", display_width = 15L), 
    Q29_1String = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 5L, 4L, 
    1L, 1L, 2L, 5L, 1L, 5L, 1L, 1L, 1L, 5L, 1L, 3L), .Label = c("Strongly agree", 
    "Somewhat agree", "Neither agree nor disagree", "Somewhat disagree", 
    "Strongly disagree"), class = "factor")), row.names = c(NA, 
-20L), class = c("tbl_df", "tbl", "data.frame"))

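To place the percentages in the middle of the bars, use position_fill(vjust = 0.5) and compute the proportions inside geom_text. Note that these proportions are shares of the overall total, not of each bar.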

library(ggplot2)

colors <- c("#00405b", "#008dca", "#c0beb8", "#d70000", "#7d0000")
colors <- setNames(colors, levels(newDoto$Q29_1String))

ggplot(newDoto, aes(pid3lean, fill = Q29_1String)) +
  geom_bar(position = position_fill()) +
  # after_stat() replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
  geom_text(aes(label = after_stat(paste0(round(count / sum(count) * 100, 1), "%"))),
            stat = "count",
            colour = "white",
            position = position_fill(vjust = 0.5)) +
  scale_fill_manual(values = colors) +
  coord_flip()


The scales package has a function that formats percentages automatically.

ggplot(newDoto, aes(pid3lean, fill = Q29_1String)) +
  geom_bar(position = position_fill()) +
  # after_stat() replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
  geom_text(aes(label = after_stat(scales::percent(count / sum(count)))),
            stat = "count",
            colour = "white",
            position = position_fill(vjust = 0.5)) +
  scale_fill_manual(values = colors) +
  coord_flip()
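
As a small illustration of that formatting function, the sketch below applies scales::percent to a few made-up proportions (the vector props is hypothetical, not taken from the question's data). The accuracy argument controls rounding, which is usually what you want for labels inside narrow bar segments.

```r
library(scales)

# Made-up proportions, for illustration only
props <- c(0.05, 1/3, 0.65)

# Round to whole percentages
labels_whole <- percent(props, accuracy = 1)
print(labels_whole)   # "5%" "33%" "65%"

# Keep one decimal place
labels_tenth <- percent(props, accuracy = 0.1)
print(labels_tenth)   # "5.0%" "33.3%" "65.0%"
```

Either call can be dropped into the label aesthetic in place of the plain scales::percent(...) call.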


Edit

Following a request to compute the proportions per bar, here is a solution that first computes the proportions using base R only.

tbl <- xtabs(~ pid3lean + Q29_1String, newDoto)
proptbl <- proportions(tbl, margin = "pid3lean")
proptbl <- as.data.frame(proptbl)
proptbl <- proptbl[proptbl$Freq != 0, ]

ggplot(proptbl, aes(pid3lean, Freq, fill = Q29_1String)) +
  geom_col(position = position_fill()) +
  geom_text(aes(label = scales::percent(Freq)),
            colour = "white",
            position = position_fill(vjust = 0.5)) +
  scale_fill_manual(values = colors) +
  coord_flip() +
  guides(fill = guide_legend(title = "29")) +
  theme_question_70539767()
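
To make the per-bar normalisation step easier to see, here is a minimal sketch on a toy data frame. The names toy, party and answer are made up for illustration. It uses margin = 1 (the first dimension of the table), which plays the role that the pid3lean dimension plays in the code above.

```r
# Toy data, made up for illustration only
toy <- data.frame(
  party  = c("A", "A", "A", "B", "B"),
  answer = c("agree", "agree", "disagree", "agree", "disagree")
)

# Cross-tabulate: one row per party, one column per answer
tbl <- xtabs(~ party + answer, toy)

# Normalise within each row, so every party sums to 1
# (proportions() needs R >= 4.0.0; prop.table() is the older equivalent)
prop <- proportions(tbl, margin = 1)
print(prop)
print(rowSums(prop))

# Long format, one row per party/answer pair, ready for geom_col()
prop_df <- as.data.frame(prop)
```

Because each row sums to 1, the Freq column of prop_df holds the share of each answer within its own bar.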


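Theme to add to the plot

This is a copy of the theme defined in the question, with minor changes. Define it before running the plot code above, which calls it.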

theme_question_70539767 <- function(){
  theme_bw() %+replace%
    theme(panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.border = element_blank(),
          text = element_text(size = 19, family = "serif"),
          axis.ticks = element_blank(),
          axis.title.y = element_blank(),
          axis.title.x = element_blank(),
          axis.text.x = element_blank(),
          axis.text.y = element_text(color = "black"),
          legend.position = "top",
          legend.text = element_text(size = 10),
          legend.key.size = unit(1, "char")
    )
}

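You first need to compute the percentages using the dplyr package. Store them in a new data frame instead of overwriting newDoto, because the existing plot object still refers to the original rows:

library(dplyr)
percDoto <- newDoto %>%
  group_by(pid3lean) %>%
  count(Q29_1String) %>%   # add wt = weight here to match the weighted bars
  mutate(perc = n / sum(n)) %>%
  ungroup()

With your existing code, then add a text layer that reads from that summary. The layer supplies its own x and sets weight = NULL, because the summary has no weight column:

plot + 
  geom_text(data = percDoto, aes(x = perc, label = scales::percent(perc, accuracy = 1), weight = NULL),
            position = position_fill(vjust = 0.5), size = 3, color = "white")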
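
If you would rather not precompute anything, the per-bar share can also be derived inside the text layer. This is only a sketch, assuming ggplot2 >= 3.3.0 (for after_stat() and horizontal bars without coord_flip()). It is shown on made-up toy data with hypothetical columns party and answer, which stand in for pid3lean and Q29_1String. ave() divides each count by the total of its own bar.

```r
library(ggplot2)

# Toy data, made up for illustration only
toy <- data.frame(
  party  = c("A", "A", "A", "B", "B"),
  answer = c("agree", "agree", "disagree", "agree", "disagree")
)

p <- ggplot(toy, aes(y = party, fill = answer)) +
  geom_bar(position = "fill") +
  geom_text(
    # count / (sum of counts sharing the same y) = share within that bar
    aes(label = after_stat(
      scales::percent(count / ave(count, y, FUN = sum), accuracy = 1)
    )),
    stat = "count",
    colour = "white",
    position = position_fill(vjust = 0.5)
  )

p
```

Since the question's plot also maps weight in the top-level aes(), the same text layer added to that plot would count the weights, so the labels should follow the weighted bars.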

Here is another approach:

  1. Compute the statistics in the data frame first (calculate the percentages and convert Q29_1String to a factor)
  2. Use geom_col
  3. Then use coord_flip
  4. Adjust the theme

library(tidyverse)

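newDoto %>%   # the question's data frame (called df in the original answer)
  group_by(pid3lean) %>% 
  count(Q29_1String) %>% 
  mutate(pct = n/sum(n)) %>%   # computed while still grouped: share within each bar
  ungroup() %>% 
  mutate(Q29_1String = as.factor(Q29_1String)) %>% 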
  ggplot(aes(x = pid3lean, y = pct, fill = Q29_1String)) +
  geom_col(position = "fill", width = .732) +
  scale_fill_manual(values = c("Strongly disagree" = "#7D0000", "Somewhat disagree" = "#D70000","Neither agree nor disagree" = "#C0BEB8", "Somewhat agree" = "#008DCA", "Strongly agree" = "#00405B")) +
  coord_flip()+
  geom_text(aes(label = scales::percent(pct)), 
            position = position_fill(vjust = 0.5), size = 5, color = "white") +
  guides(fill = guide_legend(title = "29")) + 
  theme_bw() + 
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        panel.border = element_blank(), 
        axis.ticks = element_blank(), 
        axis.title.y=element_blank(), 
        axis.title.x=element_blank(), 
        axis.text.x=element_blank(), 
        text=element_text(size=19,  family="serif"), 
        axis.text.y = element_text(color="black"),
        legend.position = "top",
        legend.text=element_text(size=12)
        )