Displaying percentages within category for continuous/ordered variable (with ggplot)

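I have two questions: the first is (hopefully) a simple mechanical one, and the second is more conceptual (though it still has a technical element).

  1. I am trying to do something almost identical to this, but instead of a 1/0 dichotomous variable I have an ordered/continuous variable (0 - 4), which means that filtering on == 1 will not work. To summarize here, I just want to show the percentage at each level within each race category.

  2. I am also hoping to find a way to show the descriptive results for all 3 questions in a single figure. My first thought was some kind of facet_wrap() with each variable (question1, question2, question3) as its own panel. Would that require pivot_longer() to make my data long instead of wide? Another idea is a single figure/panel where each x-axis tick is a race category rather than a question, with 3 bars at each tick, one per question. I am not sure how to make that work, though.

Thanks in advance, and apologies for the lengthy question. Here is some sample data: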

set.seed(123)

d <- data.frame(
  race = sample(c("White", "Hispanic", "Black", "Other"), 100, replace = TRUE),
  question1 = sample(0:4, 100, replace = TRUE),
  question2 = sample(0:4, 100, replace = TRUE),
  question3 = sample(0:4, 100, replace = TRUE)
)

How about this:

library(tidyverse)
set.seed(123)
d <- data.frame(
  race = sample(c("White", "Hispanic", "Black", "Other"), 100, replace = TRUE),
  question1 = sample(0:4, 100, replace = TRUE),
  question2 = sample(0:4, 100, replace = TRUE),
  question3 = sample(0:4, 100, replace = TRUE)
)

d %>%
  pivot_longer(-race, names_to = "question", values_to = "vals") %>% 
  group_by(question, race, vals) %>% 
  tally() %>% 
  group_by(question, race) %>% 
  mutate(pct = n/sum(n)) %>% 
  ggplot(aes(x=race, y=pct, fill=as.factor(vals))) + 
  geom_bar(position="stack", stat="identity") + 
  facet_wrap(~question) + 
  scale_y_continuous(labels = scales::percent) + 
  labs(x="", y="Percentage (within Race)", fill="Response") + 
  theme_bw() + 
  theme(legend.position = "top", 
        panel.grid = element_blank(), 
        axis.text.x = element_text(angle = 45, hjust=1))

Created on 2022-05-23 by the reprex package (v2.0.1)
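To sanity-check the within-race percentages before plotting, a base R cross-tabulation may be enough. This is only a rough sketch for question1 using the sample data from the question; the names `counts` and `pct` are made up for illustration and are not part of the answer above.

```r
# Same sample data as in the question
set.seed(123)
d <- data.frame(
  race = sample(c("White", "Hispanic", "Black", "Other"), 100, replace = TRUE),
  question1 = sample(0:4, 100, replace = TRUE),
  question2 = sample(0:4, 100, replace = TRUE),
  question3 = sample(0:4, 100, replace = TRUE)
)

# Counts of each response level (0-4) within each race;
# fixing the factor levels keeps a column for every possible response
counts <- table(race = d$race,
                response = factor(d$question1, levels = 0:4))

# margin = 1 divides each row by its row total, i.e. percentages within race
pct <- prop.table(counts, margin = 1)

round(100 * pct, 1)
```

Each row of `pct` should sum to 1, which is the same quantity the `group_by(question, race)` plus `mutate(pct = n/sum(n))` steps compute for all three questions at once.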

Here is a modification of @DaveArmstrong's very nice solution (+1):

library(tidyverse)
library(RColorBrewer)

# Eight shades from the "Purples" palette; only the first five are used below
sPalette <- brewer.pal(8, "Purples")

d %>% 
  pivot_longer(-race) %>%
  count(name, race, value) %>% 
  group_by(name, race) %>%
  mutate(value = as.factor(value),
         pct= prop.table(n) * 100) %>% 
  ggplot(aes(x=race, y=pct, fill=value)) + 
  geom_col(position = position_fill()) +
  facet_wrap(.~name)+
  labs(x="", y="Percentage (within Race)", fill="Response") + 
  scale_y_continuous(labels = scales::percent) +
  geom_text(aes(label =  round(pct, 1)),
            position = position_fill(vjust = .5)) +
  scale_fill_manual(values = sPalette) + 
  theme_classic()+
  theme(legend.position = "top", 
        panel.grid = element_blank(), 
        axis.text.x = element_text(angle = 45, hjust=1))
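The other layout mentioned in the question (race along the horizontal axis, with one bar per question inside each race) can be approximated by swapping the roles of the two variables: facet by race and put the question on the x-axis. The following is a sketch under that assumption, not a definitive answer; `pct_long` and `p` are illustrative names, and moving the strips to the bottom is just one way to make the race labels read like axis ticks.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same sample data as in the question
set.seed(123)
d <- data.frame(
  race = sample(c("White", "Hispanic", "Black", "Other"), 100, replace = TRUE),
  question1 = sample(0:4, 100, replace = TRUE),
  question2 = sample(0:4, 100, replace = TRUE),
  question3 = sample(0:4, 100, replace = TRUE)
)

# Long format, then the share of each response within question x race
pct_long <- d %>%
  pivot_longer(-race, names_to = "question", values_to = "response") %>%
  count(question, race, response) %>%
  group_by(question, race) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup()

# One row of panels: each panel is a race, each bar within it is a question
p <- ggplot(pct_long, aes(x = question, y = pct, fill = factor(response))) +
  geom_col() +
  facet_wrap(~race, nrow = 1, strip.position = "bottom") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = NULL, y = "Percentage (within Race)", fill = "Response") +
  theme_bw() +
  theme(legend.position = "top",
        strip.placement = "outside",
        axis.text.x = element_text(angle = 45, hjust = 1))

p
```

Faceting is used here because ggplot2 does not stack and dodge bars at the same time within a single position adjustment, so a panel per race is a simple way to get three stacked bars side by side for each group.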