Put total observation number (n) on top of stacked percentage barplot in ggplot
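I have a stacked percentage barplot in ggplot, and I would like to put the total number of observations on top of each stacked bar (while keeping the bars as percentages). However, I keep running into problems.
Here is the code that generates my percentage barplot: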
library(ggplot2)

# sample dataset
set.seed(123)
cat1 <- sample(letters[1:3], 500, replace = TRUE, prob = c(0.1, 0.2, 0.65))
cat2 <- sample(letters[4:8], 500, replace = TRUE, prob = c(0.3, 0.4, 0.75, 0.5, 0.1))
df <- data.frame(cat1, cat2)

# the barplot
ggplot(df, aes(x = cat1)) +
  geom_bar(aes(fill = cat2),
           position = "fill", color = "black") +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Percentage") +
  # this final line is me trying to add the label
  geom_text(aes(label = cat1))

# this is the observation number I want to display
table(df$cat1)

# but I get this error:
Error: geom_text requires the following missing aesthetics: y
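So I have two questions:
- How do I put the total number of observations for each cat1, as an "N=" label, on top of each stacked bar?
- What exactly is the y of the barplot in my code (aes(x=...))? I have x but no y, yet the plot seems to work.
Thanks!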
You can try:
# hard-coded label data: one row per bar, placed just above the 100% line
temp <- data.frame(x = c("a", "b", "c"), y = c(1.02, 1.02, 1.02), z = c(51, 101, 348))
ggplot(df, aes(x = cat1)) +
  geom_bar(aes(fill = cat2),
           position = "fill", color = "black") +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Percentage") +
  # draw the totals from the separate label data frame, which supplies its own y
  geom_text(data = temp, aes(x = x, y = y, label = as.factor(z)))
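On the second question, here is a small sketch of what seems to be going on, assuming the default statistics are in play: geom_bar() uses stat = "count", which tallies the rows for each x (and fill group) and supplies the result as y, while geom_text() uses stat = "identity" and therefore needs a y in its data. The snippet rebuilds the sample data and inspects the values ggplot2 computed for the bar layer with layer_data(); the names p and built are only illustrative.

```r
library(ggplot2)

# same sample data as in the question
set.seed(123)
cat1 <- sample(letters[1:3], 500, replace = TRUE, prob = c(0.1, 0.2, 0.65))
cat2 <- sample(letters[4:8], 500, replace = TRUE, prob = c(0.3, 0.4, 0.75, 0.5, 0.1))
df <- data.frame(cat1, cat2)

p <- ggplot(df, aes(x = cat1)) +
  geom_bar(aes(fill = cat2), position = "fill", color = "black")

# data computed for the first (and only) layer:
# `count` comes from stat_count, `ymin`/`ymax` from position = "fill"
built <- layer_data(p, 1)
head(built[, c("x", "fill", "count", "ymin", "ymax")])

# summing `count` within each bar gives the per-bar totals,
# which can be compared with table(df$cat1)
tapply(built$count, as.integer(built$x), sum)
```

Since the counts only exist inside the bar layer, a text layer has to get its y from somewhere else, which is why the separate temp data frame above works.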
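If you don't want to hard-code the summary labels, here is a slightly different approach (though still a bit of a hack) that uses dplyr
to compute the percentages and format the labels.
I also reversed your legend to match the order on the chart :)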
library(dplyr)
library(ggplot2)
df2 <- df %>%
  group_by(cat1, cat2) %>%
  summarise(n = n()) %>%   # the result is still grouped by cat1
  mutate(percent = n / sum(n),
         cumsum = cumsum(percent),
         # label only the last level ("h"), where cumsum reaches 1
         label = ifelse(cat2 == "h", paste0("N=", sum(n)), ""))
ggplot(df2, aes(x = cat1, y = percent, fill = cat2)) +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Percentage") +
  geom_bar(position = "fill", color = "black", stat = "identity") +
  geom_text(aes(y = cumsum, label = label), vjust = -1) +
  guides(fill = guide_legend(reverse = TRUE))
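A possible variation, sketched here rather than taken from either answer: build a small label data frame with dplyr::count() and hand it to geom_text(), so the totals are not hard-coded and nothing depends on the level "h" being present in every bar. The names totals and p2 and the label height of 1.03 are assumptions chosen for illustration.

```r
library(ggplot2)
library(dplyr)

# same sample data as in the question
set.seed(123)
cat1 <- sample(letters[1:3], 500, replace = TRUE, prob = c(0.1, 0.2, 0.65))
cat2 <- sample(letters[4:8], 500, replace = TRUE, prob = c(0.3, 0.4, 0.75, 0.5, 0.1))
df <- data.frame(cat1, cat2)

# one row per bar: the total count and a ready-made "N=" label
totals <- df %>%
  count(cat1, name = "n") %>%
  mutate(label = paste0("N=", n))

p2 <- ggplot(df, aes(x = cat1)) +
  geom_bar(aes(fill = cat2), position = "fill", color = "black") +
  # the text layer uses its own data and a fixed y just above 100%
  geom_text(data = totals, aes(x = cat1, y = 1.03, label = label),
            inherit.aes = FALSE) +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Percentage")

p2
```

Keeping the bars on the raw data means geom_bar() still does the counting, and only the labels come from the summary table.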