Create percentage labels for two discrete variables in ggplot2
Here is some sample data:
gender <- c("male", "female", "male", "male", "female", "female", "male", "female", "female", "male")
outcome <- factor(c(0,0,0,1,1,1,0,1,1,1), levels = c(0,1), labels = c("responders", "non-responders"))
df <- data.frame(gender, outcome)
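I want to create a ggplot with percentage on the y-axis, gender on the x-axis, and outcome as the fill. It must be a stacked bar chart with percentage labels.
I tried this code:
ggplot(df, aes(x = gender, fill = outcome)) + geom_bar()
But this gives me counts on the y-axis. I want percentages on the y-axis instead. The stacked bar for females must show the percentage of responders and non-responders within the female group, not the percentage of responding or non-responding females out of the total population. For example, I would like to see something like 40% responders versus 60% non-responders among females, and likewise for males.
To make the plot ready for publication, I also need to add labels with these percentages inside the stacked bars.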
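As a quick check of which numbers the plot should show, the within-gender shares can be tabulated directly. This is a minimal sketch in base R using the sample data from the question; the names counts and pct are illustrative and not part of the original post.

```r
gender <- c("male", "female", "male", "male", "female", "female", "male", "female", "female", "male")
outcome <- factor(c(0,0,0,1,1,1,0,1,1,1), levels = c(0,1), labels = c("responders", "non-responders"))

# Cross-tabulate gender against outcome
counts <- table(gender, outcome)

# margin = 1 divides each row by its row total, i.e. shares within each gender
pct <- prop.table(counts, margin = 1)

round(100 * pct)
# female: 20% responders, 80% non-responders
# male:   60% responders, 40% non-responders
```

Each row sums to 100%, which is the "within gender" normalisation asked for, as opposed to dividing by the total of 10 observations.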
Here it is with labels:
library(ggplot2)
gender <- c("male", "female", "male", "male", "female", "female", "male", "female", "female", "male")
outcome <- factor(c(0,0,0,1,1,1,0,1,1,1), labels = c("responders", "non-responders"))
df <- data.frame(gender, outcome)
# The factor 2 is the number of gender groups. This rescaling is only valid
# because both genders have the same number of rows (5 each).
ggplot(df, aes(x = gender)) +
  geom_bar(aes(y = after_stat(2 * count / sum(count)), fill = outcome, group = outcome),
           stat = "count") +
  geom_label(aes(label = after_stat(scales::percent(2 * count / sum(count))), group = outcome),
             position = "fill", stat = "count", vjust = 0) +
  labs(y = "Percent", fill = "outcome") +
  scale_y_continuous(labels = scales::percent)
@Paul seems to have a better approach using geom_bar.
Edit
Here is a general solution:
library(ggplot2)
gender <- c("female", "female", "male", "male", "female", "female", "male", "female", "female", "male")
outcome <- factor(c(0,0,0,1,1,1,0,1,1,1), labels = c("responders", "non-responders"))
df <- data.frame(gender, outcome)
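gg <- ggplot() +
  geom_bar(aes(x = gender, fill = outcome), data = df, position = "fill")
# Pull the computed bar geometry out of the built plot
bars <- ggplot_build(gg)$data[[1]]
df2 <- data.frame(
  x = as.numeric(bars$x),           # bar position on the discrete x-axis
  y = (bars$ymin + bars$ymax) / 2,  # vertical centre of each segment
  pct = bars$ymax - bars$ymin       # segment height = share within the gender
)
# Label with the segment height; the built y column is the cumulative top of each segment
gg + geom_label(aes(x = x, y = y, label = scales::percent(pct)), data = df2)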
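As an alternative to reading values back from ggplot_build(), the labels can be computed inside the stat itself. The following is a sketch rather than part of the original answer: it assumes ggplot2 3.3.0 or later (for after_stat()) and uses base R's ave() to divide each count by the total for its bar.

```r
library(ggplot2)

gender <- c("male", "female", "male", "male", "female", "female", "male", "female", "female", "male")
outcome <- factor(c(0,0,0,1,1,1,0,1,1,1), levels = c(0,1), labels = c("responders", "non-responders"))
df <- data.frame(gender, outcome)

p <- ggplot(df, aes(x = gender, fill = outcome)) +
  geom_bar(position = "fill") +
  geom_text(
    # count / (sum of counts sharing the same x) = share within each gender
    aes(label = after_stat(scales::percent(count / ave(count, x, FUN = sum)))),
    stat = "count",
    position = position_fill(vjust = 0.5)  # centre each label in its segment
  ) +
  scale_y_continuous(labels = scales::percent)

p
```

Because the denominator is computed per bar, this does not rely on the groups being the same size or on knowing how many bars there are.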
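The trick to avoid changing the data is to use geom_bar(position = "fill").
To format the y-axis labels, you have several options. Here are two:
- use the scales package: scales::percent_format()
- use a custom function instead: replace the call above with function(x) paste0(x*100, "%")
Here it is: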
gender <- c("male", "female", "male", "male", "female", "female", "male", "female", "female", "male")
outcome <- factor(c(0,0,0,1,1,1,0,1,1,1), levels = c(0,1), labels = c("responders", "non-responders"))
df <- data.frame(gender, outcome)
library(ggplot2)
ggplot(data = df, aes(x = gender, fill = outcome)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = function(x) paste0(x * 100, "%"))
Created on 2021-08-19 by the reprex package (v2.0.0)
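To see that the two labelling options are interchangeable for typical axis breaks, both formatters can be applied to the same values outside of a plot. This is a small sketch; the names breaks, fmt_scales and fmt_custom are illustrative.

```r
library(scales)

# Typical y-axis breaks for a position = "fill" bar chart
breaks <- c(0, 0.25, 0.5, 0.75, 1)

# Option 1: formatter function generated by the scales package
fmt_scales <- percent_format()

# Option 2: hand-written formatter
fmt_custom <- function(x) paste0(x * 100, "%")

fmt_scales(breaks)
fmt_custom(breaks)

# Both produce the same character vector for these breaks
identical(fmt_scales(breaks), fmt_custom(breaks))
```

The two can diverge for values that are not round percentages, since the custom function prints whatever x * 100 evaluates to, while percent_format() takes an accuracy argument to control rounding.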
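I managed to find a working alternative to the answers posted by Paul and Stéphane (both of which are great too). The advantage of this approach is that it is generic, which saves time when creating many plots.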
library(dplyr)
library(ggplot2)
gender <- c("male", "female", "male", "male", "female", "female", "male", "female", "female", "male")
outcome <- factor(c(0,0,0,1,1,1,0,1,1,1), levels = c(0,1), labels = c("responders", "non-responders"))
df <- data.frame(gender, outcome)
df %>%
  group_by(gender, outcome) %>%
  # keep the grouping by gender so that pct is computed within each gender
  summarise(count = n(), .groups = "drop_last") %>%
  mutate(pct = round(count / sum(count), 2)) %>%
  ggplot(aes(x = factor(gender), y = pct, fill = factor(outcome))) +
  geom_bar(stat = "identity", width = 0.7) +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(x = "Sex", y = "Percentage", fill = "Outcome") +
  theme_minimal(base_size = 14) +
  geom_text(aes(label = paste0(pct * 100, "%")), vjust = -0.25, position = position_stack(0.5))
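Since the point of this approach is reuse across many plots, the pipeline can be wrapped in a function. The helper below is hypothetical (pct_stacked_bar is not from the original answer) and is a sketch assuming dplyr 1.0 or later and ggplot2 3.0 or later, where the {{ }} operator is available for passing column names.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: stacked percentage bar chart with labels,
# where percentages are computed within each level of `x`.
pct_stacked_bar <- function(data, x, fill) {
  data %>%
    count({{ x }}, {{ fill }}, name = "count") %>%
    group_by({{ x }}) %>%
    mutate(pct = count / sum(count)) %>%  # share within each x level
    ungroup() %>%
    ggplot(aes(x = {{ x }}, y = pct, fill = {{ fill }})) +
    geom_col(width = 0.7) +
    geom_text(aes(label = scales::percent(pct, accuracy = 1)),
              position = position_stack(vjust = 0.5)) +
    scale_y_continuous(labels = scales::percent_format())
}

gender <- c("male", "female", "male", "male", "female", "female", "male", "female", "female", "male")
outcome <- factor(c(0,0,0,1,1,1,0,1,1,1), levels = c(0,1), labels = c("responders", "non-responders"))
df <- data.frame(gender, outcome)

p <- pct_stacked_bar(df, gender, outcome)
p
```

Titles, themes and axis labels can then be added per plot with the usual + syntax, for example p + labs(x = "Sex", y = "Percentage", fill = "Outcome").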