Add means to histograms by group in ggplot2
I am following this source to plot histograms by group in ggplot2.
The sample data looks like this:
set.seed(3)
x1 <- rnorm(500)
x2 <- rnorm(500, mean = 3)
x <- c(x1, x2)
group <- c(rep("G1", 500), rep("G2", 500))
df <- data.frame(x, group = group)
Code:
# install.packages("ggplot2")
library(ggplot2)
# Histogram by group in ggplot2
ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity")
I understand that adding a line such as:
+geom_vline(aes(xintercept=mean(group),color=group,fill=group), col = "red")
should get me what I am looking for, but all I get is a histogram with a single mean, not one mean per group:
Any suggestions?
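One possible reading of why the attempt above does not give per-group means (a sketch in plain R, not necessarily the exact cause in the asker's session): `group` is a character column, so `mean(group)` is not a number, and a call such as `mean(x)` over a whole column collapses to a single value no matter how many groups there are. The data frame is rebuilt here so the snippet runs on its own; `mean_of_labels`, `overall_mean` and `per_group` are names made up for this example.

```r
set.seed(3)
df <- data.frame(
  x = c(rnorm(500), rnorm(500, mean = 3)),
  group = rep(c("G1", "G2"), each = 500)
)

# `group` holds labels, not numbers, so its mean is NA (R also warns).
mean_of_labels <- suppressWarnings(mean(df$group))
is.na(mean_of_labels)

# mean(x) over the whole column is one number for all 1000 rows ...
overall_mean <- mean(df$x)
length(overall_mean)

# ... while a per-group mean needs an explicit grouping step.
per_group <- tapply(df$x, df$group, mean)
length(per_group)
```

Either way, the grouping has to happen before the value reaches `xintercept`.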
I would compute the mean inside the data frame:
library(ggplot2)
library(dplyr)
df %>%
  group_by(group) %>%
  mutate(mean_x = mean(x))
Output:
# A tibble: 1,000 × 3
# Groups: group [2]
x group mean_x
<dbl> <chr> <dbl>
1 -0.962 G1 0.0525
2 -0.293 G1 0.0525
3 0.259 G1 0.0525
4 -1.15 G1 0.0525
5 0.196 G1 0.0525
6 0.0301 G1 0.0525
7 0.0854 G1 0.0525
8 1.12 G1 0.0525
9 -1.22 G1 0.0525
10 1.27 G1 0.0525
# … with 990 more rows
So:
library(ggplot2)
library(dplyr)
df %>%
  group_by(group) %>%
  mutate(mean_x = mean(x)) %>%
  ggplot(aes(x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity") +
  geom_vline(aes(xintercept = mean_x), col = "red")
Output: [plot: the two histograms with a red vertical line at each group mean]
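In addition to the previous suggestion, you can also store the group means separately, i.e. two values instead of nrow = 1000 highly redundant ones:
library(ggplot2)
library(dplyr)
## a 'tidy' way (one of several valid ways) to compute group-wise means:
group_means <- df %>%
  group_by(group) %>%
  summarise(group_means = mean(x, na.rm = TRUE)) %>%
  pull(group_means)
## same histogram as in the question, plus one line per group mean:
ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity") +
  geom_vline(xintercept = group_means)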
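If the lines should also be coloured by group (which the `color=group` in the question seems to aim for), one option is to keep the means in a small data frame rather than a bare vector and hand that data frame to the line layer. A minimal sketch, assuming the `df` from the question (rebuilt here so the snippet is standalone); `means_df`, `mean_x`, the dashed line type and `bins = 30` are choices made up for this example:

```r
library(ggplot2)
library(dplyr)

set.seed(3)
df <- data.frame(
  x = c(rnorm(500), rnorm(500, mean = 3)),
  group = rep(c("G1", "G2"), each = 500)
)

# One row per group: the group label plus its mean.
means_df <- df %>%
  group_by(group) %>%
  summarise(mean_x = mean(x, na.rm = TRUE))

# The vline layer gets its own two-row data, so it draws two lines,
# each mapped to the same colour scale as the histograms.
p <- ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity", bins = 30) +
  geom_vline(data = means_df,
             aes(xintercept = mean_x, colour = group),
             linetype = "dashed")
p
```

Keeping the label next to the mean is what makes the colour mapping possible; a plain numeric vector has lost that association.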
A direct way without precomputing anything is:
ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity") +
  geom_vline(xintercept = tapply(df$x, df$group, mean), col = "red")
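`tapply()` returns one mean per level of `group`, named after the group, which is why its result can be passed straight to `xintercept`. A quick way to convince yourself, again assuming the question's `df` (rebuilt so the snippet runs standalone); `by_hand` is a name made up for the comparison:

```r
set.seed(3)
df <- data.frame(
  x = c(rnorm(500), rnorm(500, mean = 3)),
  group = rep(c("G1", "G2"), each = 500)
)

# One mean per level of `group`, named after the group.
group_means <- tapply(df$x, df$group, mean)
names(group_means)

# The same values computed by hand for each subset.
by_hand <- c(G1 = mean(df$x[df$group == "G1"]),
             G2 = mean(df$x[df$group == "G2"]))
all.equal(as.vector(group_means), unname(by_hand))
```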