ggplot: two histograms in one plot

Question

I created the plot below:

# Refer to columns by bare name inside aes(); ggplot2 looks them up in data_all
ggplot(data_all, aes(x = Speed, fill = Season)) +
  theme_bw() +
  geom_histogram(position = "identity", alpha = 0.2, binwidth = 0.1)

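As you can see, the amount of data available differs enormously between the seasons. Is there a way to compare only the shapes of the distributions, regardless of the total amount of data?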

Answer 1

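You can refer to other values computed by the stat function using a notation you may have seen before: a name wrapped in double dots, such as ..value.. . The ggplot2 documentation calls these "computed variables" and lists them in the "Computed variables" section of each stat's help page (for example, ?stat_bin); you will also see them called "special variables" or "computed aesthetics".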

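In this case, the computed variable that geom_histogram() maps to the y axis by default is ..count.. . When comparing distributions with very different total N, ..density.. is more useful, because each group's histogram is then scaled to have a total area of 1. You can use ..density.. by mapping it to the y aesthetic directly inside geom_histogram().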
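To make the difference between count and density concrete, here is a minimal sketch of the arithmetic. It uses base R's hist() as a stand-in for ggplot2's binning, and the data, the object names (small, large), and the bin width of 0.5 are all made up for illustration:

```r
# Hypothetical toy data: two samples of very different sizes
set.seed(1)
small <- rnorm(100)
large <- rnorm(10000)

# Shared bins of width 0.5 (an arbitrary choice for this sketch)
breaks   <- seq(-6, 6, by = 0.5)
binwidth <- diff(breaks)[1]

h_small <- hist(small, breaks = breaks, plot = FALSE)
h_large <- hist(large, breaks = breaks, plot = FALSE)

# density = count / (number of observations * bin width)
dens_small <- h_small$counts / (sum(h_small$counts) * binwidth)
all.equal(dens_small, h_small$density)

# The raw counts differ by a factor of 100 between the samples,
# but the bar areas of each density histogram add up to 1
area_small <- sum(h_small$density * binwidth)
area_large <- sum(h_large$density * binwidth)
c(area_small, area_large)
```

Because both histograms have the same total area, their heights are directly comparable even though one sample is 100 times larger than the other.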

Here is an example of two histograms of very different sizes (similar to the OP's question):

library(ggplot2)

set.seed(8675309)
df <- data.frame(
 x = c(rnorm(1000, -1, 0.5), rnorm(100000, 3, 1)),
 group = c(rep("A", 1000), rep("B", 100000))
)

ggplot(df, aes(x, fill=group)) + theme_classic() +
  geom_histogram(
    alpha=0.2, color='gray80',
    position="identity", bins=80)

And here is the same plot using ..density..:

ggplot(df, aes(x, fill=group)) + theme_classic() +
  geom_histogram(
    aes(y=..density..), alpha=0.2, color='gray80',
    position="identity", bins=80)
