Add hline with population median for each facet

Question

I want to draw a horizontal line across the full width of each facet, at that facet's population median.

I tried the following code, which does not create a dummy summary table:

library(ggplot2)

dt = data.frame(gr = rep(1:2, each = 500),
            id = rep(1:5, 2, each = 100), 
            y = c(rnorm(500, mean = 0, sd = 1), rnorm(500, mean = 1, sd = 2)))


ggplot(dt, aes(x = as.factor(id), y = y)) +
  geom_boxplot() +
  facet_wrap(~ gr) +
  geom_hline(aes(yintercept = median(y), group = gr), colour = 'red')

However, the line is drawn at the median of the whole dataset rather than at each facet's own median.

A solution suggested in the past was to use

  geom_line(stat = "hline", yintercept = "median")

but it is now defunct (it produces the error "No stat called StatHline").

Another solution suggested

 geom_errorbar(aes(ymax=..y.., ymin=..y.., y = mean))

but it produces

Error in data.frame(y = function (x, ...)  : 
arguments imply differing number of rows: 0, 1000

Finally, the median can be plotted by creating a dummy table with the required statistics, but I would like to avoid that.

Answer 1

You can create an extra column in dt that holds the median for each facet.

library(dplyr) # With dplyr for example
dt <- dt %>% group_by(gr) %>%
  mutate(med = median(y))

# Rerun ggplot line with yintercept = med
ggplot(dt, aes(x = as.factor(id), y = y)) +
  geom_boxplot() +
  facet_wrap(~ gr) +
  geom_hline(aes(yintercept = med, group = gr), colour = 'red')
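The same per-facet column can also be computed without dplyr. The following is only a minimal sketch using base R's ave(); it regenerates the question's dt with an arbitrary seed (an assumption, so that the snippet runs on its own) and then checks that each value of gr ends up with a single median.

```r
# Recreate the example data from the question (the seed is arbitrary)
set.seed(1234)
dt <- data.frame(gr = rep(1:2, each = 500),
                 id = rep(1:5, 2, each = 100),
                 y = c(rnorm(500, mean = 0, sd = 1),
                       rnorm(500, mean = 1, sd = 2)))

# ave() returns a vector as long as y in which every element is
# replaced by the median of its own gr group
dt$med <- ave(dt$y, dt$gr, FUN = median)

# There should be exactly one distinct median per facet
facet_medians <- unique(dt[, c("gr", "med")])
print(facet_medians)
```

With med in place, the ggplot call above should work unchanged, since geom_hline only needs one constant yintercept value per panel.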

Answer 2

If you don't want to add a new column with the computed median, you can add a geom_smooth that uses quantile regression:

library(ggplot2)
library(quantreg)

set.seed(1234)

dt <- data.frame(gr = rep(1:2, each = 500),
                id = rep(1:5, 2, each = 100), 
                y = c(rnorm(500, mean = 0, sd = 1),
                      rnorm(500, mean = 1, sd = 2)))

ggplot(dt, aes(y = y)) +
  geom_boxplot(aes(x = as.factor(id))) +
  geom_smooth(aes(x = id), method = "rq", formula = y ~ 1, se = FALSE) +
  facet_wrap(~ gr)
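To see why this draws a median line, it may help to fit the same intercept-only model outside ggplot. The sketch below assumes that rq() is called with tau = 0.5 (its default, as far as I know, which geom_smooth would then inherit) and compares the fitted intercept with the sample median in each group. The names rq_med and smp_med are only illustrative.

```r
library(quantreg)

# Same example data as above, without the id column
set.seed(1234)
dt <- data.frame(gr = rep(1:2, each = 500),
                 y = c(rnorm(500, mean = 0, sd = 1),
                       rnorm(500, mean = 1, sd = 2)))

# Intercept-only median regression, fitted separately per facet group
rq_med <- sapply(split(dt, dt$gr),
                 function(d) coef(rq(y ~ 1, tau = 0.5, data = d))[[1]])

# Ordinary sample medians for comparison
smp_med <- tapply(dt$y, dt$gr, median)

# Side-by-side view of the two estimates
print(cbind(rq_med, smp_med))
```

With an even number of observations per group the two estimates need not be identical: median() averages the two middle values, while the regression fit may return either of them. In a plot the difference should be too small to notice.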
