Wrong position of dodged error bars when aes(group = ...) but not aes(fill/shape = ...)

Question

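Plotting error bars with position = "dodge" has given me quite a headache recently. Curiously, dodging them by the aesthetics shape or fill (which should not apply to error bars) seems to work well. Dodging by the aesthetic group, however, places the bars at unexpected positions. I wonder whether this is a ggplot2 bug.

I like to place custom error bars behind bar charts or boxplots. Sometimes I give different elements of the plot their own colors. For that reason I often put aes() in the geoms or stats rather than in the ggplot() call.

Here is an example with "well placed" error bars: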

library(ggplot2)
library(dplyr)

ToothGrowth %>% 
  mutate(dose = factor(dose)) %>% 
  ggplot(aes(dose, len)) +
  stat_boxplot(aes(fill = supp), geom = "errorbar", position = "dodge") + 
  geom_boxplot(aes(fill = supp), position = "dodge", coef = 0)

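This produces the warning Warning: Ignoring unknown aesthetics: fill. Using aes(shape = supp) draws the same plot.

I would expect to get the same plot, minus the warning, by swapping fill/shape for group (aes(group = supp)). That does remove the warning, but the result is very unexpected: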

ToothGrowth %>% 
  mutate(dose = factor(dose)) %>% 
  ggplot(aes(dose, len)) +
  stat_boxplot(aes(group = supp), geom = "errorbar", position = "dodge") + 
  geom_boxplot(aes(fill = supp), position = "dodge", coef = 0)

Can someone explain this behavior? Shouldn't grouping by aes(group = ...) and by aes(fill = ...) behave the same way when dodging?

Answer 1

This code ignores the unknown aesthetic fill:

stat_boxplot(aes(fill = supp), geom = "errorbar", position = "dodge")

whereas this code takes the aesthetic group = supp into account and draws two separate error bars, one for OJ and one for VC:

stat_boxplot(aes(group = supp), geom = "errorbar", position = "dodge")

Full code

library(ggplot2)
library(dplyr)

ToothGrowth %>% 
  mutate(dose = factor(dose)) %>% 
  ggplot(aes(dose, len)) +
  stat_boxplot(aes(fill = supp), geom = "errorbar", position = "dodge") +
  geom_boxplot(aes(fill = supp), position = "dodge", coef = 0) 

Warning: Ignoring unknown aesthetics: fill



ToothGrowth %>% 
  mutate(dose = factor(dose)) %>% 
  ggplot(aes(dose, len)) +
  stat_boxplot(aes(group = supp), geom = "errorbar", position = "dodge") + 
  geom_boxplot(aes(fill = supp), position = "dodge", coef = 0)

Answer 2

From ?aes_group_order (emphasis added):

By default, the group is set to the interaction of all discrete variables in the plot. This often partitions the data correctly, but when it does not, or when no discrete variable is used in the plot, you will need to explicitly define the grouping structure, by mapping group to a variable that has a different value for each group.

With

ToothGrowth %>% 
  mutate(dose = factor(dose)) %>% 
  ggplot(aes(dose, len)) +
  stat_boxplot(aes(fill = supp), geom = "errorbar", position = "dodge")

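the group for the error bars is automatically set to the interaction of dose (which has been converted to a factor, i.e. a discrete variable) and supp (which is already a factor in the ToothGrowth dataset). In other words, each combination of dose c(0.5, 1, 2) and supplement c("OJ", "VC") is treated as a separate group when the boxplot summary statistics are computed. As a result, the error bars line up perfectly with the boxplot layer, even though fill is not a relevant aesthetic for geom_errorbar.

With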
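The six groups described above can be reproduced outside of ggplot2. The following is a minimal base-R sketch that only approximates the default grouping with interaction(); it is not ggplot2's internal code, and the names tg and groups are introduced here purely for illustration.

```r
# Base R only; ToothGrowth ships with R's datasets package.
tg <- transform(ToothGrowth, dose = factor(dose))

# Treat every dose/supp combination as its own group,
# which mirrors the default grouping described above.
groups <- interaction(tg$dose, tg$supp)

levels(groups)  # the dose.supp combinations
table(groups)   # number of observations falling into each group
```

Each level corresponds to one box (and one error bar) in the dodged plot.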

ToothGrowth %>% 
  mutate(dose = factor(dose)) %>% 
  ggplot(aes(dose, len)) +
  stat_boxplot(aes(group = supp), geom = "errorbar", position = "dodge")

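the group for the error bars is explicitly set to supp, and only supp. This overrides the default behavior, so there are only two groups (one for "OJ", one for "VC") instead of the six above. The error bar layer and the boxplot layer therefore no longer match.

You can set the group mapping explicitly to mimic the default behavior: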
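One way to check the difference in group counts is to inspect the computed layer data with layer_data(). This is only a sketch: the object names (tg, p_fill, p_group, n_fill, n_group) are made up for this example, and the first plot will still emit the "Ignoring unknown aesthetics: fill" warning.

```r
library(ggplot2)

tg <- transform(ToothGrowth, dose = factor(dose))

# Error bars grouped implicitly through the (otherwise unused) fill aesthetic.
p_fill <- ggplot(tg, aes(dose, len)) +
  stat_boxplot(aes(fill = supp), geom = "errorbar", position = "dodge")

# Error bars grouped explicitly by supp only.
p_group <- ggplot(tg, aes(dose, len)) +
  stat_boxplot(aes(group = supp), geom = "errorbar", position = "dodge")

# Count the distinct groups that the stat actually computed for each plot.
n_fill  <- length(unique(layer_data(p_fill, 1L)$group))
n_group <- length(unique(layer_data(p_group, 1L)$group))
n_fill
n_group
```

If the explanation above holds, n_fill should come out as six (one per dose/supp combination) and n_group as two (one per supplement).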

p1 <- ToothGrowth %>%
  mutate(dose = factor(dose)) %>%
  ggplot(aes(dose, len)) +
  stat_boxplot(aes(group = interaction(dose, supp)), geom = "errorbar", position = "dodge") +
  geom_boxplot(aes(fill = supp), position = "dodge", coef = 0)
p1
layer_data(p1, 1L) # view data associated with error bar layer
layer_data(p1, 2L) # view data associated with boxplot layer

p2 <- ToothGrowth %>%
  mutate(dose = factor(dose)) %>%
  ggplot(aes(dose, len)) +
  stat_boxplot(aes(group = interaction(supp, dose)), geom = "errorbar", position = "dodge")+
  geom_boxplot(aes(fill = supp), position = "dodge", coef = 0)
p2
layer_data(p2, 1L) # view data associated with error bar layer
layer_data(p2, 2L) # view data associated with boxplot layer

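Note: interaction(dose, supp) and interaction(supp, dose) produce plots that look the same, but if you want to compare the underlying data associated with each layer, interaction(dose, supp) generates the groups in the same order as the default, while interaction(supp, dose) does not.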
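To compare the group ordering of the two layers yourself, the group ids and dodged positions can be put side by side. The sketch below is one possible way to do that; tg, p, err, box and cmp are illustrative names, and it assumes that both layers expose group, xmin and xmax columns in their layer data.

```r
library(ggplot2)

tg <- transform(ToothGrowth, dose = factor(dose))

p <- ggplot(tg, aes(dose, len)) +
  stat_boxplot(aes(group = interaction(dose, supp)),
               geom = "errorbar", position = "dodge") +
  geom_boxplot(aes(fill = supp), position = "dodge", coef = 0)

# Keep only the group id and the horizontal extent of each element.
err <- layer_data(p, 1L)[, c("group", "xmin", "xmax")]  # error bar layer
box <- layer_data(p, 2L)[, c("group", "xmin", "xmax")]  # boxplot layer

# Join on group id: rows whose positions differ between the layers
# indicate that the two layers number their groups differently.
cmp <- merge(err, box, by = "group", suffixes = c(".err", ".box"))
cmp
```

Swapping in interaction(supp, dose) and rerunning gives the corresponding table for the other ordering.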

