Issue when passing variable with dollar sign notation ($) to aes() in combination with facet_grid() or facet_wrap()

Question

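I'm currently doing some analysis in ggplot2 for a project and by chance stumbled across some (for me) weird behavior that I cannot explain. When I write aes(x = cyl, ...) the plot looks different from the one I get when I pass the same variable using aes(x = mtcars$cyl, ...). When I remove facet_grid(am ~ .), both graphs are the same again. The code below is modeled after the code in my project that produces the same behavior: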

library(dplyr)
library(ggplot2)

data = mtcars

test.data = data %>%
  select(-hp)

ggplot(test.data, aes(x = test.data$cyl, y = mpg)) +
  geom_point() + 
  facet_grid(am ~ .) +
  labs(title="graph 1 - dollar sign notation")

ggplot(test.data, aes(x = cyl, y = mpg)) +
  geom_point()+ 
  facet_grid(am ~ .) +
  labs(title="graph 2 - no dollar sign notation")

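Here is the picture of graph 1: [image: "graph 1 - dollar sign notation"]

Here is the picture of graph 2: [image: "graph 2 - no dollar sign notation"]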

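I found that I can work around this by using aes_string instead of aes and passing the variable names as strings, but I would like to understand why ggplot behaves this way. The issue also occurs in similar attempts with facet_wrap.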

Answer 1

tl;dr

Never use [ or $ inside aes().
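If the reason for reaching for $ is that the column name is only known as a string (the case the question handles with aes_string), a minimal sketch of an alternative is the .data pronoun. This assumes a ggplot2 version that supports .data inside aes() (3.0.0 or later, as far as I know); x_var and p are names made up for this illustration.

```r
library(ggplot2)

# Hypothetical variable holding the column name as a string
x_var <- "cyl"

# .data[[x_var]] is looked up in the layer's own data, row by row,
# so it stays aligned with y and with the faceting variable
p <- ggplot(mtcars, aes(x = .data[[x_var]], y = mpg)) +
  geom_point() +
  facet_grid(am ~ .)
```

When the column name is fixed, the bare name (aes(x = cyl, y = mpg)) remains the simplest option.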

Consider this illustrative example, where the faceting variable f is purposely in a non-obvious order with respect to x:

d <- data.frame(x=1:10, f=rev(letters[gl(2,5)]))
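As a quick check of what "non-obvious order" means here, this base R sketch (no ggplot2 needed) groups x by f. The name x_by_f is introduced only for illustration.

```r
d <- data.frame(x = 1:10, f = rev(letters[gl(2, 5)]))

# Group the x values by the faceting variable
x_by_f <- split(d$x, d$f)

x_by_f$a  # 6 7 8 9 10: level "a" holds the LAST five rows
x_by_f$b  # 1 2 3 4 5:  level "b" holds the FIRST five rows
```

So the first level of f, which becomes the first panel, corresponds to rows 6 to 10 of d, not rows 1 to 5.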

Now contrast what happens with these two plots,

p1 <- ggplot(d) +
  facet_grid(.~f, labeller = label_both) +
  geom_text(aes(x, y=0, label=x, colour=f)) +
  ggtitle("good mapping") 

p2 <- ggplot(d) +
  facet_grid(.~f, labeller = label_both) +
  geom_text(aes(d$x, y=0, label=x, colour=f)) +
  ggtitle("$ corruption")

We can get a better idea of what's happening by looking at the data.frame that ggplot2 creates internally for each panel,

 ggplot_build(p1)[["data"]][[1]][,c("x","PANEL")]

    x PANEL
1   6     1
2   7     1
3   8     1
4   9     1
5  10     1
6   1     2
7   2     2
8   3     2
9   4     2
10  5     2

 ggplot_build(p2)[["data"]][[1]][,c("x", "PANEL")]

    x PANEL
1   1     1
2   2     1
3   3     1
4   4     1
5   5     1
6   6     2
7   7     2
8   8     2
9   9     2
10 10     2

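The second plot has the wrong mapping, because when ggplot creates the data.frame for each panel, it picks the x values in the "wrong" order.

This happens because using $ breaks the link between the variables being mapped: ggplot has to treat d$x as an independent vector which, as far as it knows, could come from any arbitrary, disconnected source. Since the data.frame in this example is not ordered by the factor f, the subset data.frames used internally for each panel assume the wrong order.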
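The mechanism can be imitated in base R. The sketch below is a simplified model, not ggplot2's actual internals: it assumes the per-panel step amounts to reordering rows by f and then evaluating each aesthetic expression against the reordered data. The names panel_data, good and bad are hypothetical.

```r
d <- data.frame(x = 1:10, f = rev(letters[gl(2, 5)]))

# Stand-in for the per-panel step: rows sorted by the faceting variable
panel_data <- d[order(d$f), ]

# Bare name: resolved inside the reordered data, so each x stays with its row
good <- eval(quote(x), panel_data)

# d$x: `d` is not a column of panel_data, so it resolves to the original,
# unsorted data frame in the calling environment
bad <- eval(quote(d$x), panel_data)

good  # 6 7 8 9 10 1 2 3 4 5
bad   # 1 2 3 4 5 6 7 8 9 10
```

The two vectors reproduce the x columns shown for p1 and p2 above: the bare name follows the reordering, while d$x ignores it.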

Tags: r, ggplot2, r-faq