Is there a way to get position_jitterdodge in ggplot2 to sort by two categorical variables when using two data sets?

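I am trying to make a jittered box plot from a dataset with one continuous variable and two categorical variables. I have split the data into two data frames because I don't want data frame 2 to contribute to the box plots, and because I want to show its points in a different colour. I use position_jitterdodge() so that the points sit on top of the box that corresponds to their category of variable "b", but it doesn't seem to work. For example, all the red points should be centred on the purple box plots, but they appear to be spread across the whole plot. I don't have enough reputation to include an image, but I have posted this question on the R community page with an image, in case it helps: https://community.rstudio.com/t/position-jitterdodge-doesnt-plot-points-by-categorical-variable/137199 . If anyone knows how to solve this, I would be very grateful.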

#Jittered boxplot reprex

#Load packages
library(reprex)
#> Warning: package 'reprex' was built under R version 4.0.5
library(ggplot2)

#Create dataframes

df.1 <- tibble::tribble(
        ~a, ~b, ~c,
        1, "x", "n",
        2, "x", "n",
        2, "x", "m",
        3, "x", "m",
        3, "y", "n",
        1, "y", "n",
        1, "y", "m",
        2, "y", "m",
        2, "z", "n",
        3, "z", "n",
        3, "z", "m",
        1, "z", "m"
)

df.2 <- tibble::tribble(
        ~a, ~b, ~c,
        2, "z", "n",
        3, "z", "n",
        1, "z", "n",
        3, "z", "n",
        2, "z", "m",
        1, "z", "m",
        2, "z", "m",
        3, "z", "m",
)

#Make a box plot with data from df.1, with data from df.2 overlaid as a jitterplot

plot.1 <- ggplot(data = df.1, aes(x=c, y=a, fill=b)) + 
  geom_boxplot()+
  geom_jitter(data = df.2, aes(x=c, y=a), color = "red", position=position_jitterdodge(), size=2) +
  geom_jitter(position=position_jitterdodge(), size=2)+
  scale_fill_manual("b", values=c("#FFD54F","#B2EBF2","#7B1FA2"))
plot(plot.1)

Created on 2022-05-18 by the reprex package (v2.0.1)

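I think the problem is that the factor levels of b in df.2 are incomplete: df.2 only contains "z", so in that layer position_jitterdodge() has a single group to dodge and places it in the middle of each x position rather than over the "z" box. I suggest combining your data into one data frame in long format (the format ggplot2 prefers).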

df.1$grp <- "1"
df.2$grp <- "2"
df <- rbind(df.1, df.2)

ggplot(data = df, aes(x=c, y=a, fill=b, color=grp)) + 
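  # Boxes are drawn from the df.1 rows only
  geom_boxplot(data = df[df$grp == "1", ]) +
  # group = b keeps the df.2 points in the same dodge slot as the "z" box
  geom_jitter(aes(group = b), position = position_jitterdodge(0.05)) +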
  scale_fill_manual("b", values=c("#FFD54F","#B2EBF2","#7B1FA2")) +
  scale_color_manual(guide = "none", values = c("1" = "black", "2" = "red"))
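
To illustrate why the incomplete levels matter, here is a minimal sketch of the arithmetic behind dodging. It assumes a dodge width of 0.75 and uses a hypothetical helper, dodge_offsets(), which is a simplified illustration and not part of the ggplot2 API. The data frames are rebuilt with data.frame() so the snippet runs on its own.

```r
# Rebuild the two data sets from the question (base R only)
df.1 <- data.frame(
  a = c(1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1),
  b = rep(c("x", "y", "z"), each = 4),
  c = rep(c("n", "n", "m", "m"), times = 3)
)
df.2 <- data.frame(
  a = c(2, 3, 1, 3, 2, 1, 2, 3),
  b = "z",
  c = rep(c("n", "m"), each = 4)
)

# Groups available to each layer when it is dodged on its own
groups_1 <- unique(df.1$b)  # three groups: x, y, z
groups_2 <- unique(df.2$b)  # one group: z

# Hypothetical helper (an assumption, not a ggplot2 function):
# horizontal offset of each of n groups around the x position
dodge_offsets <- function(n, width = 0.75) {
  width * ((seq_len(n) - 0.5) / n - 0.5)
}

offsets_1 <- dodge_offsets(length(groups_1))  # one offset per box
offsets_2 <- dodge_offsets(length(groups_2))  # single offset for the df.2 layer
print(offsets_1)
print(offsets_2)
```

With three groups the offsets spread to the left, centre and right of each x position, while a layer that only knows about one group gets a single offset of zero. That is consistent with the red points ending up in the middle of each x position instead of over the "z" box.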

Edit

Adding points to a grouped box plot without jitter. Note that several of the points overlap here.

ggplot(data = df, aes(x=c, y=a, fill=b, color=grp)) + 
  geom_boxplot(data = df[df$grp == "1", ], outlier.shape = NA) +
  geom_point(aes(group=b), position = position_dodge(0.75)) +
  scale_fill_manual("b", values=c("#FFD54F","#B2EBF2","#7B1FA2")) +
  scale_color_manual(guide = "none", values = c("1" = "black", "2" = "red"))
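
If you want to confirm the alignment numerically rather than by eye, one possible check is to compare the x positions that ggplot2 computes for each layer. This is only a sketch: it rebuilds the data with data.frame(), uses layer_data() from ggplot2, and the variable names (boxes, pts, box_centres, point_x, red_x) are made up for the illustration. It uses the non-jittered plot so the positions are deterministic.

```r
library(ggplot2)

# Rebuild the combined data frame from the answer
df.1 <- data.frame(
  a = c(1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1),
  b = rep(c("x", "y", "z"), each = 4),
  c = rep(c("n", "n", "m", "m"), times = 3),
  grp = "1"
)
df.2 <- data.frame(
  a = c(2, 3, 1, 3, 2, 1, 2, 3),
  b = "z",
  c = rep(c("n", "m"), each = 4),
  grp = "2"
)
df <- rbind(df.1, df.2)

p <- ggplot(df, aes(x = c, y = a, fill = b, color = grp)) +
  geom_boxplot(data = df[df$grp == "1", ], outlier.shape = NA) +
  geom_point(aes(group = b), position = position_dodge(0.75)) +
  scale_color_manual(guide = "none", values = c("1" = "black", "2" = "red"))

# Data as computed for drawing: layer 1 = boxes, layer 2 = points
boxes <- layer_data(p, 1)
pts <- layer_data(p, 2)

# Round to avoid spurious floating-point differences
box_centres <- sort(unique(round(as.numeric(boxes$x), 6)))
point_x <- sort(unique(round(as.numeric(pts$x), 6)))
red_x <- sort(unique(round(as.numeric(pts$x[pts$colour == "red"]), 6)))

print(box_centres)
print(point_x)
print(red_x)
```

If the dodging lines up, every value in point_x also appears in box_centres, and red_x contains only the centres of the "z" boxes.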