How to connect the means with a line within a single category in ggplot

Here is some dummy code:

library(ggplot2)
library(dplyr)

diamonds |>
  dplyr::filter(color %in% c("D", "E", "F"), cut %in% c("Ideal", "Fair"), clarity %in% c("SI2", "VS2", "IF")) |>
  ggplot(aes(x = clarity, y = carat, color = color, shape = cut)) +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.05, position = position_dodge(0.7)) +
  stat_summary(fun = mean, geom = "point", size = 2, position = position_dodge(0.7))

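I want to connect the means with a line within each clarity category (i.e. connect each circle to the triangle of the same color; the red ones are marked as an example in the picture):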

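If I add a line layer with geom_line, I get the error geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic? That makes sense, because the points to be connected all sit in the same clarity group. I also tried group = interaction(), but that does not work either; I can only get lines between points that belong to different clarity groups.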

I think this is best done with a manual dodge.

library(ggplot2)
library(dplyr)

df <- diamonds %>% dplyr::filter(color %in% c("D","E", "F"), cut %in% c("Ideal","Fair"), clarity %in% c("SI2","VS2","IF")) 

## make a named vector for your manual dodge
## this of course needs adjustment depending on your actual data. can be automated
dodge_vec <- seq(-.25, .25, length.out = 6)
names(dodge_vec) <- unique(with(df, paste(cut, color, sep = "_")))

## some data alterations - assign dodge by subsetting with named vector
df <- df %>%
  mutate(cut_col = dodge_vec[paste(cut, color, sep = "_")]) 
## summarise for your lines 
df_line <- 
  df %>%
  group_by(clarity, cut, color, cut_col) %>%
  summarise(mean_carat = mean(carat))
#> `summarise()` has grouped output by 'clarity', 'cut', 'color'. You can override
#> using the `.groups` argument.

## need to pass your original x as an integer and add your new dodging column
ggplot(df, aes(x = as.integer(factor(clarity)) + cut_col, y =carat, color=color, shape=cut)) +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.05) +
  stat_summary(fun=mean, geom="point", size=2) +
  ## add lines with your new data, using an interaction variable
  geom_line(data = df_line, aes(y = mean_carat, group = interaction( as.integer(clarity), color))) +
  ## take labels from the factor levels so they line up with as.integer(factor(clarity))
  scale_x_continuous(breaks = 1:3, labels = levels(droplevels(df$clarity)))
#> Warning: Using shapes for an ordinal variable is not advised
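
The comment above says the dodge vector "can be automated". A minimal sketch of one way to do that in base R follows. make_dodge_vec is a hypothetical helper, not part of ggplot2 or dplyr, and the key format (the two values pasted with "_") is only assumed so that it matches the lookup used in the mutate() call above. The offsets are spread evenly over a total width of 0.5, like the hand-written seq(-.25, .25, ...).

```r
# Hypothetical helper (not part of ggplot2): build a named vector of dodge
# offsets, one per combination of the levels of two grouping variables.
make_dodge_vec <- function(f1, f2, width = 0.5, sep = "_") {
  # keep only the levels that actually occur in the data
  f1 <- droplevels(as.factor(f1))
  f2 <- droplevels(as.factor(f2))
  # expand.grid varies its first column fastest, so f2 changes fastest here
  combos <- expand.grid(b = levels(f2), a = levels(f1), stringsAsFactors = FALSE)
  keys <- paste(combos$a, combos$b, sep = sep)
  # a single combination needs no dodging at all
  offsets <- if (length(keys) == 1) 0 else
    seq(-width / 2, width / 2, length.out = length(keys))
  stats::setNames(offsets, keys)
}

# toy data standing in for the cut and color columns
cut   <- factor(c("Ideal", "Fair", "Ideal", "Fair"))
color <- factor(c("D", "D", "E", "F"))
dodge_vec <- make_dodge_vec(cut, color)
# look up the offset for every row, as in the mutate() call above
dodge_vec[paste(cut, color, sep = "_")]
```

With the filtered diamonds data this would be called as make_dodge_vec(df$cut, df$color). Because the order comes from the factor levels rather than from unique(), the offsets do not depend on the row order of the data.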

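Your question suggests that you are dealing with paired data, hence my suggestion in the comments. I would have liked to show an example, but the diamonds dataset has no paired data, so that is a bit difficult to fake.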

Created on 2022-05-31 by the reprex package (v2.0.1)