Adding an average line per group, by column, in R

Hi, I have data like this:

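There are 38 columns in total. The Treatment column has 10 treatment types, and the Date column holds days 25-29. Sample code for the data (the sample has 2 treatment types, but the real data has 10):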

df <- structure(
    list(
      Christensenellaceae = c(
        0,
        0.009910731,
        0.010131195,
        0.009679938,
        0.01147601,
        0.010484508,
        0.008641566,
        0.010017172,
        0.010741488,
        0.1,
        0.2,
        0.3,
        0.4),
      Date=c(25,25,25,25,25,27,27,27,27,27,27,27,27),
      Treatment = c(
        "Original Sample",
        "Original Sample",
        "Original Sample",
        "Original Sample",
        "Original Sample",
        "Treatment 1",
        "Treatment 1",
        "Treatment 1",
        "Treatment 1",
        "Treatment 2",
        "Treatment 2",
        "Treatment 2",
        "Treatment 2")
    ),class = "data.frame",
    # 13 rows, matching the length of each column above
    row.names = c(NA,-13L)
  )

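What I want is to create 2 plots for each column: one for the original sample and one for all treatment types (1-10 in the real data, 1-2 in this example), with a mean line of the observations for each treatment type. The treatment panel should have 10 mean lines in total (2 here). Unfortunately I can't figure out how to add a line grouped by treatment type. This is my code, which draws a single line across all treatment types. How can I add lines grouped by treatment type?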

df_3 %>% 
  pivot_longer(-treatment) %>% 
  mutate(plot = ifelse(str_detect(treatment, "Original"), 
                       "Original sample", 
                       "Treatment"),
         treatment = str_extract(treatment, "\\d+$")) %>% 
  group_by(name) %>% 
  group_split() %>% 
  map(~.x %>% ggplot(aes(x = factor(treatment), y = value, color = factor(name))) +
        geom_point() +
        stat_summary(aes(y = value,group=1), fun.y=mean, colour="red", geom="line",group=1)
        +
        facet_wrap(~plot, scales = "free_x") +
        labs(x = "Treatment", y = "Value", color = "Taxa") +
        guides(x =  guide_axis(angle = 90))+
        theme_bw()) 

As you can see, there is only one mean line, and I need one per treatment type: 10 of them (2 here). Is there a way to edit my code so that it works? Thanks :)

I also tried this code, but it doesn't seem to work for me:

df %>%
  pivot_longer(-c(Treatment, Date), names_to = "taxon") %>%
  mutate(
    type = Treatment %>% str_detect("Original") %>% ifelse("Original", "Treatment"),
    treatment_nr = Treatment %>% str_extract("(?<=Treatment )[0-9]+")
  ) %>%
  ggplot(aes(Date, value, color = treatment_nr)) +
  geom_point() +
  stat_summary(geom = "point", fun.y = "mean", size = 3, shape = 24) +
  geom_line() +
  facet_grid(type ~ taxon, scales = "free_y")
#> Warning: `fun.y` is deprecated. Use `fun` instead.

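Your data is not well formed and doesn't match your original example code (e.g. Treatment rather than treatment). In any case, I'll generate some data here to illustrate a solution based on the data shown in your image.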

library(tidyverse)
set.seed(1)
df <-
  data.frame(
    Christensenellaceae = runif(105),
    treatment = rep(c("Original Sample_25", 
                      paste0("Treatment", 1:10, "_", 27), 
                      paste0("Treatment", 1:10, "_", 28)), 
                    each = 5)
  )

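Because you are drawing the mean as a line, it gets connected across the x-axis. I've done a fairly lazy job of it by using a segment and calculating the mean before plotting. Depending on how your ten treatments look, you can change the length of the mean line by changing avg_line_length.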
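To show the idea in isolation, here is a minimal sketch on made-up data. The names toy, means, half_width and p_toy are hypothetical and only for illustration. Unlike the full code in this answer, which keeps every row via mutate, this version collapses to one row per treatment with summarise, so each segment is drawn once.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: two treatments on a numeric x-axis
toy <- data.frame(
  treatment = rep(c(1, 2), each = 3),
  value     = c(1, 2, 3, 4, 6, 8)
)

half_width <- 0.35  # plays the role of avg_line_length

# One row per treatment: its mean and the x-range of its segment
means <- toy %>%
  group_by(treatment) %>%
  summarise(avg = mean(value), .groups = "drop") %>%
  mutate(xstart = treatment - half_width,
         xend   = treatment + half_width)

# Points from the raw data, short horizontal mean lines from the summary
p_toy <- ggplot(toy, aes(x = treatment, y = value)) +
  geom_point() +
  geom_segment(data = means,
               aes(x = xstart, xend = xend, y = avg, yend = avg),
               colour = "red")
```

Because every segment has its own start and end, nothing joins one treatment's mean to the next.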

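Because the segments have extra x-axis values (e.g. 0.65, 1.35), the x-axis would include those extra values by default. I've created labels and breaks to get around this, using the intermediate data lab_df. I left the label for the original sample blank. You can map color/linetype to also show the line in the legend as 'Mean'.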
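For the legend part, one possible approach (a sketch on made-up data, not taken from the code in this answer) is to map a constant string to linetype inside aes(), which makes ggplot2 create a legend entry for the segments. The names toy, means and p_legend are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data and precomputed per-treatment means
toy <- data.frame(
  treatment = rep(c(1, 2), each = 3),
  value     = c(1, 2, 3, 4, 6, 8)
)
means <- data.frame(treatment = c(1, 2), avg = c(2, 6))

p_legend <- ggplot(toy, aes(x = treatment, y = value)) +
  geom_point() +
  # linetype = "Mean" inside aes() is what produces the legend entry
  geom_segment(data = means,
               aes(x = treatment - 0.35, xend = treatment + 0.35,
                   y = avg, yend = avg, linetype = "Mean"),
               colour = "red") +
  # Explicit breaks keep the axis at the treatment values only
  scale_x_continuous(breaks = means$treatment) +
  scale_linetype_manual(name = NULL, values = c(Mean = "solid"))
```

The colour is set outside aes() so the line stays red, while the mapped linetype is what carries the 'Mean' label into the legend.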

avg_line_length <- 0.35

p <-
  df %>% 
    pivot_longer(-treatment) %>% 
    mutate(plot = ifelse(str_detect(treatment, "Original"), 
                         "Original sample", 
                         "Treatment"),
           treatment = as.numeric(str_extract(treatment, "\\d+")),
           treatment_label = ifelse(plot %in% "Original sample", "", treatment)) %>% 
    {. ->> lab_df} %>%
    group_by(treatment) %>%
    mutate(avg = mean(value),
           xstart = treatment - avg_line_length,
           xend = treatment + avg_line_length) %>%
    ungroup() %>%
    group_by(name) %>%
    group_split() %>% 
    map(~.x %>% ggplot() +
          geom_point(aes(x = treatment, y = value, color = name)) +
          geom_segment(aes(x = xstart, xend = xend, y = avg, yend = avg, color = name)) +
          scale_x_continuous(breaks = lab_df$treatment, labels = lab_df$treatment_label) +
          facet_wrap(~plot, scales = "free_x") +
          labs(x = "Treatment", y = "Value", color = "Taxa") +
          guides(x =  guide_axis(angle = 90))+
          theme_bw()) 

p
#> [[1]]

If you don't want the mean line for the original sample, just add an extra ifelse.

p2 <-
  df %>% 
    pivot_longer(-treatment) %>% 
    mutate(plot = ifelse(str_detect(treatment, "Original"), 
                         "Original sample", 
                         "Treatment"),
           treatment = as.numeric(str_extract(treatment, "\\d+")),
           treatment_label = ifelse(plot %in% "Original sample", "", treatment)) %>% 
    {. ->> lab_df} %>%
    group_by(treatment) %>%
    mutate(avg = ifelse(plot %in% "Original sample", NA, mean(value)),
           xstart = treatment - avg_line_length,
           xend = treatment + avg_line_length) %>%
    ungroup() %>%
    group_by(name) %>%
    group_split() %>% 
    map(~.x %>% ggplot() +
          geom_point(aes(x = treatment, y = value, color = factor(name))) +
          geom_segment(aes(x = xstart, xend = xend, y = avg, yend = avg), colour="red") +
          scale_x_continuous(breaks = lab_df$treatment, labels = lab_df$treatment_label) +
          facet_wrap(~plot, scales = "free_x") +
          labs(x = "Treatment", y = "Value", color = "Taxa") +
          guides(x =  guide_axis(angle = 90))+
          theme_bw()) 

p2
#> [[1]]
#> Warning: Removed 5 rows containing missing values (geom_segment).

It's a bit messy, but hopefully it solves your problem.