如何对多个分组变量使用 stat_compare_means()

Question

我对编程还比较陌生，所以请多多包涵。我正在分析一些生物学数据，其中老鼠的体重已经测量了几周。小鼠被分配到 3 个治疗组：A、B 和 C。A 是参考组。我想要的是显示每周体重统计比较的图表，最好为每组使用不同的显着性符号。从数据中可以清楚地看出，B 组的体重在 12 周后恢复正常，而 C 组的体重仍然很高。我希望能够在图上看到统计变化。希望这是有道理的。出于某种原因，我就是想不通。这是我的代码，以及情节的样子

BW %>% 
  gather(Week, Weight, "0":"24") %>% 
  group_by(Group, Week) %>% 
  ggline(x = "Week", y = "Weight", add = "mean_se", color = "Group") +
  stat_compare_means(aes(group = Group), label = "p.signif", method = "anova")

Answer 1

library(tidyverse)
library(ggpubr)
library(broom)

# create example data
set.seed(1337)

base_values <- tribble(
  ~Group, ~Week, ~base_value,
  "A", 0, 20,
  "A", 12, 30,
  "A", 16, 32,
  "B", 0, 21,
  "B", 12, 35,
  "B", 16, 32,
  "C", 0, 20,
  "C", 12, 38,
  "C", 16, 40
)

data <-
  tibble(Group = LETTERS[1:3]) %>%
  expand_grid(Week = c(0, 12, 16)) %>%
  left_join(base_values) %>%
  mutate(Weight = base_value %>% map(~ runif(n = 10, min = 0.8, max = 1.2) * .x)) %>%
  select(-base_value) %>%
  unnest(Weight)
#> Joining, by = c("Group", "Week")
data
#> # A tibble: 90 x 3
#>    Group  Week Weight
#>    <chr> <dbl>  <dbl>
#>  1 A         0   20.6
#>  2 A         0   20.5
#>  3 A         0   16.6
#>  4 A         0   19.6
#>  5 A         0   19.0
#>  6 A         0   18.7
#>  7 A         0   23.6
#>  8 A         0   18.2
#>  9 A         0   18.0
#> 10 A         0   17.2
#> # … with 80 more rows

tests <-
  expand_grid(
    Group = c("B", "C"),
    Week = c(0, 12, 16)
  ) %>%
  mutate(
    test = list(Group, Week) %>% pmap(~ {
      data2 <- data %>%
        filter(Week == .y) %>%
        pivot_wider(names_from = Group, values_from = Weight)
        wilcox.test(data2[["A"]][[1]], data2[[.x]][[1]]) %>% tidy()
    })
  ) %>%
  unnest(test) %>%
  mutate(
    label = case_when(p.value < 0.01 ~ "**", p.value < 0.05 ~ "*", TRUE ~ "n.s.")
  ) %>%
  left_join(
    # position of labels
    data %>%
      group_by(Group, Week) %>%
      summarise(Weight = mean(Weight) - 30)
  )
#> Warning: Values are not uniquely identified; output will contain list-cols.
#> * Use `values_fn = list` to suppress this warning.
#> * Use `values_fn = length` to identify where the duplicates arise
#> * Use `values_fn = {summary_fun}` to summarise duplicates

#> Warning: Values are not uniquely identified; output will contain list-cols.
#> * Use `values_fn = list` to suppress this warning.
#> * Use `values_fn = length` to identify where the duplicates arise
#> * Use `values_fn = {summary_fun}` to summarise duplicates

#> Warning: Values are not uniquely identified; output will contain list-cols.
#> * Use `values_fn = list` to suppress this warning.
#> * Use `values_fn = length` to identify where the duplicates arise
#> * Use `values_fn = {summary_fun}` to summarise duplicates

#> Warning: Values are not uniquely identified; output will contain list-cols.
#> * Use `values_fn = list` to suppress this warning.
#> * Use `values_fn = length` to identify where the duplicates arise
#> * Use `values_fn = {summary_fun}` to summarise duplicates

#> Warning: Values are not uniquely identified; output will contain list-cols.
#> * Use `values_fn = list` to suppress this warning.
#> * Use `values_fn = length` to identify where the duplicates arise
#> * Use `values_fn = {summary_fun}` to summarise duplicates

#> Warning: Values are not uniquely identified; output will contain list-cols.
#> * Use `values_fn = list` to suppress this warning.
#> * Use `values_fn = length` to identify where the duplicates arise
#> * Use `values_fn = {summary_fun}` to summarise duplicates
#> `summarise()` has grouped output by 'Group'. You can override using the `.groups` argument.
#> Joining, by = c("Group", "Week")
tests
#> # A tibble: 6 x 8
#>   Group  Week statistic p.value method                  alternative label Weight
#>   <chr> <dbl>     <dbl>   <dbl> <chr>                   <chr>       <chr>  <dbl>
#> 1 B         0        27  0.0892 Wilcoxon rank sum exac… two.sided   n.s.   -8.48
#> 2 B        12        38  0.393  Wilcoxon rank sum exac… two.sided   n.s.    3.93
#> 3 B        16        63  0.353  Wilcoxon rank sum exac… two.sided   n.s.    2.86
#> 4 C         0        42  0.579  Wilcoxon rank sum exac… two.sided   n.s.   -9.58
#> 5 C        12        22  0.0355 Wilcoxon rank sum exac… two.sided   *       6.58
#> 6 C        16        17  0.0115 Wilcoxon rank sum exac… two.sided   *      11.3

data %>%
  ggline(x = "Week", y = "Weight", add = "mean_se", color = "Group") +
  geom_text(
    data = tests %>% mutate(Week = Week %>% factor()),
    mapping = aes(label = label, color = Group, y = 45),
    size = 10
  )

^{由 reprex package (v2.0.1)}

于 2021 年 12 月 1 日创建

如何对多个分组变量使用 stat_compare_means()

How to use stat_compare_means() on multiple grouping variables

r

ggplot2

ggpubr