dplyr: Calculate percent change between summarized groups

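I'm trying to calculate the percent change between groups. My data.frame has one control group and several treatment groups. Because I have a lot of observations, I'm using dplyr. What I can't figure out is how to efficiently specify which groups to compare. Normally I would split this task into several steps.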

However, I'm wondering whether dplyr already offers a simpler, more direct way to do this?

Dummy example

library(dplyr)

set.seed(5)
dd <- data.frame(id = rep(c(1:4), 3),
                 val = c(rnorm(4) + 2,
                         rnorm(4) + 3,
                         rnorm(4) + 4),
                 grp = rep(c("control", "ch1", "ch2"), each = 4))

# Mean of val per group
dd %>%
  group_by(grp) %>%
  summarise(my_mean = mean(val))

Expected result, with the percent change between 'control' and each individual treatment:

# A tibble: 3 x 3
  grp     my_mean   perc_change
  <fct>     <dbl>   <dbl>
1 ch1        2.30    XX
2 ch2        5.00    YY
3 control    1.39    0

Are you looking for this?

library(dplyr)

dd %>% 
  group_by(grp) %>% 
  summarise(my_mean = mean(val))  %>%
  mutate(perc_change = (my_mean - my_mean[match('control', grp)]) / my_mean[match('control', grp)] * 100)
  # Alternatively, index with `==`:
  # mutate(perc_change = (my_mean - my_mean[grp == 'control']) / my_mean[grp == 'control'] * 100)
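
If repeating the control lookup feels noisy, one possible variation is to store the control mean in a temporary column first and drop it afterwards. This is only a sketch of the same formula, (group mean - control mean) / control mean * 100; the names `ctrl` and `res` are made up here for illustration, and it assumes exactly one group is labelled 'control'.

```r
library(dplyr)

set.seed(5)
dd <- data.frame(id = rep(c(1:4), 3),
                 val = c(rnorm(4) + 2,
                         rnorm(4) + 3,
                         rnorm(4) + 4),
                 grp = rep(c("control", "ch1", "ch2"), each = 4))

res <- dd %>%
  group_by(grp) %>%
  summarise(my_mean = mean(val)) %>%
  # ctrl is a hypothetical helper column: the single control mean,
  # recycled to every row of the summary
  mutate(ctrl = my_mean[grp == "control"],
         perc_change = (my_mean - ctrl) / ctrl * 100) %>%
  select(-ctrl)

res
```

The control row comes out as 0 by construction, since its mean is compared with itself.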

Is this what you want?

library(tidyverse)
set.seed(5)
dd <- data.frame(id = rep(c(1:4), 3),
                 val = c(rnorm(4) +2,
                         rnorm(4) +3,
                         rnorm(4) +4),
                 grp = rep(c("control", "ch1", "ch2"), each = 4))

dd %>% 
  group_by(grp) %>% 
  summarise(my_mean = mean(val)) %>%
  mutate(perc_change = scales::percent((my_mean - my_mean[grp == 'control'])/my_mean[grp == 'control']))
#> # A tibble: 3 x 3
#>   grp     my_mean perc_change
#>   <chr>     <dbl> <chr>      
#> 1 ch1        3.00 63%        
#> 2 ch2        4.07 121%       
#> 3 control    1.84 0%

Created on 2021-07-31 by the reprex package (v2.0.0)
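
One thing to keep in mind: `scales::percent()` returns a character vector, so a `perc_change` column built that way can no longer be sorted or used in arithmetic as a number. A possible variation, assuming you want both, is to keep the numeric ratio and add a separate label column for display. The column name `perc_label` and the object `res` are hypothetical names chosen for this sketch.

```r
library(dplyr)

set.seed(5)
dd <- data.frame(id = rep(c(1:4), 3),
                 val = c(rnorm(4) + 2,
                         rnorm(4) + 3,
                         rnorm(4) + 4),
                 grp = rep(c("control", "ch1", "ch2"), each = 4))

res <- dd %>%
  group_by(grp) %>%
  summarise(my_mean = mean(val)) %>%
  mutate(
    # numeric ratio relative to control (0.5 means +50%)
    perc_change = (my_mean - my_mean[grp == "control"]) / my_mean[grp == "control"],
    # character label for printing only; accuracy = 1 rounds to whole percent
    perc_label = scales::percent(perc_change, accuracy = 1)
  )

res
```

With this layout, `arrange(res, desc(perc_change))` orders by the actual value rather than by the text of the label.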