Apply SummarizeGrowth on a grouped dataframe using group_by

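I am trying to apply the function growthcurver::SummarizeGrowth after grouping a data frame (df) with group_by. The data continue like this until Time = 96. This is just a sample showing what df looks like: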

Time Bacteria Isolate Experiment log10_OD600
0    A        A1      January    -1
0    B        A1      January    -1
0    C        A1      January    -1
0    A        A1      February   -0.95
0    B        A1      February   -0.98
0    C        A1      February   -0.88
1    A        A1      January    -0.86
1    B        A1      January    -0.88
1    C        A1      January    -0.85
2    A        A1      January    -0.80
2    B        A1      January    -0.77
2    C        A1      January    -0.65

So far I have tried the following code:

parameters <- df %>%
           group_by(Bacteria, Isolate, Experiment) %>%
           group_modify(~
                growthcurver::SummarizeGrowth(
                  data_t = .x$Time, 
                  data_n = .x$log10_OD600, 
                  blank = NA))

I also tried this:

f <- function(log10_OD600) SummarizeGrowth(df$Time, df$log10_OD600)

parameters<- df %>%
          group_by(Bacteria, Isolate, Experiment) %>%
          do(lapply(., f))

My goal is to get a data frame or a list with the logistic model parameters, something like:

Bacteria Isolate Experiment n0   r       t_mid t_gen auc_l auc_e
A        A1      January    0.33 1.8e-05 1.11  8.77  0.61  5.11
B        A1      January    0.35 1.8e-04 1.00  8.43  0.45  5.67
C        A1      January    0.25 1.6e-05 1.30  4.43  0.65  5.00

We could extract the 'vals' element from the list output and select those specific elements:

library(dplyr)
library(purrr)
df %>%
     group_by(Bacteria, Isolate, Experiment) %>% 
     group_modify(~ growthcurver::SummarizeGrowth(
                  data_t = .x$Time, 
                  data_n = .x$log10_OD600, 
                  blank = NA) %>% 
          pluck("vals") %>% 
          {.[c("n0", "r", "t_mid", "t_gen", "auc_l", "auc_e")]} %>% 
      as.data.frame, .keep = TRUE )  %>%
    ungroup

-output

# A tibble: 6 x 9
  Bacteria Isolate Experiment    n0     r t_mid t_gen auc_l auc_e
  <chr>    <chr>   <chr>      <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A        A1      February       0     0     0     0     0     0
2 A        A1      January        0     0     0     0     0     0
3 B        A1      February       0     0     0     0     0     0
4 B        A1      January        0     0     0     0     0     0
5 C        A1      February       0     0     0     0     0     0
6 C        A1      January        0     0     0     0     0     0
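
The reason for going through 'vals' is that group_modify expects the function to return a data frame for each group, whereas SummarizeGrowth returns a list-like fit object. The pattern can be sketched without growthcurver at all. In the snippet below, fake_fit and toy are hypothetical stand-ins invented for illustration: fake_fit only mimics the shape of the return value (a list with a "vals" element) and does no curve fitting.

```r
library(dplyr)

# Hypothetical stand-in for growthcurver::SummarizeGrowth(): returns a list
# with a "vals" element, mimicking the shape of the real return value.
# The numbers it computes are placeholders, not fitted parameters.
fake_fit <- function(data_t, data_n) {
  list(
    vals = list(
      n0    = min(data_n),
      r     = max(data_n) - min(data_n),
      t_mid = mean(data_t),
      note  = ""
    )
  )
}

# Small made-up data set with one grouping column
toy <- data.frame(
  Bacteria    = rep(c("A", "B"), each = 3),
  Time        = rep(0:2, times = 2),
  log10_OD600 = c(-1, -0.86, -0.80, -1, -0.88, -0.77)
)

params <- toy %>%
  group_by(Bacteria) %>%
  group_modify(~ {
    fit <- fake_fit(.x$Time, .x$log10_OD600)
    # Keep only the scalar parameters and return a one-row data frame,
    # which is what group_modify() needs from each group
    as.data.frame(fit$vals[c("n0", "r", "t_mid")])
  }) %>%
  ungroup()

print(params)
```

Each group contributes one row, and group_modify adds the grouping column back in front of the selected parameters.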

data

df <- structure(list(Time = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 2L, 
2L, 2L), Bacteria = c("A", "B", "C", "A", "B", "C", "A", "B", 
"C", "A", "B", "C"), Isolate = c("A1", "A1", "A1", "A1", "A1", 
"A1", "A1", "A1", "A1", "A1", "A1", "A1"), Experiment = c("January", 
"January", "January", "February", "February", "February", "January", 
"January", "January", "January", "January", "January"), log10_OD600 = c(-1, 
-1, -1, -0.95, -0.98, -0.88, -0.86, -0.88, -0.85, -0.8, -0.77, 
-0.65)), row.names = c(NA, -12L), class = "data.frame")