Apply SummarizeGrowth on a grouped dataframe using group_by
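I am trying to apply the function growthcurver::SummarizeGrowth after grouping a data frame (df) with group_by. The data continue like this until Time = 96. This is just a sample showing what df looks like: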
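Time | Bacteria | Isolate | Experiment | log10_OD600 |
---|---|---|---|---|
0 | A | A1 | January | -1 |
0 | B | A1 | January | -1 |
0 | C | A1 | January | -1 |
0 | A | A1 | February | -0.95 |
0 | B | A1 | February | -0.98 |
0 | C | A1 | February | -0.88 |
1 | A | A1 | January | -0.86 |
1 | B | A1 | January | -0.88 |
1 | C | A1 | January | -0.85 |
2 | A | A1 | January | -0.80 |
2 | B | A1 | January | -0.77 |
2 | C | A1 | January | -0.65 |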
So far, I have tried the following code:
    parameters <- df %>%
      group_by(Bacteria, Isolate, Experiment) %>%
      group_modify(~
        growthcurver::SummarizeGrowth(
          data_t = .x$Time,
          data_n = .x$log10_OD600,
          blank = NA))
I have also tried this:
    f <- function(log10_OD600) SummarizeGrowth(df$Time, df$log10_OD600)
    parameters <- df %>%
      group_by(Bacteria, Isolate, Experiment) %>%
      do(lapply(., f))
My goal is to get a data frame or list with the logistic model parameters, similar to:
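Bacteria | Isolate | Experiment | n0 | r | t_mid | t_gen | auc_l | auc_e |
---|---|---|---|---|---|---|---|---|
A | A1 | January | 0.33 | 1.8e-05 | 1.11 | 8.77 | 0.61 | 5.11 |
B | A1 | January | 0.35 | 1.8e-04 | 1.00 | 8.43 | 0.45 | 5.67 |
C | A1 | January | 0.25 | 1.6e-05 | 1.30 | 4.43 | 0.65 | 5.00 |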
We can extract 'vals' from the list output and select those specific elements:
    library(dplyr)
    library(purrr)
    df %>%
      group_by(Bacteria, Isolate, Experiment) %>%
      # fit each group, then reduce the fit object to a one-row data frame
      group_modify(~ growthcurver::SummarizeGrowth(
          data_t = .x$Time,
          data_n = .x$log10_OD600,
          blank = NA) %>%
        # 'vals' holds the fitted parameters
        pluck("vals") %>%
        # keep only the parameters of interest
        {.[c("n0", "r", "t_mid", "t_gen", "auc_l", "auc_e")]} %>%
        as.data.frame, .keep = TRUE) %>%
      ungroup
Output:
    # A tibble: 6 x 9
      Bacteria Isolate Experiment    n0     r t_mid t_gen auc_l auc_e
      <chr>    <chr>   <chr>      <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    1 A        A1      February       0     0     0     0     0     0
    2 A        A1      January        0     0     0     0     0     0
    3 B        A1      February       0     0     0     0     0     0
    4 B        A1      January        0     0     0     0     0     0
    5 C        A1      February       0     0     0     0     0     0
    6 C        A1      January        0     0     0     0     0     0
Data
    df <- structure(list(Time = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 2L,
    2L, 2L), Bacteria = c("A", "B", "C", "A", "B", "C", "A", "B",
    "C", "A", "B", "C"), Isolate = c("A1", "A1", "A1", "A1", "A1",
    "A1", "A1", "A1", "A1", "A1", "A1", "A1"), Experiment = c("January",
    "January", "January", "February", "February", "February", "January",
    "January", "January", "January", "January", "January"), log10_OD600 = c(-1,
    -1, -1, -0.95, -0.98, -0.88, -0.86, -0.88, -0.85, -0.8, -0.77,
    -0.65)), row.names = c(NA, -12L), class = "data.frame")
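
A note on why the `group_modify()` call in the question fails: `group_modify()` requires the function to return a data frame for each group, while `SummarizeGrowth()` returns a list-like fit object, which is why 'vals' has to be extracted and converted first. Below is a minimal sketch of that pattern. It assumes a hypothetical stand-in, `fit_stub()`, in place of `SummarizeGrowth()` so that it runs with only dplyr installed; the toy data and the stub's values are placeholders, not fitted logistic parameters.

```r
library(dplyr)

# Hypothetical stand-in for growthcurver::SummarizeGrowth(): it returns a list
# with a 'vals' element to mimic the shape of the real fit object.
# The values are simple summaries, NOT logistic-fit parameters.
fit_stub <- function(data_t, data_n) {
  list(vals = list(n0 = data_n[1], r = NA_real_, t_mid = mean(data_t)))
}

# Toy data (made up for illustration only)
toy <- data.frame(
  Time     = c(0, 1, 2, 0, 1, 2),
  Bacteria = c("A", "A", "A", "B", "B", "B"),
  value    = c(0.10, 0.14, 0.16, 0.10, 0.13, 0.17)
)

# Returning the raw list makes group_modify() stop with an error,
# because each group's result must be a data frame.
raw_attempt <- tryCatch(
  toy %>%
    group_by(Bacteria) %>%
    group_modify(~ fit_stub(.x$Time, .x$value)),
  error = function(e) "error"
)

# Converting the selected elements of 'vals' to a one-row data frame works.
params <- toy %>%
  group_by(Bacteria) %>%
  group_modify(~ as.data.frame(
    fit_stub(.x$Time, .x$value)$vals[c("n0", "r", "t_mid")]
  )) %>%
  ungroup()
```

Here `raw_attempt` captures the failure and `params` has one row per group. The second attempt in the question has a separate problem: `f` ignores its argument and always passes the whole `df`, so every group would be fitted on the same ungrouped data.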