Combining multiple summarize calls in dplyr
Given the df
ww <- data.frame(
  GM = c("A", "A", "A", "A", "A", "A",
         "B", "B", "B", "B", "B", "B",
         "C", "C", "C", "C", "C", "C"),
  stanza = rep(c("Past", "Mid", "End"), 6),
  change = c(1, 1.1, 1.4, 1, 1.3, 1.5, 1, 1.2, 1.4,
             1.1, 1.2, 1.3, .9, 1.2, 1.3, .9, 1.3, 1.5))
I want to compute the mean of the "Past" values for each GM and divide every value in 'change' by that GM-specific mean. I can do this with two dplyr calls and a join, like so:
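library(dplyr)

# Mean of the "Past" rows within each GM
past <- ww %>%
  group_by(GM) %>%
  filter(stanza == "Past") %>%
  summarize(past.mean = mean(change))
# Attach the per-GM mean to every row, then divide
ww <- left_join(ww, past, by = "GM")
ww %>%
  group_by(GM, stanza) %>%
  summarize(pr.change = change/past.mean)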
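But there must be a way to do this in a single dplyr call.
No join is needed; compute it directly in a single pipe chain. Inside a grouped mutate(), change[stanza == "Past"] selects only the current GM's "Past" values: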
ww %>%
  group_by(GM) %>%
  mutate(pr.change = change / mean(change[stanza == "Past"])) %>%
  ungroup()
Output
GM stanza change pr.change
<chr> <chr> <dbl> <dbl>
1 A Past 1 1
2 A Mid 1.1 1.1
3 A End 1.4 1.4
4 A Past 1 1
5 A Mid 1.3 1.3
6 A End 1.5 1.5
7 B Past 1 0.952
8 B Mid 1.2 1.14
9 B End 1.4 1.33
10 B Past 1.1 1.05
11 B Mid 1.2 1.14
12 B End 1.3 1.24
13 C Past 0.9 1
14 C Mid 1.2 1.33
15 C End 1.3 1.44
16 C Past 0.9 1
17 C Mid 1.3 1.44
18 C End 1.5 1.67
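As a quick sanity check, the sketch below runs the two-step join version and the single-chain version on the same data and compares the resulting columns. The names `joined` and `piped` are made up for this illustration, and the data frame is rebuilt with `rep(..., each = 6)` only for brevity.

```r
library(dplyr)

ww <- data.frame(
  GM = rep(c("A", "B", "C"), each = 6),
  stanza = rep(c("Past", "Mid", "End"), 6),
  change = c(1, 1.1, 1.4, 1, 1.3, 1.5, 1, 1.2, 1.4,
             1.1, 1.2, 1.3, .9, 1.2, 1.3, .9, 1.3, 1.5)
)

# Two-step route: per-GM mean of the "Past" rows, joined back onto every row.
past <- ww %>%
  filter(stanza == "Past") %>%
  group_by(GM) %>%
  summarize(past.mean = mean(change))

joined <- ww %>%
  left_join(past, by = "GM") %>%
  mutate(pr.change = change / past.mean)

# One-step route: subset to "Past" inside the grouped mutate().
piped <- ww %>%
  group_by(GM) %>%
  mutate(pr.change = change / mean(change[stanza == "Past"])) %>%
  ungroup()

# Both routes should produce the same pr.change column, row for row.
all.equal(joined$pr.change, piped$pr.change)
```

Because `left_join()` keeps the row order of its left-hand table, the two columns can be compared position by position.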
Using base R
transform(ww, pr.change = change/ave(replace(change,
    stanza != 'Past', NA), GM, FUN = function(x) mean(x, na.rm = TRUE)))
Output
GM stanza change pr.change
1 A Past 1.0 1.000000
2 A Mid 1.1 1.100000
3 A End 1.4 1.400000
4 A Past 1.0 1.000000
5 A Mid 1.3 1.300000
6 A End 1.5 1.500000
7 B Past 1.0 0.952381
8 B Mid 1.2 1.142857
9 B End 1.4 1.333333
10 B Past 1.1 1.047619
11 B Mid 1.2 1.142857
12 B End 1.3 1.238095
13 C Past 0.9 1.000000
14 C Mid 1.2 1.333333
15 C End 1.3 1.444444
16 C Past 0.9 1.000000
17 C Mid 1.3 1.444444
18 C End 1.5 1.666667
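The base R one-liner packs three steps into one expression. The sketch below unrolls it, using the intermediate names `past_only` and `past_mean`, which are introduced here only for illustration.

```r
ww <- data.frame(
  GM = rep(c("A", "B", "C"), each = 6),
  stanza = rep(c("Past", "Mid", "End"), 6),
  change = c(1, 1.1, 1.4, 1, 1.3, 1.5, 1, 1.2, 1.4,
             1.1, 1.2, 1.3, .9, 1.2, 1.3, .9, 1.3, 1.5)
)

# Step 1: blank out every non-"Past" value so it cannot contribute to the mean.
past_only <- replace(ww$change, ww$stanza != "Past", NA)

# Step 2: ave() applies FUN within each GM and returns a vector as long as
# its input, so every row receives its own group's "Past" mean.
past_mean <- ave(past_only, ww$GM, FUN = function(x) mean(x, na.rm = TRUE))

# Step 3: divide each change by the mean of its group.
ww$pr.change <- ww$change / past_mean
```

Note that `FUN` has to be passed by name, since `ave()` treats unnamed arguments after the first as grouping variables.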
A data.table solution:
library(data.table)
setDT(ww)
ww[, pr.change := change / mean(change[stanza == "Past"]), by = GM]
GM stanza change pr.change
1: A Past 1.0 1.000000
2: A Mid 1.1 1.100000
3: A End 1.4 1.400000
4: A Past 1.0 1.000000
5: A Mid 1.3 1.300000
6: A End 1.5 1.500000
7: B Past 1.0 0.952381
8: B Mid 1.2 1.142857
9: B End 1.4 1.333333
10: B Past 1.1 1.047619
11: B Mid 1.2 1.142857
12: B End 1.3 1.238095
13: C Past 0.9 1.000000
14: C Mid 1.2 1.333333
15: C End 1.3 1.444444
16: C Past 0.9 1.000000
17: C Mid 1.3 1.444444
18: C End 1.5 1.666667
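One thing to keep in mind with the data.table version: `setDT()` converts `ww` in place and `:=` adds the column by reference, so the original object is modified. If that is not wanted, a minimal sketch is to work on a copy via `as.data.table()`. The name `dt` is an assumption made for this example.

```r
library(data.table)

ww <- data.frame(
  GM = rep(c("A", "B", "C"), each = 6),
  stanza = rep(c("Past", "Mid", "End"), 6),
  change = c(1, 1.1, 1.4, 1, 1.3, 1.5, 1, 1.2, 1.4,
             1.1, 1.2, 1.3, .9, 1.2, 1.3, .9, 1.3, 1.5)
)

# as.data.table() returns a copy, so ww stays an unmodified data.frame.
dt <- as.data.table(ww)

# Add the column by reference on the copy, grouped by GM.
dt[, pr.change := change / mean(change[stanza == "Past"]), by = GM]

# The new column exists only on the copy.
"pr.change" %in% names(dt)
"pr.change" %in% names(ww)
```

Appending `[]` to the `:=` call (for example `dt[, ...][]`) prints the result immediately, which is otherwise suppressed.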