How to combine two different dplyr summaries in a single command
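I am trying to create a grouped summary that reports the number of records in each group and also shows the means of a series of variables.
I can only work out how to do this as two separate summaries, which I then join together. This works fine, but I am wondering whether there is a more elegant way to do it?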
library(dplyr)
dailyn <- daily %>% # this summarises n
  group_by(type) %>%
  summarise(n = n())
dailymeans <- daily %>% # this summarises the means
  group_by(type) %>%
  summarise_at(vars(starts_with("d.")), funs(mean(., na.rm = TRUE)))
dailysummary <- inner_join(dailyn, dailymeans) # this joins the two parts together
The data I am working with is a data frame like this:
daily<-data.frame(type=c("A","A","B","C","C","C"),
d.happy=c(1,5,3,7,2,4),
d.sad=c(5,3,6,3,1,2))
Along similar lines, you could try:
daily %>%
group_by(type) %>%
mutate(n = n()) %>%
mutate_at(vars(starts_with("d.")),funs(mean(., na.rm = TRUE))) %>%
unique
Which gives:
Source: local data frame [3 x 4]
Groups: type [3]
type d.happy d.sad n
<fctr> <dbl> <dbl> <int>
1 A 3.000000 4 2
2 B 3.000000 6 1
3 C 4.333333 2 3
You can do this in a single call by grouping, using mutate instead of summarize, and then using slice() to keep the first row of each type:
daily %>% group_by(type) %>%
mutate(n = n()) %>%
mutate_at(vars(starts_with("d.")),funs(mean(., na.rm = TRUE))) %>%
slice(1L)
EDIT: It may be clearer how this works in this modified example:
daily_summary <- daily %>% group_by(type) %>%
mutate(n = n()) %>%
mutate_at(vars(starts_with("d.")),funs("mean" = mean(., na.rm = TRUE)))
daily_summary
# Source: local data frame [6 x 6]
# Groups: type [3]
#
# # A tibble: 6 x 6
# type d.happy d.sad n d.happy_mean d.sad_mean
# <fctr> <dbl> <dbl> <int> <dbl> <dbl>
#1 A 1 5 2 3.000000 4
#2 A 5 3 2 3.000000 4
#3 B 3 6 1 3.000000 6
#4 C 7 3 3 4.333333 2
#5 C 2 1 3 4.333333 2
#6 C 4 2 3 4.333333 2
daily_summary %>%
slice(1L)
# Source: local data frame [3 x 6]
# Groups: type [3]
#
# # A tibble: 3 x 6
# type d.happy d.sad n d.happy_mean d.sad_mean
# <fctr> <dbl> <dbl> <int> <dbl> <dbl>
#1 A 1 5 2 3.000000 4
#2 B 3 6 1 3.000000 6
#3 C 7 3 3 4.333333 2
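For readers on a newer dplyr, here is a minimal sketch of the same grouped summary written as a single summarise() call. It assumes dplyr 1.0.0 or later, where across() is available; the result name daily_summary_across is only an illustrative name and does not come from the question or the answers above.

```r
library(dplyr)

# Same example data as in the question
daily <- data.frame(
  type    = c("A", "A", "B", "C", "C", "C"),
  d.happy = c(1, 5, 3, 7, 2, 4),
  d.sad   = c(5, 3, 6, 3, 1, 2)
)

# Assumption: dplyr >= 1.0.0, so across() can be used inside summarise().
# n() counts the rows per group; across() takes the mean of every column
# whose name starts with "d.".
daily_summary_across <- daily %>%
  group_by(type) %>%
  summarise(
    n = n(),
    across(starts_with("d."), ~ mean(.x, na.rm = TRUE))
  )

daily_summary_across
```

This should return one row per type with the columns type, n, d.happy and d.sad, so neither the join nor the unique / slice(1L) step is needed.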