How to combine two different dplyr summaries in a single command

Question

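I'm trying to create a grouped summary that reports the number of records in each group and also shows the means of a series of variables.

I can only work out how to do this as two separate summaries, which I then join together. This works fine, but I'm wondering whether there is a more elegant way to do it?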

dailyn <- daily %>% # this summarises n
  group_by(type) %>%
  summarise(n = n())

dailymeans <- daily %>% # this summarises the means
  group_by(type) %>%
  summarise_at(vars(starts_with("d.")), funs(mean(., na.rm = TRUE)))

dailysummary <- inner_join(dailyn, dailymeans) # this joins the two parts together

The data I'm working with is a data frame like this:

daily<-data.frame(type=c("A","A","B","C","C","C"),
                  d.happy=c(1,5,3,7,2,4),
                  d.sad=c(5,3,6,3,1,2))

Answer 1

Similarly, you could try:

daily %>% 
  group_by(type) %>% 
  mutate(n = n()) %>% 
  mutate_at(vars(starts_with("d.")),funs(mean(., na.rm = TRUE))) %>%
  unique

which gives:

Source: local data frame [3 x 4]
Groups: type [3]

    type  d.happy d.sad     n
  <fctr>    <dbl> <dbl> <int>
1      A 3.000000     4     2
2      B 3.000000     6     1
3      C 4.333333     2     3
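
This collapses to one row per group because, after both mutate steps, every row within a group holds identical values, so dropping duplicates leaves a single row. It would not collapse if the data had other columns that vary within a group. Here is a minimal sketch of the same idea, assuming dplyr 1.0.0 or later: it swaps mutate_at()/funs() for across() and unique for distinct(), and the name `collapsed` is only illustrative.

```r
library(dplyr)

daily <- data.frame(type = c("A", "A", "B", "C", "C", "C"),
                    d.happy = c(1, 5, 3, 7, 2, 4),
                    d.sad = c(5, 3, 6, 3, 1, 2))

collapsed <- daily %>%
  group_by(type) %>%
  mutate(n = n()) %>%                       # group size, repeated on every row
  mutate(across(starts_with("d."),          # overwrite each d.* column with its group mean
                ~ mean(.x, na.rm = TRUE))) %>%
  distinct() %>%                            # rows within a group are now identical
  ungroup()

collapsed
```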

Answer 2

You can do this in a single call by grouping, using mutate instead of summarize, and then using slice() to keep the first row of each type:

daily %>% group_by(type) %>% 
  mutate(n = n()) %>% 
  mutate_at(vars(starts_with("d.")),funs(mean(., na.rm = TRUE))) %>% 
  slice(1L)

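Edit: it may be clearer how this works in this modified example, where the means are written to new columns instead of overwriting the originals: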

daily_summary <- daily %>% group_by(type) %>% 
  mutate(n = n()) %>% 
  mutate_at(vars(starts_with("d.")),funs("mean" = mean(., na.rm = TRUE)))

daily_summary
# Source: local data frame [6 x 6]
# Groups: type [3]
# 
# # A tibble: 6 x 6
#    type d.happy d.sad     n d.happy_mean d.sad_mean
#  <fctr>   <dbl> <dbl> <int>        <dbl>      <dbl>
#1      A       1     5     2     3.000000          4
#2      A       5     3     2     3.000000          4
#3      B       3     6     1     3.000000          6
#4      C       7     3     3     4.333333          2
#5      C       2     1     3     4.333333          2
#6      C       4     2     3     4.333333          2

daily_summary %>% 
  slice(1L)

# Source: local data frame [3 x 6]
# Groups: type [3]
# 
# # A tibble: 3 x 6
#    type d.happy d.sad     n d.happy_mean d.sad_mean
#  <fctr>   <dbl> <dbl> <int>        <dbl>      <dbl>
#1      A       1     5     2     3.000000          4
#2      B       3     6     1     3.000000          6
#3      C       7     3     3     4.333333          2
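
For readers on a newer dplyr, the count and the means can also be computed in one summarise() call, without the mutate-then-collapse step. This is a minimal sketch rather than part of the original answers: it assumes dplyr 1.0.0 or later, where across() and the .groups argument are available, and the name `summary_one_call` is only illustrative.

```r
library(dplyr)

daily <- data.frame(type = c("A", "A", "B", "C", "C", "C"),
                    d.happy = c(1, 5, 3, 7, 2, 4),
                    d.sad = c(5, 3, 6, 3, 1, 2))

summary_one_call <- daily %>%
  group_by(type) %>%
  summarise(
    n = n(),                                              # records per group
    across(starts_with("d."), ~ mean(.x, na.rm = TRUE)),  # mean of each d.* column
    .groups = "drop"                                      # return an ungrouped tibble
  )

summary_one_call
```

The result has one row per type, with columns type, n, d.happy and d.sad, so no join, unique or slice() step is needed afterwards.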
