Group by a column and find its sum and count

Background: I have a dataset, df:

 Date                          Duration
 1/2/2020 5:00:00 PM            20
 1/2/2020 5:30:01 PM            30
 1/2/2020 6:00:00 PM            10
 1/5/2020 7:00:01 AM            5
 1/6/2020 8:00:00 AM            2
 1/6/2020 9:00:00 AM            8

Desired output:

 Date                 Total_Duration         Count

 1/2/2020             60                     3
 1/5/2020             5                      1
 1/6/2020             10                     2

Input:

 structure(list(Date = structure(1:6, .Label = c("1/2/2020 5:00:00 PM", 
 "1/2/2020 5:30:01 PM", "1/2/2020 6:00:00 PM", "1/5/2020 7:00:01 AM", 
 "1/6/2020 8:00:00 AM", "1/6/2020 9:00:00 AM"), class = "factor"), 
 Duration = c(20L, 30L, 10L, 5L, 2L, 8L)), class = "data.frame", row.names = c(NA, 
-6L))

What I have tried:

 library(dplyr)
 df %>% group_by(Date)  %>% add_tally() %>%
 summarize(Duration) 

Any guidance would be helpful.

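After converting 'Date' to a date-time with dmy_hms (assuming the format is DD/MM/YYYY HH:MM:SS, so "1/2/2020" is read as 1 February 2020), we can keep only the date part with as.Date, use it as the grouping variable, and get the sum of 'Duration' and the 'Count' with n().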

library(dplyr)
library(lubridate)
df %>%
    group_by(Date = as.Date(dmy_hms(Date))) %>% 
    summarise(Total_Duration = sum(Duration), Count = n())
# A tibble: 3 x 3
#  Date       Total_Duration Count
#  <date>              <int> <int>
#1 2020-02-01             60     3
#2 2020-05-01              5     1
#3 2020-06-01             10     2
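
The question does not say whether "1/2/2020" means 1 February or January 2. If the strings are actually month/day/year, the same pipeline can be sketched with mdy_hms in place of dmy_hms. This is only a sketch under that assumption; the data frame below is rebuilt from the question's input with character dates rather than factors, and `out` is just an illustrative name.

```r
library(dplyr)
library(lubridate)

# Rebuild the example data from the question (dates kept as character strings)
df <- data.frame(
  Date = c("1/2/2020 5:00:00 PM", "1/2/2020 5:30:01 PM", "1/2/2020 6:00:00 PM",
           "1/5/2020 7:00:01 AM", "1/6/2020 8:00:00 AM", "1/6/2020 9:00:00 AM"),
  Duration = c(20L, 30L, 10L, 5L, 2L, 8L),
  stringsAsFactors = FALSE
)

# Assumption: the strings are MM/DD/YYYY, so mdy_hms() replaces dmy_hms()
out <- df %>%
    group_by(Date = as.Date(mdy_hms(Date))) %>%  # drop the time, keep the day
    summarise(Total_Duration = sum(Duration),    # total duration per day
              Count = n())                       # number of rows per day
out
```

The totals and counts per group are the same either way for this data; only the interpretation of the dates changes (January 2, 5 and 6 instead of the first of February, May and June).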