Groupby 一列并找到它的总和和计数
Groupby a column and find its sum and count
背景:
我有一个数据集,df,
Date Duration
1/2/2020 5:00:00 PM 20
1/2/2020 5:30:01 PM 30
1/2/2020 6:00:00 PM 10
1/5/2020 7:00:01 AM 5
1/6/2020 8:00:00 AM 2
1/6/2020 9:00:00 AM 8
期望输出:
Date Total_Duration Count
1/2/2020 60 3
1/5/2020 5 1
1/6/2020 10 2
输入:
structure(list(Date = structure(1:6, .Label = c("1/2/2020 5:00:00 PM",
"1/2/2020 5:30:01 PM", "1/2/2020 6:00:00 PM", "1/5/2020 7:00:01 AM",
"1/6/2020 8:00:00 AM", "1/6/2020 9:00:00 AM"), class = "factor"),
Duration = c(20L, 30L, 10L, 5L, 2L, 8L)), class = "data.frame", row.names = c(NA,
-6L))
我试过的:
library(dplyr)
df %>% group_by(Date) %>% add_tally() %>%
summarize(Duration)
任何指导都会有所帮助。
用dmy_hms
转换为'DateTime'后,我们可以从'Date'中得到Date
唯一的部分(假设格式为DD/MM/YYYYY HH::MM:SS
),使用作为分组变量并获得 'Duration' 的 sum
和 'Count' 作为 n()
library(dplyr)
library(lubridate)
df %>%
group_by(Date = as.Date(dmy_hms(Date))) %>%
summarise(Total_Duration = sum(Duration), Count = n())
# A tibble: 3 x 3
# Date Total_Duration Count
# <date> <int> <int>
#1 2020-02-01 60 3
#2 2020-05-01 5 1
#3 2020-06-01 10 2
背景: 我有一个数据集,df,
Date Duration
1/2/2020 5:00:00 PM 20
1/2/2020 5:30:01 PM 30
1/2/2020 6:00:00 PM 10
1/5/2020 7:00:01 AM 5
1/6/2020 8:00:00 AM 2
1/6/2020 9:00:00 AM 8
期望输出:
Date Total_Duration Count
1/2/2020 60 3
1/5/2020 5 1
1/6/2020 10 2
输入:
structure(list(Date = structure(1:6, .Label = c("1/2/2020 5:00:00 PM",
"1/2/2020 5:30:01 PM", "1/2/2020 6:00:00 PM", "1/5/2020 7:00:01 AM",
"1/6/2020 8:00:00 AM", "1/6/2020 9:00:00 AM"), class = "factor"),
Duration = c(20L, 30L, 10L, 5L, 2L, 8L)), class = "data.frame", row.names = c(NA,
-6L))
我试过的:
library(dplyr)
df %>% group_by(Date) %>% add_tally() %>%
summarize(Duration)
任何指导都会有所帮助。
用dmy_hms
转换为'DateTime'后,我们可以从'Date'中得到Date
唯一的部分(假设格式为DD/MM/YYYYY HH::MM:SS
),使用作为分组变量并获得 'Duration' 的 sum
和 'Count' 作为 n()
library(dplyr)
library(lubridate)
df %>%
group_by(Date = as.Date(dmy_hms(Date))) %>%
summarise(Total_Duration = sum(Duration), Count = n())
# A tibble: 3 x 3
# Date Total_Duration Count
# <date> <int> <int>
#1 2020-02-01 60 3
#2 2020-05-01 5 1
#3 2020-06-01 10 2