R: Subsetting a dataframe by date-range
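I want to organize a data frame by date range.
Suppose today is January 1, 2017, and the table below shows:
three products (Apple, Banana and Beer)
six expiry dates (1/15/2017, 2/27/2017, 3/15/2017, 9/1/2017, 12/20/2017 and 1/10/2018)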
Product Type  1/15/2017  2/27/2017  3/15/2017  9/1/2017  12/20/2017  1/10/2018
Apple                 3         10          -         2           8          -
Banana                5         50        100        10          10          2
Beer                  1          1          1         1           1          1
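You can read the table above as "the shop manager has 3 apples with an expiry date of 1/15/2017, 10 other apples that can last longer and have an expiry date of 2/27/2017, etc."
The shop manager wants to know how many apples will expire in less than 1 month, in 1 to 3 months, in 3 to 12 months, and in more than 12 months.
How can I code this in R?
The resulting table would look like this: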
Product Type  Less than 1mth  1-3mths  3-12 mths  More than 12mths
Apple                      3       10         10                 -
Banana                     5      150         20                 2
Beer                       1        2          2                 1
Thank you very much for your help!
A solution using functions from tidyverse and lubridate. dt2 is the final output.
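# Read the question's table; "-" cells become NA
dt <- read.table(text = "'Product Type' '1/15/2017' '2/27/2017' '3/15/2017' '9/1/2017' '12/20/2017' '1/10/2018'
Apple 3 10 - 2 8 -
Banana 5 50 100 10 10 2
Beer 1 1 1 1 1 1",
                 header = TRUE, stringsAsFactors = FALSE, na.strings = "-")

library(tidyverse)
library(lubridate)

dt2 <- dt %>%
  # Long format: one row per product/expiry-date pair
  gather(Date, Value, -Product.Type) %>%
  # read.table() prefixes the date column names with "X"; strip it, then parse
  mutate(Date = sub("X", "", Date, fixed = TRUE)) %>%
  mutate(Date = mdy(Date)) %>%
  # Days from the reference date (1/1/2017) to each expiry date
  mutate(Day_Diff = Date - mdy("1/1/2017")) %>%
  # Day cut-offs approximate the month boundaries
  mutate(Group = case_when(
    Day_Diff <= 30  ~ "Less than 1mth",
    Day_Diff <= 90  ~ "1-3mths",
    Day_Diff <= 361 ~ "3-12 mths",
    TRUE            ~ "More than 12mths"
  )) %>%
  # Total quantity per product and bucket, ignoring NA cells
  group_by(Product.Type, Group) %>%
  summarise(Value = sum(Value, na.rm = TRUE)) %>%
  # Back to wide format, with columns in the requested order
  spread(Group, Value) %>%
  select(`Product Type` = Product.Type, `Less than 1mth`, `1-3mths`,
         `3-12 mths`, `More than 12mths`)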
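To see why each expiry date lands in its bucket, it can help to compute the day differences on their own. The following is a minimal base R sketch, not part of the answer above; the names today, expiry, day_diff and bucket are made up for illustration, and the cut-offs simply mirror the ones used in case_when().

```r
# Reference date and expiry dates, taken from the question's table
today  <- as.Date("2017-01-01")
expiry <- as.Date(c("1/15/2017", "2/27/2017", "3/15/2017",
                    "9/1/2017", "12/20/2017", "1/10/2018"),
                  format = "%m/%d/%Y")

# Whole days between the reference date and each expiry date
day_diff <- as.numeric(expiry - today)

# Same cut-offs as the case_when() call; intervals are right-closed,
# which matches the "<=" comparisons
bucket <- cut(day_diff,
              breaks = c(-Inf, 30, 90, 361, Inf),
              labels = c("Less than 1mth", "1-3mths",
                         "3-12 mths", "More than 12mths"))

# One row per expiry date with its day difference and bucket
print(data.frame(expiry, day_diff, bucket))
```

The differences come out as 14, 57, 73, 243, 353 and 374 days, so for this data any upper cut-off between 353 and 373 days would produce the same "3-12 mths" bucket.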
A data.table answer:
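library(data.table)

# Wide table with ISO-formatted expiry dates as column names; NA where the question shows "-"
dt <- data.table(type = c("apple", "banana", "beer"),
                 `2017-01-15` = c(3, 5, 1),
                 `2017-02-27` = c(10, 50, 1),
                 `2017-03-15` = c(NA, 100, 1),
                 `2017-09-01` = c(2, 10, 1),
                 `2017-12-20` = c(8, 10, 1),
                 `2018-01-10` = c(NA, 2, 1))

# Long format: one row per product/expiry-date pair
dt2 <- melt(dt, id.vars = "type")

# Days from the reference date (2017-01-01) to each expiry date, as plain integers
dt2[, days_until_expires := as.integer(as.IDate(as.character(variable)) - as.IDate("2017-01-01"))]

# Bucket the day counts: (0,30], (30,90], (90,360], (360,Inf]
dt2[, days_until_expires_f := cut(days_until_expires, c(0, 30, 90, 360, Inf))]

# Total quantity per product and bucket, then back to wide format
out1 <- dt2[, list(N = sum(value, na.rm = TRUE)), by = list(type, days_until_expires_f)]
out2 <- dcast(out1, type ~ days_until_expires_f, value.var = "N")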
out2 is your output.
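When cut() is called without labels, the bucket columns are named after the intervals, such as (0,30], rather than with the headings the question asks for. As a rough, dependency-free sketch of the same reshape, bucket and sum steps in base R, the version below passes the requested headings through the labels argument. The names wide, long and result are hypothetical, and the 0/30/90/360 day cut-offs are carried over from the answer above.

```r
# Wide input table from the question; "-" cells are entered as NA
wide <- data.frame(
  type = c("Apple", "Banana", "Beer"),
  `1/15/2017`  = c(3, 5, 1),
  `2/27/2017`  = c(10, 50, 1),
  `3/15/2017`  = c(NA, 100, 1),
  `9/1/2017`   = c(2, 10, 1),
  `12/20/2017` = c(8, 10, 1),
  `1/10/2018`  = c(NA, 2, 1),
  check.names = FALSE, stringsAsFactors = FALSE
)

today     <- as.Date("2017-01-01")
date_cols <- setdiff(names(wide), "type")

# Long format: one row per product/expiry-date pair (column-major order)
long <- data.frame(
  type   = rep(wide$type, times = length(date_cols)),
  expiry = rep(as.Date(date_cols, format = "%m/%d/%Y"), each = nrow(wide)),
  value  = unlist(wide[date_cols], use.names = FALSE)
)

# Bucket by days until expiry, labelling the intervals as in the question
long$bucket <- cut(as.numeric(long$expiry - today),
                   breaks = c(0, 30, 90, 360, Inf),
                   labels = c("Less than 1mth", "1-3mths",
                              "3-12 mths", "More than 12mths"))

# Sum quantities per product and bucket; NA quantities count as zero
result <- tapply(long$value, list(long$type, long$bucket), sum, na.rm = TRUE)
print(result)
```

The result is a numeric matrix with one row per product and one column per bucket. Cells that the question shows as "-" appear as 0 here because the NA quantities are dropped before summing.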
In the future, you can make it easier for others to help you by providing a complete minimal working example (MWE). For guidance, see here.