R: Subsetting a dataframe by date range

I would like to organise a dataframe by date range.

Suppose today is 1 January 2017 and I have the following table:

Product Type    1/15/2017   2/27/2017   3/15/2017   9/1/2017    12/20/2017  1/10/2018
Apple           3           10          -           2           8           -   
Banana          5           50          100         10          10          2 
Beer            1           1           1           1           1           1   

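You can read the table above as "the shop manager has 3 apples with an expiry date of 1/15/2017, 10 other apples that last longer and have an expiry date of 2/27/2017", and so on.

The shop manager would like to know how many apples will expire in less than 1 month, in 1 to 3 months, in 3 to 12 months, and in more than 12 months.

How can I write this in R? The resulting table should look like this: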

Product Type     Less than 1mth    1-3mths       3-12 mths       More than 12mths 
Apple            3                 10            10              -   
Banana           5                 150           20              2 
Beer             1                 2             2               1   

Thank you very much for your help!

A solution using functions from tidyverse and lubridate. dt2 is the final output.

dt <- read.table(text = "'Product Type'    '1/15/2017'   '2/27/2017'   '3/15/2017'   '9/1/2017'    '12/20/2017'  '1/10/2018'
Apple           3           10          -           2           8           -   
                 Banana          5           50          100         10          10          2 
                 Beer            1           1           1           1           1           1",
                 header = TRUE, stringsAsFactors = FALSE, na.strings = "-")


library(tidyverse)
library(lubridate)

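dt2 <- dt %>%
  # Reshape to long format: one row per product and expiry date
  gather(Date, Value, -Product.Type) %>%
  # read.table() turned "1/15/2017" into "X1.15.2017"; drop the leading "X"
  mutate(Date = sub("X", "", Date, fixed = TRUE)) %>%
  mutate(Date = mdy(Date)) %>%
  # Days from the reference date (1 Jan 2017) to expiry
  mutate(Day_Diff = as.numeric(Date - mdy("1/1/2017"))) %>%
  # Bucket by approximate month lengths in days (12 months taken as 365 days)
  mutate(Group = case_when(
    Day_Diff <= 30  ~ "Less than 1mth",
    Day_Diff <= 90  ~ "1-3mths",
    Day_Diff <= 365 ~ "3-12 mths",
    TRUE            ~ "More than 12mths"
  )) %>%
  group_by(Product.Type, Group) %>%
  # Total stock per product and bucket; "-" entries were read in as NA
  summarise(Value = sum(Value, na.rm = TRUE)) %>%
  # Back to wide format, one column per bucket
  spread(Group, Value) %>%
  select(`Product Type` = Product.Type, `Less than 1mth`, `1-3mths`, 
         `3-12 mths`, `More than 12mths`)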
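
One caveat with fixed day counts such as 30 and 90 is that calendar months differ in length. If the buckets should follow calendar months instead, a minimal sketch (a variation, not part of the answer above) could compute the cutoffs with lubridate's %m+% operator. The names today, expiry, cutoffs, bucket_labels and bucket are made up for illustration, and the sketch assumes that a date exactly one month out already belongs to "1-3mths".

```r
library(lubridate)

today  <- mdy("1/1/2017")
expiry <- mdy(c("1/15/2017", "2/27/2017", "3/15/2017",
                "9/1/2017", "12/20/2017", "1/10/2018"))

# Bucket boundaries shifted by whole calendar months from the reference date:
# 2017-02-01, 2017-04-01 and 2018-01-01
cutoffs <- today %m+% months(c(1, 3, 12))

bucket_labels <- c("Less than 1mth", "1-3mths", "3-12 mths", "More than 12mths")

# findInterval() counts how many cutoffs each date has reached (0 to 3),
# so adding 1 gives the index of the matching label
bucket <- bucket_labels[findInterval(as.numeric(expiry), as.numeric(cutoffs)) + 1]
```

For the six dates in the question this gives the same grouping as the day-count version; the two only differ for dates close to a boundary.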

A data.table answer:

library(data.table)

dt <- data.table(type=c("apple", "banana", "beer"), 
                 `2017-01-15`=c(3,5,1),
                 `2017-02-27`=c(10,50,1),
                 `2017-03-15`=c(NA, 100, 1),
                 `2017-09-01`=c(2,10,1),
                 `2017-12-20`=c(8,10,1),
                 `2018-01-10`=c(NA, 2, 1))

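# Reshape to long format: one row per product and expiry date
dt2 <- melt(dt, id.vars = "type")
# Days from the reference date to expiry; melt() stores the dates as a factor
dt2[, days_until_expires := as.integer(as.IDate(as.character(variable)) - as.IDate("2017-01-01"))]
# Bucket the day counts; the columns end up named by interval, e.g. "(0,30]"
dt2[, days_until_expires_f := cut(days_until_expires, c(0, 30, 90, 360, Inf))]

# Total stock per product and bucket, then back to wide format
out1 <- dt2[, list(N = sum(value, na.rm = TRUE)), by = list(type, days_until_expires_f)]
out2 <- dcast(out1, type ~ days_until_expires_f, value.var = "N")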

out2 is your output.
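
If you would rather avoid extra packages, the same cut-and-sum idea can be sketched in base R. This is only an illustration under the same day thresholds as the data.table answer; stock, expiry, days_left, bucket and result are names I made up, and the labels argument is added so the columns match the table asked for in the question.

```r
# Rows are products, columns are expiry dates (NA where the question shows "-")
stock <- rbind(
  Apple  = c(3, 10, NA, 2, 8, NA),
  Banana = c(5, 50, 100, 10, 10, 2),
  Beer   = c(1, 1, 1, 1, 1, 1)
)
expiry <- as.Date(c("2017-01-15", "2017-02-27", "2017-03-15",
                    "2017-09-01", "2017-12-20", "2018-01-10"))

# Days from the reference date to each expiry date
days_left <- as.numeric(expiry - as.Date("2017-01-01"))

# Assign each date column to a bucket
bucket <- cut(days_left, breaks = c(0, 30, 90, 360, Inf),
              labels = c("Less than 1mth", "1-3mths", "3-12 mths", "More than 12mths"))

# For each bucket, sum the matching columns per product, ignoring NA
result <- sapply(split(seq_along(bucket), bucket),
                 function(cols) rowSums(stock[, cols, drop = FALSE], na.rm = TRUE))
```

result is a matrix with one row per product and one column per bucket. Buckets with no stock show 0 rather than "-".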

In future, you can make it easier for users to help you by providing a complete minimal working example (MWE). For guidance, see here.