Date difference for each customer and product in R

custid <- c(1,2,2,2) 

prod <- c("books", "highlighters", "books", "pens" )

qdate <- c(20130401,  20130403, 20130403, 20130404) 

tdate <- c(20130405,  20130804, 20130405, 20130405)

data <- data.frame(custid, prod, qdate, tdate)

# Convert the numeric yyyymmdd codes to Date objects
data$qdate <- as.Date(as.character(data$qdate), "%Y%m%d")
data$tdate <- as.Date(as.character(data$tdate), "%Y%m%d")

(data2 <- difftime(data$tdate, data$qdate, data$custid, units="days")) #works

data2 <- aggregate(cbind(data$tdate=format(date, '%Y-%m-%d'))~cbind(data$qdate=format(date, '%Y-%m-%d'))  + data$prod + data$custid, data, difftime(data$tdate, data$qdate, data$custid, units="days"))

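For the R code above, I tried to use aggregate() to produce the output shown below. difftime() gives the difference in days correctly, but the aggregate() call does not work and throws an error. Does anyone know how to fix this? Thanks.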

custid  prod            qdate       tdate       days_difference
1       books           20130401    20130405    4
2       highlighters    20130403    20130804    123
2       books           20130403    20130405    2
2       pens            20130404    20130405    1

You can make this much simpler by using lubridate:

library(lubridate)
custid <- c(1,2,2,2) 

prod <- c("books", "highlighters", "books", "pens" )

# ymd = year, month, day
qdate <- ymd(c(20130401,  20130403, 20130403, 20130404))

tdate <- ymd(c(20130405,  20130804, 20130405, 20130405))

data <- data.frame(custid, prod, qdate, tdate)
data$days_difference <- with(data, difftime(tdate, qdate, units="days"))
data
  custid         prod      qdate      tdate days_difference
1      1        books 2013-04-01 2013-04-05          4 days
2      2 highlighters 2013-04-03 2013-08-04        123 days
3      2        books 2013-04-03 2013-04-05          2 days
4      2         pens 2013-04-04 2013-04-05          1 days
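
If adding a package is not an option, the same column can be built in base R alone. The following is a minimal sketch that reuses the as.Date() conversion from the question; data_base is a hypothetical name chosen only to avoid overwriting data.

```r
# Base-R sketch: the same calculation without lubridate
custid <- c(1, 2, 2, 2)
prod   <- c("books", "highlighters", "books", "pens")
qdate  <- c(20130401, 20130403, 20130403, 20130404)
tdate  <- c(20130405, 20130804, 20130405, 20130405)

# data_base is a hypothetical name, not from the original post
data_base <- data.frame(custid, prod, qdate, tdate)

# Convert the numeric yyyymmdd codes to Date by going through character
data_base$qdate <- as.Date(as.character(data_base$qdate), format = "%Y%m%d")
data_base$tdate <- as.Date(as.character(data_base$tdate), format = "%Y%m%d")

# difftime() is vectorised, so a single call yields one value per row
data_base$days_difference <- difftime(data_base$tdate, data_base$qdate,
                                      units = "days")
data_base
```

Because difftime() already works element by element, no grouping step is involved.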

Edit

If you don't want 'days' in the column, use as.numeric():

# The third positional argument of difftime() is tz, so custid must not be passed there
data$days_difference <- as.numeric(with(data, difftime(tdate, qdate, units="days")))
data
  custid         prod      qdate      tdate days_difference
1      1        books 2013-04-01 2013-04-05               4
2      2 highlighters 2013-04-03 2013-08-04             123
3      2        books 2013-04-03 2013-04-05               2
4      2         pens 2013-04-04 2013-04-05               1
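
as.numeric() on a "difftime" object also accepts a units argument, which may be handy if the gap is needed in something other than days. A small sketch, with gap as a hypothetical variable name and the dates retyped from the example above:

```r
# Dates retyped from the example; gap is a hypothetical name
qdate <- as.Date(c("2013-04-01", "2013-04-03", "2013-04-03", "2013-04-04"))
tdate <- as.Date(c("2013-04-05", "2013-08-04", "2013-04-05", "2013-04-05"))

gap <- difftime(tdate, qdate, units = "days")

gap_days  <- as.numeric(gap)                   # plain numbers, in days
gap_weeks <- as.numeric(gap, units = "weeks")  # the same gaps in weeks
gap_hours <- as.numeric(gap, units = "hours")  # the same gaps in hours
```

Stating the units explicitly avoids surprises when difftime() would otherwise pick them automatically.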

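You don't need aggregate() for a row-by-row calculation. You can use the binary - operator directly on "Date" classed objects. Wrapping the result in c() was a way to drop the "difftime" class; if your version of R keeps the class after c(), use as.numeric() instead.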

within(data, day_diff <- c(tdate - qdate))
#   custid         prod      qdate      tdate day_diff
# 1      1        books 2013-04-01 2013-04-05        4
# 2      2 highlighters 2013-04-03 2013-08-04      123
# 3      2        books 2013-04-03 2013-04-05        2
# 4      2         pens 2013-04-04 2013-04-05        1
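
aggregate() only comes into play when several rows should be collapsed into one value per group. As a rough illustration, assuming a per-customer average gap were wanted (that summary is not something the question asks for, and per_customer is a hypothetical name):

```r
data <- data.frame(
  custid = c(1, 2, 2, 2),
  prod   = c("books", "highlighters", "books", "pens"),
  qdate  = as.Date(c("2013-04-01", "2013-04-03", "2013-04-03", "2013-04-04")),
  tdate  = as.Date(c("2013-04-05", "2013-08-04", "2013-04-05", "2013-04-05"))
)

# Row-wise step first: one numeric gap per row
data$day_diff <- as.numeric(data$tdate - data$qdate)

# Grouping step: mean gap per customer (the mean is only an illustration)
per_customer <- aggregate(day_diff ~ custid, data = data, FUN = mean)
per_customer
```

Here customer 1 has a single gap of 4 days, and customer 2 averages 123, 2 and 1 to 42.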