Determine sales trend with multiple variables like Customer ID/Item

I'm having trouble determining trends. My problem is similar to the thread below, but I have an additional variable called 'item'.

How to determine trend of time-series of values in R

My end result should look like the example below. Please help.

Customer_ID  Item  Sales_Slope
Josh         milk  Positive
Josh         eggs  Negative
Eric         milk  Mixed
Eric         eggs  Positive

My data:

require("data.table")
dat <- data.table(
            customer_ID=c(rep("Josh",6),rep("Ray",7),rep("Eric",7)),
            item=c(rep("milk",3),rep("eggs",3),rep("milk",4),rep("eggs",3),rep("milk",3),rep("eggs",4)),
            sales=c(35,50,65,65,52,49,15,10,13,9,35,50,65,65,52,49,15,10,13,9))

dat[,transaction_num:=seq(1,.N), by=c("customer_ID")]

I agree with @smci: the only change from the linked thread is that there are more "by" variables. I hope this solution makes that clear.

library(plyr)

# Classify one customer/item group by the sign of its successive differences
abc <- function(x){
  if(all(diff(x$sales)>0)) return('Positive')
  if(all(diff(x$sales)<0)) return('Negative')
  return('Mixed')
}

y <- ddply(dat, .(customer_ID, item), abc)
y
  customer_ID item       V1
1        Eric eggs    Mixed
2        Eric milk Negative
3        Josh eggs Negative
4        Josh milk Positive
5         Ray eggs Positive
6         Ray milk    Mixed
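
One edge case worth noting: `diff()` on a single value returns an empty vector, and `all()` of an empty vector is `TRUE`, so a customer/item pair with only one transaction would be labelled 'Positive' by the check above. A minimal sketch of a guarded variant follows; the name `trend_guarded` is hypothetical, and returning `NA` for short groups is an assumption, not something the question asks for.

```r
# Hypothetical guarded variant: returns NA when a group has fewer than
# two observations, instead of the vacuous 'Positive'.
trend_guarded <- function(x) {
  d <- diff(x)
  if (length(d) == 0) return(NA_character_)  # not enough points to judge
  if (all(d > 0)) return("Positive")         # strictly increasing
  if (all(d < 0)) return("Negative")         # strictly decreasing
  "Mixed"                                    # anything else, including ties
}
```

It takes a plain numeric vector, so it could be called as `trend_guarded(x$sales)` inside the `ddply` function.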

The data.table approach I outlined is:

require(data.table)

# Classify a numeric vector by the sign of its successive differences
trend <- function(x) {
   if (all(diff(x)>0)) 'Positive'
   else if (all(diff(x)<0)) 'Negative'
   else 'Mixed'
}

dat[, trend(sales), by=c("customer_ID","item")]
   customer_ID item       V1
1:        Josh milk Positive
2:        Josh eggs Negative
3:         Ray milk    Mixed
4:         Ray eggs Positive
5:        Eric milk Negative
6:        Eric eggs    Mixed

# or if you want to assign the result...
dat[, Sales_Slope:=trend(sales), by=c("customer_ID","item")]
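
If an actual slope is wanted rather than a strict monotonicity check, one possible variant is to fit a straight line per group and take the sign of its coefficient. This is only a sketch under assumptions: the helper name `slope_sign` and the "Flat" label are hypothetical, and using the position within the group (`seq_along`) as the time axis is my choice, not part of the question.

```r
library(data.table)

dat <- data.table(
  customer_ID = c(rep("Josh", 6), rep("Ray", 7), rep("Eric", 7)),
  item = c(rep("milk", 3), rep("eggs", 3), rep("milk", 4), rep("eggs", 3),
           rep("milk", 3), rep("eggs", 4)),
  sales = c(35, 50, 65, 65, 52, 49, 15, 10, 13, 9,
            35, 50, 65, 65, 52, 49, 15, 10, 13, 9)
)

# Hypothetical helper: sign of the least-squares slope of sales
# regressed on the observation's position within its group.
slope_sign <- function(x) {
  if (length(x) < 2) return(NA_character_)   # a slope needs two points
  slope <- coef(lm(x ~ seq_along(x)))[[2]]   # second coefficient = slope
  if (slope > 0) "Positive" else if (slope < 0) "Negative" else "Flat"
}

res <- dat[, .(Sales_Slope = slope_sign(sales)), by = .(customer_ID, item)]
res
```

Note that this never returns 'Mixed': a series such as Ray's milk sales (15, 10, 13, 9), which the `diff`-based check labels 'Mixed', gets 'Negative' here because the fitted line slopes downward overall.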