Convert a normal data set to a format that market basket analysis can process

I created a data set, shown below, so that I can apply market basket analysis (apriori()).

id  name
1   mango
1   apple
1   grapes
2   apple
2   carrot
3   mango
3   apple
4   apple
4   carrot
4   grapes
5   strawberry
6   guava
6   strawberry
6   bananas
7   bananas
8   guava
8   strawberry
8   pineapple
9   mango
9   apple
9   blueberries
10  black grapes
11  pomogranate
12  black grapes
12  pomogranate
12  carrot
12  custard apple

I applied some logic to convert it into data that market basket analysis can process.

library(arules)
fact <- data.frame(lapply(frt,as.factor))
trans <- as(fact, 'transactions') 

I also tried this, but got an error.

trans1 = read.transactions(file = frt, format = "single", sep = ",",cols=c("id","name"))

Error in scan(file = file, what = "", sep = sep, quiet = TRUE, nlines = 1) : 
  'file' must be a character string or connection

The output I get is not what I expected. This is what I get:

items                transactionID
1   {name=mango}                   1  
2   {name=apple}                   2  
3   {name=grapes}                  3  
4   {name=apple}                   4  
5   {name=carrot}                  5  
6   {name=mango}                   6  
7   {name=apple}                   7  
8   {name=apple}                   8  
9   {name=carrot}                  9  
10  {name=grapes}                  10 
11  {name=strawberry}              11 
12  {name=guava}                   12 
13  {name=strawberry}              13 
14  {name=bananas}                 14 

My expected output is

id  item
1  {mango,apple,grapes}
2  {apple,carrot}
3  {mango,apple}

and so on.

Could anyone help me get my expected output (if possible), so that I can apply the apriori() algorithm?

Thanks in advance.

If you are doing market basket analysis in arules, you need to build a transactions object. You can do this from a text file, for example:

write.csv(frt,file="temp.csv", row.names=FALSE) # say "temp.csv" is your text file
tranx <- read.transactions(file="temp.csv",format="single", sep=",", cols=c("id","name"))
inspect(tranx)
#     items           transactionID
# 1  {apple,                      
#     grapes,                     
#     mango}                    1 
# 2  {black-grapes}             10
# 3  {pomogranate}              11
# 4  {black-grapes,               
#     carrot,                     
#     custard-apple,              
#     pomogranate}              12

...or, if you have already read the text file into a data.frame, you can coerce it to a transactions object via a list, like this:

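# collect the items of each transaction id into one list element
# (indexing by i assumes the ids are the integers 1..n)
tranx2 <- list()
for(i in unique(frt$id)){
  tranx2[[i]] <- unlist(frt$name[frt$id==i])
}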

inspect(as(tranx2,'transactions'))

#   items          
# 1  {apple,        
#   grapes,       
#   mango}        
# 2  {apple,        
#   carrot}       
# 3  {apple,        
#   mango}        
# 4  {apple,        
#   carrot,       
#   grapes}
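
As an alternative to the loop above, the same grouping can probably be done with base R's split(), which does not depend on the ids being consecutive integers. This is only a minimal sketch: the small frt rebuilt here is a subset of the question's data, the name tranx3 and the support/confidence thresholds are made up for illustration, and you would tune those thresholds for your real data.

```r
library(arules)

# Rebuild a small part of the question's data frame (the name 'frt' comes from the question).
frt <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3),
  name = c("mango", "apple", "grapes", "apple", "carrot", "mango", "apple"),
  stringsAsFactors = FALSE
)

# One character vector of items per transaction id.
# as.character() guards against 'name' having been read in as a factor.
baskets <- split(as.character(frt$name), frt$id)

# Coerce the named list to a transactions object.
tranx3 <- as(baskets, "transactions")
inspect(tranx3)

# Mine association rules; these thresholds are illustrative assumptions only.
rules <- apriori(tranx3, parameter = list(support = 0.3, confidence = 0.6))
inspect(rules)
```

Because split() names each list element after its id, the coerced object should keep the original ids as transaction labels instead of renumbering them.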