R - apriori() not recognising lhs from numerical transaction

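I'm having real trouble getting my data to produce any rules with the arules package. I have managed to get 100,000 rows of transaction data, and SAS produces rules from it, but I can't get it to work in R.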

[5]      {19,29,40,119,134}   
[6]      {24,40,45,67,141}    
[7]      {17,18,57,74,412}    
[8]      {16,79,90,150,498}   
[9]      {18,57,111,161,267}  
[10]     {11,75,131,427,429}  
[11]     {57,99,111,143,236} 

The transaction data looks like the output above. It originally came from a table in which each number was in its own column.

arules <- read.transactions('tid.csv', format = c("basket", "single"), sep = ",")
rules <- apriori(arules, parameter = list(supp = 0.1, conf = 0.1, target = "rules"))
summary(rules)

For reference, changing the support and confidence settings makes no difference. Sometimes I get this when inspecting the rules:

         lhs    rhs                   support      confidence   lift count
[1]      {}  => {8,11,96,112,432}     9.710623e-06 9.710623e-06 1    1    
[2]      {}  => {62,134,222,254,412}  9.710623e-06 9.710623e-06 1    1 

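Any idea why apriori cannot separate the items within a transaction? Does the data need to be converted back to long format, and if so, how would I build that data frame?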

V2  V3  V4  V5  V6
8   11  96  112 432
10  35  39  76  119
18  38  68  141 267
29  36  57  61  63
19  29  40  119 134
24  40  45  67  141
17  18  57  74  412

If I have understood you correctly, you should try this. Please let us know if it helps.

library(arules)
library(arulesViz)

#sample data
df <- read.table(text="V2  V3  V4  V5  V6
                 8   11  96  112 432
                 10  35  39  76  119
                 18  38  68  141 267
                 29  36  57  61  63
                 19  29  40  119 134
                 24  40  45  67  141
                 17  18  57  74  412", header = TRUE)
write.csv(df, "apriori_demo.csv", row.names = FALSE)

# read the CSV back as transactions (one basket per line); skip = 1 drops the header row
trx <- read.transactions("apriori_demo.csv", format = "basket", sep = ",", skip = 1)

# mine association rules with apriori
apriori_rule <- apriori(trx, parameter = list(supp = 0.1, conf = 0.1))
# these thresholds are only for the demo; tune supp and conf for your real data
inspect(apriori_rule)
plot(apriori_rule, method = "graph")
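
If the table is already in R as a wide data frame, you may not need the CSV round trip at all. Below is a minimal sketch, assuming one transaction per row and no repeated item within a row; the names item_list, long, trx_wide and trx_long are just illustrative. It builds the transactions straight from the data frame and also shows the long (transaction id, item) layout you asked about. The size() check at the end is a quick way to see whether the items were actually separated: your rules have a single rhs item labelled "8,11,96,112,432", which suggests each whole line was read as one item, in which case size() would report 1 for every transaction.

```r
library(arules)

# Same seven sample rows as above, one transaction per row
df <- data.frame(
  V2 = c(8, 10, 18, 29, 19, 24, 17),
  V3 = c(11, 35, 38, 36, 29, 40, 18),
  V4 = c(96, 39, 68, 57, 40, 45, 57),
  V5 = c(112, 76, 141, 61, 119, 67, 74),
  V6 = c(432, 119, 267, 63, 134, 141, 412)
)

# Option 1: turn each row into a character vector of items, then coerce the list
item_list <- lapply(seq_len(nrow(df)), function(i) as.character(unlist(df[i, ])))
trx_wide <- as(item_list, "transactions")

# Option 2: long format, one (transaction id, item) pair per row.
# unlist(df) runs column by column, so the row ids repeat once per column.
long <- data.frame(
  tid  = rep(seq_len(nrow(df)), times = ncol(df)),
  item = as.character(unlist(df))
)
trx_long <- as(split(long$item, long$tid), "transactions")

# Sanity checks: each transaction should contain 5 separate items
size(trx_wide)
head(itemLabels(trx_wide))
```

Either object can be passed to apriori() in place of trx. If you would rather keep a file-based workflow, the same long layout written to disk can be read with read.transactions(format = "single", cols = c(1, 2)).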