arules package - Error: subscript out of bounds for producing recommendations
I am trying to use the arules package to make recommendations.
I have this data:
Data
Client product N Date
1 A Banana 1 01/01/2016
2 A Tomato 1 01/01/2016
3 A Tuna 1 01/01/2016
4 B Orange 2 01/01/2016
5 B Tomato 3 02/01/2016
6 C Kiwi 11 08/01/2016
Next I used this code:
trans = as(split(Data$product, Data$Client), "transactions")
Sales<- as(trans, "data.frame")
rules = apriori(trans, parameter = list(support = 0.001, confidence = 0.005))
rules.sorted <- sort(rules, by="lift")
# find redundant rules
subset.matrix <- is.subset(rules.sorted, rules.sorted)
subset.matrix[lower.tri(subset.matrix, diag=T)] <- NA
redundant <- colSums(subset.matrix, na.rm=T) >= 1
which(redundant)
rules.pruned <- rules.sorted[!redundant]
inspect(rules.pruned)
rules = rules.pruned
I get these rules:
lhs rhs support confidence lift
1 {Tuna} => {Banana} 0.3333333 1.0000000 3.0
2 {Orange} => {Tomato} 0.3333333 1.0000000 1.5
3 {Tuna} => {Tomato} 0.3333333 1.0000000 1.5
4 {Banana} => {Tomato} 0.3333333 1.0000000 1.5
5 {} => {Kiwi} 0.3333333 0.3333333 1.0
6 {} => {Orange} 0.3333333 0.3333333 1.0
7 {} => {Tuna} 0.3333333 0.3333333 1.0
8 {} => {Banana} 0.3333333 0.3333333 1.0
9 {} => {Tomato} 0.6666667 0.6666667 1.0
But now I want to recommend 3 products for every client:
for (i in 1:3) {
  reco=function(x){
    rulesMatchLHS = is.subset(rules@lhs,x)
    suitableRules = rulesMatchLHS & !(is.subset(rules@rhs,x))
    order.rules = sort(rules[suitableRules], by = "lift")
    LIST(order.rules@rhs)[[i]]
  }
  NewS <- sapply(1:length(trans), function(x) reco(trans[x]))
  NewS <- as.data.frame(NewS)
  Sales <-cbind(Sales,NewS)
}
This code produces the error:
Error in LIST(order.rules@rhs)[[i]] : subscript out of bounds
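I think this happens because there are not recommendations for every client, but I would like the code to continue and put "no suggestion" in those cases.
What is the best way to do this?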
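For reference, the error itself can be reproduced without arules. The sketch below assumes the cause is what the question suspects: for some clients fewer than i rules match, so `[[i]]` indexes past the end of the list. The helper name `pick_or_default` is made up for illustration.

```r
# A client with only one matching recommendation.
recs <- list("Orange")

# Indexing past the end of a list with [[ ]] is an error, not NA.
res <- try(recs[[2]], silent = TRUE)
inherits(res, "try-error")

# Hypothetical helper: return the i-th element, or a placeholder
# when the list has fewer than i elements.
pick_or_default <- function(x, i, default = "no suggestion") {
  if (i <= length(x)) x[[i]] else default
}

first  <- pick_or_default(recs, 1)
second <- pick_or_default(recs, 2)
```

A length check like this is enough to let the loop continue; the answer below avoids the loop altogether.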
I think you want code like this.
Reading the data and mining the rules:
library(arules)
Data <- structure(list(Client = structure(c(1L, 1L, 1L, 2L, 2L, 3L), .Label = c("A", "B", "C"), class = "factor"), product = structure(c(1L, 4L, 5L, 3L, 4L, 2L), .Label = c("Banana", "Kiwi", "Orange", "Tomato", "Tuna"), class = "factor"), N = c(1L, 1L, 1L, 2L, 3L, 11L), Date = structure(c(1L, 1L, 1L, 1L, 2L, 3L), .Label = c("01/01/2016", "02/01/2016", "08/01/2016"), class = "factor")), .Names = c("Client", "product", "N", "Date"), class = "data.frame", row.names = c(NA, -6L))
trans <- as(split(Data$product, Data$Client), "transactions")
rules <- apriori(trans, parameter = list(support = 0.001, confidence = 0.5, maxlen = 2))
inspect(rules)
Output:
lhs rhs support confidence lift
1 {} => {Tomato} 0.6666667 0.6666667 1.0
2 {Orange} => {Tomato} 0.3333333 1.0000000 1.5
3 {Tomato} => {Orange} 0.3333333 0.5000000 1.5
4 {Tuna} => {Banana} 0.3333333 1.0000000 3.0
5 {Banana} => {Tuna} 0.3333333 1.0000000 3.0
6 {Tuna} => {Tomato} 0.3333333 1.0000000 1.5
7 {Tomato} => {Tuna} 0.3333333 0.5000000 1.5
8 {Banana} => {Tomato} 0.3333333 1.0000000 1.5
9 {Tomato} => {Banana} 0.3333333 0.5000000 1.5
Creating the recommendations:
reco <- function(rules, newTrans){
  rules.sorted <- sort(rules, by="lift")
  rhs_labels <- unlist(as(rhs(rules.sorted), "list"))
  matches <- is.subset(lhs(rules.sorted), newTrans) &
    !(is.subset(rhs(rules.sorted), newTrans))
  apply(matches, MARGIN = 2, FUN = function(x) unique(rhs_labels[x]))
}
reco(rules, trans)
Output for the three transactions (i.e., clients):
$`{Banana,Tomato,Tuna}`
[1] "Orange"
$`{Orange,Tomato}`
[1] "Tuna" "Banana"
$`{Kiwi}`
[1] "Tomato"
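The lists above have different lengths per client. To get exactly 3 columns with "no suggestion" where nothing is left to recommend, the same matching logic can be padded. This is only a sketch: `reco_top_n` and `baskets` are names invented here, the toy data is rebuilt as a plain list so the block runs on its own, and `as.matrix()` is wrapped around `is.subset()` on the assumption that, depending on the arules version, it may return either a dense or a sparse matrix.

```r
library(arules)

# Same toy data as in the question, one basket per client.
baskets <- list(
  A = c("Banana", "Tomato", "Tuna"),
  B = c("Orange", "Tomato"),
  C = "Kiwi"
)
trans <- as(baskets, "transactions")
rules <- apriori(trans,
                 parameter = list(support = 0.001, confidence = 0.5, maxlen = 2),
                 control = list(verbose = FALSE))

# Hypothetical helper: up to n recommendations per transaction,
# padded with `fill` when fewer than n rules apply.
reco_top_n <- function(rules, newTrans, n = 3, fill = "no suggestion") {
  rules.sorted <- sort(rules, by = "lift")
  rhs_labels <- unlist(as(rhs(rules.sorted), "list"))
  # Rows are rules, columns are transactions.
  matches <- as.matrix(is.subset(lhs(rules.sorted), newTrans)) &
    !as.matrix(is.subset(rhs(rules.sorted), newTrans))
  picks <- vapply(seq_len(ncol(matches)), function(j) {
    items <- unique(rhs_labels[matches[, j]])
    # Append n fillers, then keep the first n entries.
    c(items, rep(fill, n))[seq_len(n)]
  }, character(n))
  # One row per transaction, one column per recommendation slot.
  matrix(picks, ncol = n, byrow = TRUE)
}

top3 <- reco_top_n(rules, trans, n = 3)
rownames(top3) <- names(baskets)
top3
```

The result is a character matrix with one row per client, so it can be turned into a data frame with `as.data.frame(top3)` and bound to other per-client columns. Rules with equal lift may come out in either order, so the order of tied recommendations is not guaranteed.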
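Some notes:
- I mine only rules of length 1 and 2 (maxlen = 2). This is more efficient, and there is no need to look for redundant rules afterwards.
- I increased the minimum confidence (from 0.005 to 0.5).
- The recommenderlab package will do this kind of recommendation with the method "AR". This currently does not work properly, but it should soon.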