Sorting rules by lift and confidence
I am trying to find association rules using the apriori function from the arules package in R.
rules <- apriori(data = data, parameter = list(supp = 0.001, conf = 0.08),
                 appearance = list(default = "lhs", rhs = "YOGHURT"),
                 control = list(verbose = FALSE))
rules <- sort(rules, decreasing=TRUE,by="confidence")
inspect(rules[1:3])
   lhs    rhs    support confidence lift
1. {A,B}  {C}    0.04    0.96       0.25
2. {C,A}  {D}    0.05    0.95       0.26
3. {B,D}  {A,C}  0.03    0.93       0.24
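With the code shown above, I store some association rules in the variable "rules" and sort them by decreasing confidence. However, I would like to order the rules by confidence and lift at the same time. I tried the following, but got an error: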
rules <- sort(rules, decreasing=TRUE,by=c("confidence","lift"))
Error in .subset2(x, i, exact = exact) : subscript out of bounds
Is there a way to sort the rules by confidence and lift simultaneously?
Suppose you have
library(arules)
data("Adult")
rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
then you can try
df <- as(rules, "data.frame")
df[order(df$lift, df$confidence), ]
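Note that order() sorts in ascending order by default and treats its first argument as the primary key, so the line above puts the lowest lift first. If what you want is decreasing confidence with lift as the tie-breaker, something along these lines might be closer. This is only a sketch: the data frame below is made up to stand in for as(rules, "data.frame"), and the rule labels and values are illustrative, not real output.

```r
# Toy stand-in for as(rules, "data.frame"); all values are invented.
df <- data.frame(
  rules      = c("{A} => {C}", "{B} => {C}", "{A,B} => {C}", "{D} => {C}"),
  support    = c(0.04, 0.05, 0.03, 0.02),
  confidence = c(0.95, 0.96, 0.95, 0.90),
  lift       = c(1.20, 1.10, 1.50, 2.00),
  stringsAsFactors = FALSE
)

# Primary key: confidence (descending); ties broken by lift (descending).
# Negating a numeric column reverses its sort direction.
sorted <- df[order(-df$confidence, -df$lift), ]
print(sorted)
```

Here the two rules with confidence 0.95 end up next to each other, with the higher-lift one first.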
This is something I had not thought about. After loading arules, you can copy and paste the following code into your R session.
setMethod("sort", signature(x = "associations"),
  function (x, decreasing = TRUE, na.last = NA, by = "support", ...) {
    q <- quality(x)
    # Partially match the requested measure names against the quality columns
    q <- q[, pmatch(by, colnames(q)), drop = FALSE]
    if(is.null(q)) stop("Unknown interest measure to sort by.")
    if(length(x) == 0) return(x)
    # Pass every selected column to order(); earlier columns take precedence
    x[do.call(order, c(q, list(na.last = na.last, decreasing = decreasing)))]
})
Now your original code should work.
> data("Adult")
> rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
> inspect(head(sort(rules, by=c("supp", "conf"))))
  lhs                               rhs                 support   confidence lift
1 {}                             => {capital-loss=None} 0.9532779 0.9532779  1.0000000
2 {}                             => {capital-gain=None} 0.9173867 0.9173867  1.0000000
3 {capital-gain=None}            => {capital-loss=None} 0.8706646 0.9490705  0.9955863
4 {capital-loss=None}            => {capital-gain=None} 0.8706646 0.9133376  0.9955863
5 {native-country=United-States} => {capital-loss=None} 0.8548380 0.9525461  0.9992323
6 {native-country=United-States} => {capital-gain=None} 0.8219565 0.9159062  0.9983862
This will be part of the next release of arules.
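In case it helps to see why the method above copes with several measures, here is a rough sketch of the same two steps (pmatch() on the column names, then do.call(order, ...)) on a plain data frame, so it runs without arules. The data frame q and its values are invented for illustration and merely play the role of quality(rules).

```r
# Invented quality table; in arules this would come from quality(rules).
q <- data.frame(
  support    = c(0.5, 0.7, 0.7, 0.6),
  confidence = c(0.90, 0.92, 0.95, 0.91),
  lift       = c(1.0, 1.1, 0.9, 1.2)
)

by <- c("supp", "conf")

# pmatch() resolves unambiguous abbreviations to column positions.
cols <- pmatch(by, colnames(q))
keys <- q[, cols, drop = FALSE]

# c() turns the selected columns into a list of sort keys; the extra
# elements are passed to order() as its na.last and decreasing arguments.
idx <- do.call(order, c(keys, list(na.last = NA, decreasing = TRUE)))
print(q[idx, ])
```

Rows 2 and 3 share the same support, so the second key (confidence) decides their order, which is the tie-breaking behaviour you see in the Adult output above.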