Extracting special rules in association analysis
How can I extract the rules whose lhs contains only one specific item?
1 {231050} => {231051} 0.06063479 1.0000000 16.492183
2 {231050,231051} => {275001} 0.05490568 0.9055145 6.576661
3 {231050,275001} => {231051} 0.05490568 1.0000000 16.492183
I only want to extract the first row, where the lhs contains nothing but 231050.
Try this (assuming the rules were generated with apriori):
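df <- as(rules, "data.frame")
df$rules <- as.character(df$rules)
# Left-hand side of each rule: everything before "=>"
lhs <- sapply(strsplit(df$rules, split = "=>", fixed = TRUE), `[`, 1)
lhs.items <- strsplit(lhs, split = ",", fixed = TRUE)
# Positions of the rules whose lhs holds exactly one item
indices <- which(lengths(lhs.items) == 1)
special.item <- "231050"
# Strip braces and whitespace so the item can be compared exactly
single.lhs <- gsub("[{}]", "", trimws(unlist(lhs.items[indices])))
# Map the matches back to row positions in df
special.indices <- indices[single.lhs == special.item]
selected.rules <- df[special.indices, ]
selected.rules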
rules support confidence lift
1 {231050}=>{231051} 0.06063479 1 16.49218
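The string-based approach can be tried without arules on a hand-built data frame. The following is only a sketch: the data frame stands in for `as(rules, "data.frame")` with the three rules from the question, and `lhs_is_only` is a hypothetical helper name, not part of any package. It assumes item labels contain no commas or braces.

```r
# Stand-in for as(rules, "data.frame"); values copied from the question.
df <- data.frame(
  rules = c("{231050} => {231051}",
            "{231050,231051} => {275001}",
            "{231050,275001} => {231051}"),
  support = c(0.06063479, 0.05490568, 0.05490568),
  confidence = c(1.0000000, 0.9055145, 1.0000000),
  lift = c(16.492183, 6.576661, 16.492183),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where the lhs consists of exactly `item`.
lhs_is_only <- function(rule_strings, item) {
  lhs <- sub("\\s*=>.*$", "", rule_strings)  # drop "=> {rhs}"
  lhs <- gsub("[{}]", "", lhs)               # drop the surrounding braces
  trimws(lhs) == item                        # exact match, so "231050,231051" fails
}

selected <- df[lhs_is_only(df$rules, "231050"), ]
print(selected)
```

Comparing the whole lhs string against the item avoids the partial matches that a `grepl` search would allow, for instance an item label that merely starts with 231050.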
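arules has a subset function (see ?arules::subset) that you can use to extract the subset of rules that meets your criteria, for example a specific item on the lhs, a minimum support, and so on: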
library(arules)
data("Adult")
rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, minlen = 2))
item <- "race=White"
rules.sub <- subset(rules, lhs %in% item & size(lhs)==1)
inspect(rules.sub)
# lhs rhs support confidence lift
# 7 {race=White} => {native-country=United-States} 0.7881127 0.9217231 1.0270761
# 8 {race=White} => {capital-gain=None} 0.7817862 0.9143240 0.9966616
# 9 {race=White} => {capital-loss=None} 0.8136849 0.9516307 0.9982720
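Because the subset expression can also refer to the quality columns by name, the lhs condition can be combined with a threshold on a measure such as lift. A minimal sketch, assuming arules is installed; `rules.strong` and the cutoff of 1 are illustrative choices, not part of the original answer.

```r
library(arules)
data("Adult")

# Same mining call as above, with progress output silenced.
rules <- apriori(Adult,
                 parameter = list(supp = 0.5, conf = 0.9, minlen = 2),
                 control = list(verbose = FALSE))

item <- "race=White"

# Keep rules whose lhs is exactly the one item and whose lift exceeds 1.
rules.strong <- subset(rules, lhs %in% item & size(lhs) == 1 & lift > 1)
inspect(rules.strong)
```

The `size(lhs) == 1` condition is what restricts the result to single-item antecedents; `lhs %in% item` alone would also keep longer antecedents that contain the item.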