R association rules with the arules package - how to split a rule formula into vectors of elements?

Question

I have a data.frame object that I obtained by converting an object of class rules to a data.frame:

trx.cpf.rules.df <- as(trx.cpf.rules, "data.frame")

(You can build the trx.cpf.rules.df object from the structure specified here.)

The head of this data frame looks like this:

> head(trx.cpf.rules.df)
                                                      rules   support confidence     lift
66 {Product_Group_1,Product_Group_49} => {Product_Group_48} 0.1060016  0.7371274 6.683635
12                 {Product_Group_48} => {Product_Group_49} 0.1067810  0.9681979 6.386621
68 {Product_Group_1,Product_Group_23} => {Product_Group_49} 0.1079501  0.9052288 5.971252
16                 {Product_Group_23} => {Product_Group_49} 0.1098987  0.8392857 5.536265
71 {Product_Group_1,Product_Group_23} => {Product_Group_34} 0.1024942  0.8594771 4.702384
19                 {Product_Group_34} => {Product_Group_23} 0.1079501  0.5906183 4.510496

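Is there a quick way (a dedicated function or something similar) to convert each entry of trx.cpf.rules.df$rules into two vectors containing the rule's elements, one for the left-hand side and one for the right-hand side? For example, for the first row it would be: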

> (lhs.el <- c("Product_Group_1", "Product_Group_49"))
[1] "Product_Group_1"  "Product_Group_49"
> (rhs.el <- c("Product_Group_48"))
[1] "Product_Group_48"

Answer 1

This gives you a list with one entry per rule, each holding the lhs and rhs item vectors:

# Split each rule at " => ", strip the braces, then split each side on commas.
l <- lapply(strsplit(as.character(trx.cpf.rules.df$rules), " => ", fixed = TRUE), function(x) {
  strsplit(gsub("[{}]", "", x), ",", fixed = TRUE)
})

Check the first rule:

l[[1]]
# [[1]]
# [1] "Product_Group_1"  "Product_Group_49"
# 
# [[2]]
# [1] "Product_Group_48"
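
To get exactly the lhs.el / rhs.el vectors from the question, the same splitting logic can be wrapped in a small helper. The following is only a sketch in base R: rules_df is a hypothetical two-row stand-in for trx.cpf.rules.df (rule strings copied from the question), and split_rule is an illustrative name, not an arules function.

```r
# Hypothetical stand-in for trx.cpf.rules.df, using two rule strings from the question.
rules_df <- data.frame(
  rules = c("{Product_Group_1,Product_Group_49} => {Product_Group_48}",
            "{Product_Group_48} => {Product_Group_49}"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of arules): split one rule string
# into a named list with the lhs and rhs item vectors.
split_rule <- function(rule) {
  sides <- strsplit(rule, " => ", fixed = TRUE)[[1]]            # lhs string, rhs string
  items <- strsplit(gsub("[{}]", "", sides), ",", fixed = TRUE)  # drop braces, split on commas
  names(items) <- c("lhs", "rhs")
  items
}

# as.character() guards against the rules column being a factor.
parsed <- lapply(as.character(rules_df$rules), split_rule)

lhs.el <- parsed[[1]]$lhs
rhs.el <- parsed[[1]]$rhs
```

Naming the two components means each side can be pulled out with $lhs or $rhs instead of by position. Like the original approach, this assumes that item labels contain no commas or braces themselves.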

Check the left-hand sides of the rules (head() shows only the first six):

head(sapply(l, "[", 1))
# [[1]]
# [1] "Product_Group_1"  "Product_Group_49"
# 
# [[2]]
# [1] "Product_Group_48"
# 
# [[3]]
# [1] "Product_Group_1"  "Product_Group_23"
# 
# [[4]]
# [1] "Product_Group_23"
# 
# [[5]]
# [1] "Product_Group_1"  "Product_Group_23"
# 
# [[6]]
# [1] "Product_Group_34"
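
If the original rules object (trx.cpf.rules in the question) is still available, string parsing can probably be avoided altogether: arules provides the lhs() and rhs() accessors, which return itemMatrix objects that can be coerced to lists of item vectors. The sketch below assumes the arules package is installed; the toy transactions and the support and confidence thresholds are made up for illustration and are not the asker's data.

```r
library(arules)

# Toy transactions (assumed data, only to make the example self-contained).
trans <- as(list(
  c("A", "B", "C"),
  c("A", "B"),
  c("A", "C"),
  c("B", "C"),
  c("A", "B", "C")
), "transactions")

# Mine a few rules; minlen = 2 skips rules with an empty left-hand side.
rules <- apriori(trans,
                 parameter = list(supp = 0.4, conf = 0.6, minlen = 2),
                 control = list(verbose = FALSE))

# One character vector of item labels per rule, in rule order.
lhs.list <- as(lhs(rules), "list")
rhs.list <- as(rhs(rules), "list")

lhs.list[[1]]  # items on the left-hand side of the first rule
rhs.list[[1]]  # item on the right-hand side of the first rule
```

Because this works on the item labels directly, it does not depend on how the rules are printed, so labels containing commas or braces should not cause problems.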
