Convert character vector to transactions for arules
Please help me convert a character vector of shopping items to "transactions" for arules. The raw data looks like this:
shopping_items <- c("apple banana", "orange", "tea orange beef")
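Each element of the vector represents the items bought in a single transaction, with items separated by a space " ". For example, transaction 1 contains two items, apple and banana. How can I convert it to the "transactions" class so that I can use it in arules?
Thanks in advance!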
This implementation may not be optimal, but you can try to improve it.
library(stringi)
library(arules)
shopping_items <- c("apple banana", "orange", "tea orange beef")
# split each transaction into its items
item_list <- stri_split_fixed(shopping_items, ' ')
# unique items
str_un <- unique(unlist(item_list))
# create a logical matrix with dimensions:
# length(shopping_items) x length(str_un)
m <- matrix(FALSE, nrow = length(shopping_items), ncol = length(str_un),
            dimnames = list(NULL, str_un))
# mark the items present in each transaction (exact match, so "tea"
# cannot match inside a longer name such as "steak")
for (i in seq_along(item_list)) m[i, item_list[[i]]] <- TRUE
# Generate a transactions dataset.
# (0/1 factor columns would instead yield items such as "apple=0" and "apple=1")
tr <- as(m, "transactions")
# Generate the association rules.
# rules <- apriori(tr, ...
Here is a shorter version:
library(arules)
shopping_items <- c("apple banana", "orange", "tea orange beef")
trans <- as(strsplit(shopping_items, " "), "transactions")
inspect(trans)
items
[1] {apple,banana}
[2] {orange}
[3] {beef,orange,tea}
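If the real data is less tidy than the example (extra spaces, or the same item listed twice in one transaction), it may help to clean the list before coercing it, since duplicated items within a transaction can cause a warning or an error depending on the arules version. A minimal sketch, assuming a hypothetical helper `split_items` and a made-up input vector `messy_items` (neither is part of arules or of the question):

```r
# Hypothetical helper (not part of arules): split on any run of whitespace
# and drop empty or duplicated items within each transaction.
split_items <- function(x) {
  items <- strsplit(trimws(x), "[[:space:]]+")
  lapply(items, function(i) unique(i[nzchar(i)]))
}

# Made-up input with extra spaces and a repeated item
messy_items <- c("apple  banana", " orange", "tea orange beef tea")
item_list <- split_items(messy_items)

# Coerce to transactions only if arules is installed
if (requireNamespace("arules", quietly = TRUE)) {
  suppressPackageStartupMessages(library(arules))
  trans <- as(item_list, "transactions")
  inspect(trans)
}
```

For input that is already clean, such as `shopping_items` above, the plain `strsplit(shopping_items, " ")` call is enough.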