Convert an edge list to an arules transactions sparse adjacency matrix
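I have transaction data in the form of an edge list, and I need to create a transaction-based sparse matrix that can be used with the arules R package. Currently I use "spread" from the tidyr package to convert the edge list into a matrix with one row per basket ID. Because arules cannot use the quantity information, I then convert the matrix to logical and coerce it to the "transactions" data type. See the R code example below.
My problem is that this works for a small set of baskets/transactions, but with larger sets it causes memory problems because of the "spread" function. Is there a more memory- and resource-efficient way to convert the original edge list into the transactions data type that arules uses? Thanks in advance for any suggestions!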
## Load libraries
library(tidyr)
library(arules)
## Create an example of the transactions that I am analyzing
TransEdgeList = data.frame(BasketID=c(1,1,2,2,3,3,3),
Item=c(10,11,10,12,10,11,13),
Qty=c(1,1,2,3,1,2,1))
#convert to something that arules can transform
BasketDataFrame = spread(TransEdgeList, Item, Qty)
#convert to logical
BasketDataFrame[, 2:dim(BasketDataFrame)[2]]=
!is.na(BasketDataFrame[, 2:dim(BasketDataFrame)[2]])
#convert to a transaction sparse matrix that arules can use
BasketMatrix = as(BasketDataFrame[, 2:dim(BasketDataFrame)[2]], "transactions")
BasketMatrix
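I would build a sparse logical triplet matrix (ngTMatrix) by hand, convert it to a sparse ngCMatrix, and then coerce that to a transactions object. This way a full (dense) matrix representation is never created, so memory use should stay low.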
library(arules)
library(Matrix)
TransEdgeList <- data.frame(BasketID=c(1,1,2,2,3,3,3),
Item=c(10,11,10,12,10,11,13),
Qty=c(1,1,2,3,1,2,1))
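# Build a pattern (logical) matrix in triplet form: rows = items, columns = baskets.
# The i and j slots are 0-based, hence the -1L.
m <- new("ngTMatrix",
  i = as.integer(TransEdgeList$Item)-1L,
  j = as.integer(TransEdgeList$BasketID)-1L,
  Dim = as.integer(c(max(TransEdgeList$Item), max(TransEdgeList$BasketID))))
# Convert to compressed column form, which arules can coerce to transactions
m <- as(m, "ngCMatrix")
tr <- as(m, "transactions")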
inspect(tr)
items itemsetID
[1] {10,11} 1
[2] {10,12} 2
[3] {10,11,13} 3
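One caveat with the code above: sizing the matrix by max(Item) and max(BasketID) assumes the IDs are small positive integers. With sparse or non-numeric IDs that leaves many empty rows or fails outright. A possible variant, offered as a sketch rather than a definitive implementation, maps the IDs to dense indices with factor() and builds the pattern matrix with Matrix::sparseMatrix(). The names edges, item_f and basket_f are made up for this illustration, and dropping repeated (basket, item) pairs with unique() is my assumption about what is wanted.

```r
library(arules)
library(Matrix)

TransEdgeList <- data.frame(BasketID = c(1, 1, 2, 2, 3, 3, 3),
                            Item     = c(10, 11, 10, 12, 10, 11, 13),
                            Qty      = c(1, 1, 2, 3, 1, 2, 1))

# Keep each (basket, item) pair once; the quantity is not used by arules
edges <- unique(TransEdgeList[, c("BasketID", "Item")])

# Map arbitrary IDs to dense 1..n codes (hypothetical helper names)
item_f   <- factor(edges$Item)
basket_f <- factor(edges$BasketID)

# With no x argument, sparseMatrix() returns a pattern matrix (ngCMatrix).
# Rows are items and columns are baskets, as in the answer above.
m <- sparseMatrix(i = as.integer(item_f),
                  j = as.integer(basket_f),
                  dims = c(nlevels(item_f), nlevels(basket_f)))

tr <- as(m, "transactions")

# Restore the original IDs as labels
itemLabels(tr) <- levels(item_f)
transactionInfo(tr) <- data.frame(transactionID = levels(basket_f))

inspect(tr)
```

The matrix then has only as many rows as there are distinct items (4 here instead of 13), which matters more as the item IDs get larger or sparser.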