Correctly convert "data.frame" to "transactions" for arules
I have the following data.frame:
> str(noticias_json, list.len = 10)
'data.frame': 1771 obs. of 3 variables:
$ bairro:List of 1771
..$ : chr "icarai"
..$ : chr "nacoes"
..$ : chr "danilo passos" "serra verde"
..$ : chr "icarai"
..$ : chr "centro"
..$ : chr "itai" "manoel valinhas"
..$ : chr "anchieta"
..$ : chr "liberdade"
..$ : chr "nossa senhora das gracas"
..$ : chr "liberdade"
.. [list output truncated]
$ crime :List of 1771
..$ : chr "trafico de drogas"
..$ : chr "roubo de veiculo"
..$ : chr "roubo"
..$ : chr "trafico de drogas"
..$ : chr "falsidade ideologica"
..$ : chr "trafico de drogas" "porte ilegal de armas" "roubo"
..$ : chr "trafico de drogas" "porte ilegal de armas"
..$ : chr "homicidio" "trafico de drogas" "porte ilegal de armas" "ocultacao de cadaver" ...
..$ : chr "trafico de drogas" "roubo"
..$ : chr "homicidio" "trafico de drogas" "porte ilegal de armas" "estupro"
.. [list output truncated]
$ data : chr "01-02-2016" "31-02-2016" "01-02-2017" "01-02-2017" ...
I need to prepare it for the "arules" package so that I can use the apriori() function. I tried:
df_fact <- as.data.frame(unlist(noticias_json))
Then:
df_trans <- as(df_fact, "transactions")
But when I inspect the result, I get the following output:
> inspect(df_trans[1:5])
items transactionID
[1] {unlist(noticias_json)=icarai} bairro1
[2] {unlist(noticias_json)=nacoes} bairro2
[3] {unlist(noticias_json)=danilo passos} bairro3
[4] {unlist(noticias_json)=serra verde} bairro4
[5] {unlist(noticias_json)=icarai} bairro5
This is completely different from the Groceries transactions object that ships with arules:
> inspect(Groceries[1:5])
items
[1] {citrus fruit,semi-finished bread,margarine,ready soups}
[2] {tropical fruit,yogurt,coffee}
[3] {whole milk}
[4] {pip fruit,yogurt,cream cheese ,meat spreads}
[5] {other vegetables,whole milk,condensed milk,long life bakery product}
I don't know what I am doing wrong. I would be very grateful if anyone could help.
Thanks in advance.
We may need to split by the 'data' column and unlist each group, so that every date becomes one transaction:
df_trans <- as(setNames(lapply(split(noticias_json[-3],
noticias_json$data), unlist), NULL), "transactions")
inspect(df_trans)
# items
#[1] {icarai,
# trafico de drogas}
#[2] {danilo passos,
# porte ilegal de armas,
# roubo,
# serra verde,
# trafico de drogas}
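If you would rather have one transaction per news item (row) instead of one per date, a variation along these lines may work. It is only a sketch: the toy rows and the names `items_by_row`, `rows_by_date` and `items_by_date` are made up for illustration, and the `unique()` calls assume that repeated items inside a transaction are unwanted (depending on the version, arules may warn or stop when a list contains duplicated items in a transaction).

```r
# Toy data shaped like the question's: a date string plus two list-columns.
# (Made-up rows; the third row repeats the first date on purpose.)
noticias_json <- data.frame(
  data = c("01-02-2016", "31-02-2016", "01-02-2016"),
  stringsAsFactors = FALSE
)
noticias_json$bairro <- list("icarai", c("danilo passos", "serra verde"), "icarai")
noticias_json$crime <- list(
  "trafico de drogas",
  c("trafico de drogas", "porte ilegal de armas", "roubo"),
  "roubo"
)

# One item set per row: neighbourhoods and crimes of a single news item.
items_by_row <- unname(Map(
  function(b, cr) unique(c(b, cr)),
  noticias_json$bairro,
  noticias_json$crime
))

# One item set per date: pool all rows sharing a date, then drop repeats.
rows_by_date <- split(seq_len(nrow(noticias_json)), noticias_json$data)
items_by_date <- unname(lapply(rows_by_date, function(i) {
  unique(c(unlist(noticias_json$bairro[i]), unlist(noticias_json$crime[i])))
}))

# The coercion itself needs arules; skipped if the package is not installed.
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  inspect(as(items_by_row, "transactions"))
  inspect(as(items_by_date, "transactions"))
}
```

Which grouping is appropriate depends on what a "transaction" should mean for the rules you want: per row keeps each news item separate, per date merges everything reported on the same day.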
Data
noticias_json <- structure(list(bairro = structure(list("icarai",
c("danilo passos",
"serra verde")), class = "AsIs"), crime = structure(list("trafico de drogas",
c("trafico de drogas", "porte ilegal de armas", "roubo")), class = "AsIs"),
data = c("01-02-2016", "31-02-2016")), .Names = c("bairro",
"crime", "data"), row.names = c(NA, -2L), class = "data.frame")
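Since the goal in the question is to call apriori(), here is a rough sketch of that last step on the two sample transactions. The names `item_sets` and `params` are hypothetical, and the support, confidence and minlen values are placeholders chosen for a two-transaction toy example, not recommendations for the full 1771-row data.

```r
# Item sets as plain character vectors, copied from the sample data above.
item_sets <- list(
  c("icarai", "trafico de drogas"),
  c("danilo passos", "serra verde", "trafico de drogas",
    "porte ilegal de armas", "roubo")
)

# Thresholds are placeholders picked for this tiny example.
params <- list(support = 0.5, confidence = 0.8, minlen = 2)

# Mining needs arules; skipped if the package is not installed.
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  trans <- as(item_sets, "transactions")
  rules <- apriori(trans, parameter = params, control = list(verbose = FALSE))
  # Show up to five rules, highest lift first.
  inspect(head(sort(rules, by = "lift"), 5))
}
```

On real data you would most likely need a much lower support threshold, since any single neighbourhood or crime appears in only a fraction of the transactions.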