How to extract specific items from a nested list and append to new column?
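I have a data frame in which one column contains nested lists. I am struggling to extract the usernames from these nested lists (I am quite new to this).
Dummy data: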
myNestedList <- list("1" = list('username' = "test",
"uninteresting data" = "uninteresting content"),
"2" = list('username' = "test2",
"uninteresting data" = "uninteresting content"))
Column1 <- c("A","B","C")
column2 <- c("a","b","c")
mydf <- data.frame(Column1, column2)
mydf$nestedlist <- list(myNestedList)
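I want to extract all usernames for each row and append them to a new column. If a row has multiple usernames, the second/third/n-th username should be appended, separated by ", ".
I tried things like sapply(mydf$nestedlist, `[[`, 1), but that only gave me a list of the entire "nestedlist" column.
For context: I am trying to build a directed graph for further use in Networkx or Gephi. The data in Column1 are the nodes and the usernames are mentions, and therefore the edges. If there is another way that does not require extracting the usernames from the nested list, that would also be a solution.
Thanks in advance for your help! :)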
If we know the nesting level, we can use map_depth
library(purrr)
mydf$username <- map_depth(mydf$nestedlist, 2, pluck, "username")
-output
> mydf
Column1 column2 nestedlist username
1 A a test, uninteresting content, test2, uninteresting content test, test2
2 B b test, uninteresting content, test2, uninteresting content test, test2
3 C c test, uninteresting content, test2, uninteresting content test, test2
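For comparison, the same known-depth extraction can be sketched in base R, collapsing the usernames straight into one ", "-separated string per row. This is only a sketch: it assumes every sub-list holds exactly one character "username" entry, as in the dummy data, and it rebuilds that data so the snippet runs on its own.

```r
# Rebuild the dummy data from the question
myNestedList <- list(
  "1" = list(username = "test",  "uninteresting data" = "uninteresting content"),
  "2" = list(username = "test2", "uninteresting data" = "uninteresting content")
)
mydf <- data.frame(Column1 = c("A", "B", "C"), column2 = c("a", "b", "c"))
mydf$nestedlist <- list(myNestedList)  # recycled to one copy per row

# For each row, pull "username" out of every sub-list and join with ", "
mydf$username <- vapply(
  mydf$nestedlist,
  function(row) {
    users <- vapply(row, function(item) item[["username"]], character(1))
    paste(users, collapse = ", ")
  },
  character(1)
)
```

vapply() stops with an error if a sub-list has no "username" or more than one value, which makes a violated assumption visible instead of silently producing a malformed column.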
Or, if the nesting level is not known, use a recursive function with a condition check to find 'username'
library(rrapply)
mydf$username <- rrapply(mydf$nestedlist,
condition = function(x, .xname) .xname %in% 'username', how = 'prune')
> mydf
Column1 column2 nestedlist username
1 A a test, uninteresting content, test2, uninteresting content test, test2
2 B b test, uninteresting content, test2, uninteresting content test, test2
3 C c test, uninteresting content, test2, uninteresting content test, test2
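To illustrate what such a recursive search does, here is a minimal hand-written sketch in base R. The helper find_by_name() is hypothetical (it is not part of rrapply or any other package), and the example list deliberately places the second username one level deeper to show that depth does not matter.

```r
# Hypothetical helper: collect every non-list element named `key`, at any depth
find_by_name <- function(x, key = "username") {
  if (!is.list(x)) return(character(0))
  hits <- character(0)
  nms <- names(x)
  for (i in seq_along(x)) {
    if (!is.null(nms) && identical(nms[i], key) && !is.list(x[[i]])) {
      # Leaf with the wanted name: keep its value
      hits <- c(hits, as.character(x[[i]]))
    } else {
      # Otherwise descend (returns character(0) for non-list leaves)
      hits <- c(hits, find_by_name(x[[i]], key))
    }
  }
  hits
}

nested <- list(
  "1" = list(username = "test", "uninteresting data" = "uninteresting content"),
  "2" = list(extra = list(username = "test2"))
)
usernames <- find_by_name(nested)
joined <- paste(usernames, collapse = ", ")
```

Applied per row, for example with vapply(mydf$nestedlist, function(r) paste(find_by_name(r), collapse = ", "), character(1)), this would give one joined string per row.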
If we want to paste them together, use
library(stringr)
library(dplyr)
library(purrr)  # invoke() comes from purrr (deprecated as of purrr 1.0.0)
mydf$username <- rrapply(mydf$nestedlist,
condition = function(x, .xname) .xname %in% 'username',
how = 'bind') %>%
invoke(str_c, sep=", ", .)
mydf
Column1 column2 nestedlist username
1 A a test, uninteresting content, test2, uninteresting content test, test2
2 B b test, uninteresting content, test2, uninteresting content test, test2
3 C c test, uninteresting content, test2, uninteresting content test, test2
-structure
> str(mydf)
'data.frame': 3 obs. of 4 variables:
$ Column1 : chr "A" "B" "C"
$ column2 : chr "a" "b" "c"
$ nestedlist:List of 3
..$ :List of 2
.. ..$ 1:List of 2
.. .. ..$ username : chr "test"
.. .. ..$ uninteresting data: chr "uninteresting content"
.. ..$ 2:List of 2
.. .. ..$ username : chr "test2"
.. .. ..$ uninteresting data: chr "uninteresting content"
..$ :List of 2
.. ..$ 1:List of 2
.. .. ..$ username : chr "test"
.. .. ..$ uninteresting data: chr "uninteresting content"
.. ..$ 2:List of 2
.. .. ..$ username : chr "test2"
.. .. ..$ uninteresting data: chr "uninteresting content"
..$ :List of 2
.. ..$ 1:List of 2
.. .. ..$ username : chr "test"
.. .. ..$ uninteresting data: chr "uninteresting content"
.. ..$ 2:List of 2
.. .. ..$ username : chr "test2"
.. .. ..$ uninteresting data: chr "uninteresting content"
$ username : chr "test, test2" "test, test2" "test, test2"
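Since the stated goal is a directed graph, the joined username column can be turned into an edge list with one row per mention. The following is a rough sketch only: the data frame is a stand-in for the result above, and the column names Source and Target are an assumption about what the downstream tool expects, so adjust them as needed.

```r
# Stand-in for the result above: one node per row plus a ", "-separated string
mydf <- data.frame(
  Column1 = c("A", "B", "C"),
  username = "test, test2",
  stringsAsFactors = FALSE
)

# Split each string back into individual usernames
targets <- strsplit(mydf$username, ", ", fixed = TRUE)

# Repeat each node once per username it mentions
edges <- data.frame(
  Source = rep(mydf$Column1, lengths(targets)),
  Target = unlist(targets),
  stringsAsFactors = FALSE
)
```

The resulting data frame can then be exported, for instance with write.csv(edges, "edges.csv", row.names = FALSE), where the file name is just an example.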