How to extract specific items from a nested list and append to new column?

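I have a data frame in which one column contains nested lists. I am struggling to extract the usernames from these nested lists (I am quite new to this).

Dummy data: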

myNestedList <- list("1" = list('username' = "test",
                              "uninteresting data" = "uninteresting content"),
                     "2" = list('username' = "test2",
                                "uninteresting data" = "uninteresting content"))
Column1 <- c("A","B","C")
column2 <- c("a","b","c")
mydf <- data.frame(Column1, column2)
mydf$nestedlist <- list(myNestedList)

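I would like to extract all usernames for each row and append them to a new column. If a row has multiple usernames, the second/third/n-th username should be appended, separated by ", ". I have tried things like sapply(mydf$nestedlist, `[[`, 1), but that only gives me a list of the whole column "nestedlist".

For context: I am trying to build a directed graph for further use in Networkx or Gephi. The data in Column1 are the nodes, and the usernames are mentions and therefore the edges. If there is another way to do this that does not require extracting the usernames from the nested list, that would be a solution too.

Thanks in advance for your help! :)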

If we know the nesting depth, we can use map_depth

library(purrr)
# depth 2 = each record inside each row; pluck() pulls out its 'username'
mydf$username <- map_depth(mydf$nestedlist, 2, pluck, "username")

-output

> mydf
  Column1 column2                                                nestedlist    username
1       A       a test, uninteresting content, test2, uninteresting content test, test2
2       B       b test, uninteresting content, test2, uninteresting content test, test2
3       C       c test, uninteresting content, test2, uninteresting content test, test2
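
For comparison, here is a minimal base R sketch of the same known-depth extraction, without purrr. It also shows why the attempt in the question falls short: `[[` with index 1 selects the first record of each row (the whole sub-list "1"), not its 'username' field, so a second level of iteration is needed. The names first_records and usernames_base are only illustrative.

```r
# Dummy data from the question
myNestedList <- list("1" = list(username = "test",
                                "uninteresting data" = "uninteresting content"),
                     "2" = list(username = "test2",
                                "uninteresting data" = "uninteresting content"))
mydf <- data.frame(Column1 = c("A", "B", "C"), column2 = c("a", "b", "c"),
                   stringsAsFactors = FALSE)
mydf$nestedlist <- list(myNestedList)

# The attempt from the question: first record of every row, not a username
first_records <- lapply(mydf$nestedlist, `[[`, 1)

# Outer loop over rows, inner loop over the records inside each row
usernames_base <- lapply(mydf$nestedlist, function(row) {
  lapply(row, `[[`, "username")
})
```

usernames_base has the same shape as the map_depth result and could be assigned to mydf$username in the same way.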

Or, if the depth is not known, use a recursive function with a condition check to find 'username'

library(rrapply)
mydf$username <- rrapply(mydf$nestedlist,  
    condition = function(x, .xname) .xname %in% 'username', how = 'prune')

-output

> mydf
  Column1 column2                                                nestedlist    username
1       A       a test, uninteresting content, test2, uninteresting content test, test2
2       B       b test, uninteresting content, test2, uninteresting content test, test2
3       C       c test, uninteresting content, test2, uninteresting content test, test2
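
To make the recursive idea concrete, here is a hand-written base R sketch of such a search. It is not how rrapply is implemented, only an illustration of the recursion with a name check; collect_usernames is a hypothetical helper name, and the sketch assumes that every element named 'username' holds a character value.

```r
# Hypothetical helper: collect every element named "username" at any depth.
collect_usernames <- function(x) {
  if (!is.list(x)) return(character(0))   # leaves contain nothing to search
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))
  out <- character(0)
  for (i in seq_along(x)) {
    if (identical(nms[i], "username") && is.character(x[[i]])) {
      out <- c(out, x[[i]])                      # condition met: keep the value
    } else {
      out <- c(out, collect_usernames(x[[i]]))   # otherwise descend further
    }
  }
  out
}

# Dummy data from the question
myNestedList <- list("1" = list(username = "test",
                                "uninteresting data" = "uninteresting content"),
                     "2" = list(username = "test2",
                                "uninteresting data" = "uninteresting content"))
mydf <- data.frame(Column1 = c("A", "B", "C"), column2 = c("a", "b", "c"),
                   stringsAsFactors = FALSE)
mydf$nestedlist <- list(myNestedList)

# One comma-separated string per row
mydf$username <- vapply(mydf$nestedlist,
                        function(row) paste(collect_usernames(row), collapse = ", "),
                        character(1))
```

Because the helper returns a plain character vector, the result can be collapsed directly with paste().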

If we want to paste them together, use

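library(rrapply)
library(purrr)    # invoke(); deprecated since purrr 1.0.0 but still available
library(stringr)  # str_c()
library(dplyr)    # %>%
# how = 'bind' binds the matching entries into a data frame; its columns are
# then pasted together row by row with sep = ", "
mydf$username <- rrapply(mydf$nestedlist,
    condition = function(x, .xname) .xname %in% 'username',
    how = 'bind') %>%
    invoke(str_c, sep = ", ", .)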

-output

> mydf
  Column1 column2                                                nestedlist    username
1       A       a test, uninteresting content, test2, uninteresting content test, test2
2       B       b test, uninteresting content, test2, uninteresting content test, test2
3       C       c test, uninteresting content, test2, uninteresting content test, test2

-structure

> str(mydf)
'data.frame':   3 obs. of  4 variables:
 $ Column1   : chr  "A" "B" "C"
 $ column2   : chr  "a" "b" "c"
 $ nestedlist:List of 3
  ..$ :List of 2
  .. ..$ 1:List of 2
  .. .. ..$ username          : chr "test"
  .. .. ..$ uninteresting data: chr "uninteresting content"
  .. ..$ 2:List of 2
  .. .. ..$ username          : chr "test2"
  .. .. ..$ uninteresting data: chr "uninteresting content"
  ..$ :List of 2
  .. ..$ 1:List of 2
  .. .. ..$ username          : chr "test"
  .. .. ..$ uninteresting data: chr "uninteresting content"
  .. ..$ 2:List of 2
  .. .. ..$ username          : chr "test2"
  .. .. ..$ uninteresting data: chr "uninteresting content"
  ..$ :List of 2
  .. ..$ 1:List of 2
  .. .. ..$ username          : chr "test"
  .. .. ..$ uninteresting data: chr "uninteresting content"
  .. ..$ 2:List of 2
  .. .. ..$ username          : chr "test2"
  .. .. ..$ uninteresting data: chr "uninteresting content"
 $ username  : chr  "test, test2" "test, test2" "test, test2"
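
For the graph use case mentioned in the question, an edge list with one row per (node, mention) pair may be easier to import than a comma-separated column. The following is a minimal base R sketch under some assumptions: each record sits at depth 2 and has a single character 'username', the column names Source and Target are chosen because Gephi's spreadsheet import commonly expects them for edge tables, and the file name edges.csv is purely illustrative.

```r
# Dummy data from the question
myNestedList <- list("1" = list(username = "test",
                                "uninteresting data" = "uninteresting content"),
                     "2" = list(username = "test2",
                                "uninteresting data" = "uninteresting content"))
mydf <- data.frame(Column1 = c("A", "B", "C"), column2 = c("a", "b", "c"),
                   stringsAsFactors = FALSE)
mydf$nestedlist <- list(myNestedList)

# Build one small data frame of edges per row, then stack them
edges <- do.call(rbind, lapply(seq_len(nrow(mydf)), function(i) {
  users <- vapply(mydf$nestedlist[[i]],
                  function(rec) rec[["username"]],
                  character(1))
  data.frame(Source = mydf$Column1[i],   # node doing the mentioning
             Target = unname(users),     # mentioned username
             stringsAsFactors = FALSE)
}))

# Write to a temporary location; the file name is an assumption
write.csv(edges, file.path(tempdir(), "edges.csv"), row.names = FALSE)
```

With the dummy data this gives six edges, two for each of A, B and C. The resulting CSV should also be readable from Python for use with NetworkX, though that side is not covered here.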