Tibble: operation on list columns

I have the following problem:

temp <- structure(list(x = list(1:10, 1:10), y = list(c(3L, 9L, 10L, 
8L, 1L), c(1L, 3L, 5L, 2L, 4L))), .Names = c("x", "y"), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -2L))


> temp
# A tibble: 2 x 2
           x         y
      <list>    <list>
1 <int [10]> <int [5]>
2 <int [10]> <int [5]>

I would like to create a new column z that is the setdiff of the list elements in columns x and y, row by row, so that temp$z outputs:

> temp$z
[[1]]
[1] 2 4 5 6 7

[[2]]
[1]  6  7  8  9 10

and temp would be updated to:

> temp
# A tibble: 2 x 3
           x         y         z
      <list>    <list>    <list>
1 <int [10]> <int [5]> <int [5]>
2 <int [10]> <int [5]> <int [5]>

PS: A dplyr solution would be great! :-)

You can use Map within mutate:

temp %>% mutate(z=Map(setdiff, x, y))
# # A tibble: 2 x 3
#            x         y         z
#       <list>    <list>    <list>
# 1 <int [10]> <int [5]> <int [5]>
# 2 <int [10]> <int [5]> <int [5]>

temp %>% mutate(z=Map(setdiff, x, y)) %>% pull(z)
# [[1]]
# [1] 2 4 5 6 7
# 
# [[2]]
# [1]  6  7  8  9 10
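
To see why Map fits here, it may help to run it on the two list columns as plain lists, outside of mutate. The snippet below is only a sketch: it rebuilds x and y from the question's data and compares Map with the longer mapply spelling; the names z and z2 are just for illustration.

```r
# The question's x and y columns, rebuilt as plain lists
x <- list(1:10, 1:10)
y <- list(c(3L, 9L, 10L, 8L, 1L), c(1L, 3L, 5L, 2L, 4L))

# Map() calls setdiff(x[[i]], y[[i]]) for each i and always returns a list,
# which is the shape a list column needs
z <- Map(setdiff, x, y)

# mapply() with SIMPLIFY = FALSE is the longer spelling of the same call
z2 <- mapply(setdiff, x, y, SIMPLIFY = FALSE)

print(z)
print(identical(z, z2))
```

Because the result is a list with one element per row, mutate can store it directly as the list column z.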

You can use purrr::map2 within mutate:


library(dplyr)
library(purrr)

temp %>% mutate(z = map2(x, y, setdiff))

#> # A tibble: 2 x 3
#>            x         y         z
#>       <list>    <list>    <list>
#> 1 <int [10]> <int [5]> <int [5]>
#> 2 <int [10]> <int [5]> <int [5]>
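
If the per-row operation ever needs more than a bare function name, map2 also accepts purrr's formula shorthand. The following is a minimal sketch, assuming dplyr, purrr and tibble are installed; it rebuilds temp with tibble() rather than structure(), and res is a name introduced here only for illustration.

```r
library(dplyr)
library(purrr)
library(tibble)

# Same data as in the question, built with tibble()
temp <- tibble(
  x = list(1:10, 1:10),
  y = list(c(3L, 9L, 10L, 8L, 1L), c(1L, 3L, 5L, 2L, 4L))
)

# Formula shorthand: .x and .y stand for the paired elements of x and y
res <- temp %>% mutate(z = map2(x, y, ~ setdiff(.x, .y)))

# Inspect the new list column
res$z
```

For a plain setdiff the two spellings should give the same z; the formula version is mainly useful when extra arguments or additional steps are involved.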

Or just base R, while we're at it :)

within(temp, z <- Map(setdiff, x, y))