Converting a nested (3-level) list to a long/tall format data frame
I have a nested list with 3 levels:
m = list(try1 = list(list(court = c("jack", "queen", "king"),
                          suit = list(diamonds = 2, clubs = 5)),
                     list(court = c("jack", "queen", "king"),
                          suit = list(diamonds = 45, clubs = 67))),
         try2 = list(list(court = c("jack", "queen", "king"),
                          suit = list(diamonds = 400, clubs = 300)),
                     list(court = c("jack", "queen", "king"),
                          suit = list(diamonds = 5000, clubs = 6000))))
> str(m)
List of 2
$ try1:List of 2
..$ :List of 2
.. ..$ court: chr [1:3] "jack" "queen" "king"
.. ..$ suit :List of 2
.. .. ..$ diamonds: num 2
.. .. ..$ clubs : num 5
..$ :List of 2
.. ..$ court: chr [1:3] "jack" "queen" "king"
.. ..$ suit :List of 2
.. .. ..$ diamonds: num 45
.. .. ..$ clubs : num 67
$ try2:List of 2
..$ :List of 2
.. ..$ court: chr [1:3] "jack" "queen" "king"
.. ..$ suit :List of 2
.. .. ..$ diamonds: num 400
.. .. ..$ clubs : num 300
..$ :List of 2
.. ..$ court: chr [1:3] "jack" "queen" "king"
.. ..$ suit :List of 2
.. .. ..$ diamonds: num 5000
.. .. ..$ clubs : num 6000
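For each sublist within try1 and try2, I need to extract the suit sublist and rbind its elements, so that the resulting data frame is in long format with 4 columns: value (the value of the suit), suit (which suit the value comes from, i.e. diamonds or clubs), iter (which sublist the suit belongs to, i.e. 1 or 2) and try (try1 or try2).
I can do this with a combination of expand.grid() and mapply():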
grd = expand.grid(try = names(m), iter = 1:2, suit = c("diamonds", "clubs"))
grd$value = mapply(function(x, y, z) m[[x]][[y]]$suit[[z]], grd[[1]], grd[[2]], grd[[3]])
Result:
> grd
   try iter     suit value
1 try1    1 diamonds     2
2 try2    1 diamonds   400
3 try1    2 diamonds    45
4 try2    2 diamonds  5000
5 try1    1    clubs     5
6 try2    1    clubs   300
7 try1    2    clubs    67
8 try2    2    clubs  6000
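A side note on the snippet above: expand.grid() returns factor columns by default, and indexing a list with a factor uses the factor's integer codes rather than its labels, so the lookup only works because the level order happens to match the list order. A minimal sketch of a name-based variant follows. It assumes every try has the same number of sublists and the same suit names (both are read from the first element), which is an assumption made here for illustration:

```r
# Same nested list as in the question
m <- list(try1 = list(list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 2, clubs = 5)),
                      list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 45, clubs = 67))),
          try2 = list(list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 400, clubs = 300)),
                      list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 5000, clubs = 6000))))

# Build the index grid with character columns instead of factors.
# Assumption: all tries share the layout of the first one.
grd <- expand.grid(try  = names(m),
                   iter = seq_along(m[[1]]),
                   suit = names(m[[1]][[1]]$suit),
                   stringsAsFactors = FALSE)

# Look up each value by try name, sublist position and suit name
grd$value <- mapply(function(tr, i, s) m[[tr]][[i]]$suit[[s]],
                    grd$try, grd$iter, grd$suit,
                    USE.NAMES = FALSE)
```

USE.NAMES = FALSE keeps mapply() from naming the result after the character try column, so value stays a plain numeric vector.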
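However, I would like to know whether there is a more general/concise way to reproduce the result above (preferably in base R). I was thinking of extracting the suit element from each sublist and then recursively applying something like stack() to the resulting list: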
rapply(m, function(x) setNames(stack(x), names(x)))
But this throws an error, and I am not quite sure why, or what to use instead.
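One way to look into this: rapply() applies its function only to the non-list leaves, so the function receives the court character vectors and the individual numbers, never a named suit sublist that stack() could work with. A minimal sketch that records the class of everything rapply() visits (the name leaf_classes is purely illustrative):

```r
# Same nested list as in the question
m <- list(try1 = list(list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 2, clubs = 5)),
                      list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 45, clubs = 67))),
          try2 = list(list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 400, clubs = 300)),
                      list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 5000, clubs = 6000))))

# Record the class of each object that rapply() passes to the function.
# Only atomic leaves show up here; the suit lists themselves are never passed.
leaf_classes <- rapply(m, function(x) class(x), how = "unlist")

# Tabulate what was visited
table(leaf_classes)
```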
We can use a combination of map and melt:
library(purrr)
library(reshape2)
library(dplyr)
map_df(m, ~ .x %>%
         map(pluck, "suit") %>%
         melt, .id = 'try')
Or enframe and map:
library(tibble)
library(tidyr)  # provides unnest()
map_df(m, ~ .x %>%
         map_df(pluck, "suit") %>%
         map_df(~ enframe(.x, name = "iter") %>%
                  unnest, .id = "suit"), .id = 'try')
# A tibble: 8 x 4
#  try   suit      iter value
#  <chr> <chr>    <int> <dbl>
#1 try1  diamonds     1     2
#2 try1  diamonds     2    45
#3 try1  clubs        1     5
#4 try1  clubs        2    67
#5 try2  diamonds     1   400
#6 try2  diamonds     2  5000
#7 try2  clubs        1   300
#8 try2  clubs        2  6000
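Since the question asks for base R if possible, here is a rough sketch of the same idea without any packages. It is not taken from the answer above; it calls stack() on each suit sublist inside two nested lapply() loops and binds the pieces with rbind(). The names rows, per_iter and long are illustrative only:

```r
# Same nested list as in the question
m <- list(try1 = list(list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 2, clubs = 5)),
                      list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 45, clubs = 67))),
          try2 = list(list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 400, clubs = 300)),
                      list(court = c("jack", "queen", "king"),
                           suit = list(diamonds = 5000, clubs = 6000))))

rows <- lapply(names(m), function(tr) {
  per_iter <- lapply(seq_along(m[[tr]]), function(i) {
    # stack() turns the named suit list into columns `values` and `ind`
    s <- stack(m[[tr]][[i]]$suit)
    data.frame(try   = tr,
               iter  = i,
               suit  = as.character(s$ind),
               value = s$values,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, per_iter)   # bind the sublists of one try
})

long <- do.call(rbind, rows)  # bind the tries together
long
```

Because the suit names are read from each sublist, this version does not need every sublist to contain the same suits.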