A tidy way to perform joins iteratively with map / apply functions
I would like to join/merge multiple tibbles/data frames using map/lapply. How can this be done?
Reproducible example:
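library(dplyr)    # %>%, sample_n(), mutate(), left_join()
library(purrr)    # map(), set_names()
library(stringr)  # str_c()

set.seed(42)
# base table: 100 rows, 10 ranks repeated 10 times
df <- tibble::tibble(rank = rep(stringr::str_c("rank", 1:10), 10),
                     char_1 = sample(c("a", "b", "c"), size = 100, replace = TRUE),
                     points = sample(1:10000, size = 100)
)
# sample sizes 10, 20, ..., 90, named sample_1 ... sample_9
my_top <- seq(10, 90, by = 10) %>%
  as.list() %>%
  set_names(c(stringr::str_c("sample_", 1:9)))
# one random subset of df per sample size, each with its own "<n>_score" column
my_list_1 <- map(my_top, ~ df %>%
                   sample_n(.x) %>%
                   mutate(!!str_c(.x, "_score") := sample(1:10000, size = .x)))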
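I would like to perform this operation:
df %>% group_by(rank, char_1, points) %>%
  left_join(my_list_1[[1]]) %>%
  left_join(my_list_1[[2]]) %>%
  left_join(my_list_1[[3]])
and so on, using a map function.
I tried this:
map(as.list(names(my_top)), ~ df %>% group_by(rank, char_1, points) %>%
      left_join(my_list_1[[.x]]))
But of course this does not store the already-joined tibble anywhere, so each new join starts from df again instead of building on the previous result!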
One option is reduce from purrr:
library(dplyr)
library(purrr)
df %>%
group_by(rank, char_1, points) %>%
list(.) %>%
c(., my_list_1[1:3]) %>%
reduce(left_join)
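To spell out why reduce helps where map does not: reduce(list(a, b, c), f) computes f(f(a, b), c), so every join receives the result of the previous one. Below is a minimal sketch of the same idea on toy data, with the join key given explicitly through by so that dplyr does not have to guess it. The objects base_tbl and extras and their column names are made up for illustration and are not part of the question.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data (assumption, not taken from the question)
base_tbl <- tibble(id = 1:4, grp = c("a", "a", "b", "b"))
extras <- list(
  tibble(id = c(1L, 2L), x_score = c(10, 20)),
  tibble(id = c(2L, 3L), y_score = c(5, 6)),
  tibble(id = 4L, z_score = 99)
)

# .init is the starting table; each step left-joins the next element
# onto the accumulated result (acc) instead of onto base_tbl again
joined <- reduce(
  extras,
  function(acc, nxt) left_join(acc, nxt, by = "id"),
  .init = base_tbl
)

print(joined)
```

With the data from the question, the same pattern should apply with by = c("rank", "char_1", "points"), since those are the columns shared by df and every element of my_list_1.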
This is my first answer; I am new here. I recently ran into a similar problem, and join_all was the best solution I found.
library(plyr)
library(readr)  # provides read_delim()
# list the files saved on your computer, for example in txt format
files <- list.files("path", pattern = "\\.txt$", full.names = TRUE)
# read the files and store them as a list
list_of_data_frames <- lapply(files, read_delim, delim = "\t")
# merge the files; by = NULL joins on all columns the data frames share
merged_file <- join_all(list_of_data_frames, by = NULL)
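For anyone who wants to try join_all without files on disk, here is a small in-memory sketch. The data frames in dfs are invented for illustration. It calls plyr::join_all with the namespace prefix rather than library(plyr), because attaching plyr after dplyr masks several dplyr functions.

```r
# Hypothetical toy data frames (assumption, not taken from the answer)
dfs <- list(
  data.frame(id = 1:3, a = c("x", "y", "z")),
  data.frame(id = 1:3, b = c(10, 20, 30)),
  data.frame(id = 2:3, c = c(TRUE, FALSE))
)

# join_all() joins the list elements one after another;
# type = "left" keeps every row of the first data frame
merged <- plyr::join_all(dfs, by = "id", type = "left")

print(merged)
```

Rows of the first data frame that have no match in a later one get NA in the columns coming from that later data frame.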