How to write multiple Excel files with multiple sheets based on a variable of a split data frame in R (tidyverse)
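I want to split a data frame so that I can generate multiple Excel files, each containing multiple sheets, where the sheets are based on another variable of the split data frame.
The data frame I am using is the toy data set mtcars. I split it by cyl in order to create multiple files, each containing sheets based on the variable gear.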
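So, I expect to get three Excel files, named as follows:
- Cars_by_cyl_4_date&time.xlsx
- Cars_by_cyl_6_date&time.xlsx
- Cars_by_cyl_8_date&time.xlsx
Each containing:
- three sheets, named "3", "4", "5" (Cars_by_cyl_4_date&time.xlsx)
- three sheets, named "3", "4", "5" (Cars_by_cyl_6_date&time.xlsx)
- two sheets, named "3", "5" (Cars_by_cyl_8_date&time.xlsx).
What I did seems to overwrite the files.
This is what I did: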
library(tidyverse)
library(writexl)

# I split the data frame
list_of_cars_by_cyl <- mtcars %>%
  dplyr::group_split(cyl)

# I gave names to split data frame elements
names(list_of_cars_by_cyl) <- list_of_cars_by_cyl %>%
  purrr::map(~pull(., cyl)) %>%       # get the values of the grouping variable
  purrr::map(~as.character(.)) %>%    # convert them to character
  purrr::map(~unique(.))

nomi <- names(list_of_cars_by_cyl)

# I create a function in order to save split data frames in .xlsx with several sheets based on a second variable
save_to_excel <- function(x) {
  # list by new variable
  list_of_cars_by_gear <- x %>%
    dplyr::group_split(gear)

  # name list's elements
  names(list_of_cars_by_gear) <- list_of_cars_by_gear %>%
    purrr::map(~pull(., gear)) %>%      # get the values of the grouping variable
    purrr::map(~as.character(.)) %>%    # convert them to character
    purrr::map(~unique(.))

  # save to .xlsx
  list_of_cars_by_gear %>%
    writexl::write_xlsx(path = paste(cartelle[5], paste0("Cars_by_cyl_", nomi, "_", format(Sys.time(), format = "%d%m%Y_%H%M%S"), ".xlsx"), sep = "/"))
}

# run the function iteratively
list_of_cars_by_cyl %>%
  purrr::map(save_to_excel)
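In your example, you use the same file name for every list element: nomi is the same in every call to save_to_excel. There are (at least) two ways to fix this: build the correct file name entirely inside the save_to_excel function, or use purrr::map2 to iterate over list_of_cars_by_cyl and nomi simultaneously.
Both options produce the expected output (3 Excel files with the cyl number in the name, each with the corresponding sheets split by gear).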
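To see why the file name does not vary between calls, it may help to look at what the path expression evaluates to. The following is only a small sketch: nomi is written out by hand with the values it has for mtcars, and the timestamp and folder are left out for brevity. Because paste0() is vectorised, the expression in the question yields all three names on every call, rather than the single name that belongs to the current group.

```r
# nomi as built in the question: one name per cyl group (written out by hand here)
nomi <- c("4", "6", "8")

# What the original function computes on every call: a vector of three paths
paths_in_question <- paste0("Cars_by_cyl_", nomi, ".xlsx")
paths_in_question

# What each call needs instead: a single path for the current group
path_for_one_group <- paste0("Cars_by_cyl_", nomi[1], ".xlsx")
path_for_one_group
```

Both options in this answer amount to replacing the whole vector with the one element that matches the data frame being written.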
Option 1
Take the cyl part of the file name from the data frame passed to save_to_excel:
save_to_excel <- function(x) {
  # list by new variable
  list_of_cars_by_gear <- x %>%
    dplyr::group_split(gear)

  # name list's elements
  names(list_of_cars_by_gear) <- list_of_cars_by_gear %>%
    purrr::map(~pull(., gear)) %>%      # get the values of the grouping variable
    purrr::map(~as.character(.)) %>%    # convert them to character
    purrr::map(~unique(.))

  # cyl part for current file name
  this_cyl <- unique(x$cyl)

  # save to .xlsx
  list_of_cars_by_gear %>%
    writexl::write_xlsx(path = paste(
      # cartelle[5],  # not defined in the example code
      paste0("Cars_by_cyl_", this_cyl, "_", format(Sys.time(), format = "%d%m%Y_%H%M%S"), ".xlsx"),
      sep = "/"))
}

list_of_cars_by_cyl %>%
  purrr::map(save_to_excel)
#$`4`
#[1] "...\Cars_by_cyl_4_25032022_220454.xlsx"
#$`6`
#[1] "...\Cars_by_cyl_6_25032022_220454.xlsx"
#$`8`
#[1] "...\Cars_by_cyl_8_25032022_220454.xlsx"
Option 2
Add a second argument to save_to_excel and iterate over both the data frames and the names:
save_to_excel <- function(x, name) {
  # list by new variable
  list_of_cars_by_gear <- x %>%
    dplyr::group_split(gear)

  # name list's elements
  names(list_of_cars_by_gear) <- list_of_cars_by_gear %>%
    purrr::map(~pull(., gear)) %>%      # get the values of the grouping variable
    purrr::map(~as.character(.)) %>%    # convert them to character
    purrr::map(~unique(.))

  # save to .xlsx
  list_of_cars_by_gear %>%
    writexl::write_xlsx(path = paste(
      # cartelle[5],  # not defined in the example code
      paste0("Cars_by_cyl_", name, "_", format(Sys.time(), format = "%d%m%Y_%H%M%S"), ".xlsx"),
      sep = "/"))
}

# run the function iteratively
# over dataframe list and nomi
list_of_cars_by_cyl %>%
  purrr::map2(., nomi, save_to_excel)
#$`4`
#[1] "...\Cars_by_cyl_4_25032022_221107.xlsx"
#$`6`
#[1] "...\Cars_by_cyl_6_25032022_221107.xlsx"
#$`8`
#[1] "...\Cars_by_cyl_8_25032022_221107.xlsx"
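One way to check the written files without opening Excel is to work out which sheets each file should contain and compare. The sketch below uses only base R; split() is swapped in for group_split() here because it names the list elements by itself, and expected_sheets is a name introduced just for this illustration.

```r
# For each cyl group, list the gear values that occur in it.
# These are the sheet names the corresponding workbook should contain.
expected_sheets <- lapply(
  split(mtcars, mtcars$cyl),
  function(d) as.character(sort(unique(d$gear)))
)
expected_sheets
```

If the readxl package is installed, readxl::excel_sheets() applied to one of the written files should return the same names as the matching element of expected_sheets.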
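purrr options
Since the question also asks about understanding the purrr logic, here are a few different ways to formulate the last step. All give the same result; which one is easiest to use/understand is, IMHO, a matter of personal preference.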
# refer to first, second argument with .x, .y
purrr::map2(list_of_cars_by_cyl, nomi, ~save_to_excel(x = .x, name = .y))
# refer to first, second argument with ..1, ..2
purrr::map2(list_of_cars_by_cyl, nomi, ~save_to_excel(x = ..1, name = ..2))
# define anonymous function with appropriate number of arguments
purrr::map2(list_of_cars_by_cyl, nomi, function(x, y) {save_to_excel(x, y)})
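As a further variation, and only as a sketch: when the list is already named, purrr's imap() family passes each element together with its name, so a separate nomi vector is not strictly needed. The example below builds the list with base split() (which names the elements by cyl) and, to stay free of side effects, only constructs the file names instead of writing files; the timestamp is omitted and file_names is a name introduced for this illustration.

```r
library(purrr)

# split() returns a list named "4", "6", "8", so no manual naming step is required
list_of_cars_by_cyl <- split(mtcars, mtcars$cyl)

# imap_chr() calls the function with .x = the data frame and .y = its name
file_names <- imap_chr(
  list_of_cars_by_cyl,
  ~ paste0("Cars_by_cyl_", .y, ".xlsx")
)
file_names
```

With the two-argument save_to_excel from Option 2, the same pattern would be purrr::iwalk(list_of_cars_by_cyl, ~ save_to_excel(x = .x, name = .y)); iwalk() is intended for functions called for their side effects, such as writing files.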