How to write multiple excel files with multiple sheets based on a variable of a split data frame in R (tidyverse)

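I want to split a data frame in order to generate multiple Excel files with multiple sheets, where the sheets are based on another variable of the split data frame. The data frame I am using is the toy dataset mtcars. I split it by cyl in order to create multiple files whose sheets are based on the variable gear.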

So, I would like to get three Excel files, named as follows:

Each containing:

What I did seems to overwrite the files.

This is what I did:

library(tidyverse)
library(writexl)

# I split the data frame
list_of_cars_by_cyl <- mtcars %>%
      dplyr::group_split(cyl)

# I gave names to split data frame elements
names(list_of_cars_by_cyl) <- list_of_cars_by_cyl %>%
      purrr::map(~pull(.,cyl)) %>% # get the values of the grouping variable
      purrr::map(~as.character(.)) %>% # convert them to character
      purrr::map(~unique(.))

nomi <- names(list_of_cars_by_cyl)

# I create a function in order to save split data frames in .xlsx with several sheets based on a second variable
save_to_excel <- function(x) {
      # list by new variable
      list_of_cars_by_gear <- x %>%
            dplyr::group_split(gear)
      # name list's elements
      names(list_of_cars_by_gear) <- list_of_cars_by_gear %>%
            purrr::map(~pull(., gear)) %>% # get the values of the grouping variable
            purrr::map(~as.character(.)) %>% # convert them to character
            purrr::map(~unique(.))
      # save to .xlsx
      list_of_cars_by_gear %>%
            writexl::write_xlsx(path = paste(cartelle[5], paste0("Cars_by_cyl_", nomi, "_", format(Sys.time(), format = "%d%m%Y_%H%M%S"), ".xlsx"), sep = "/"))
}

# run the function iteratively
list_of_cars_by_cyl %>% 
      purrr::map(save_to_excel)

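In your example you use the same file name for every list element: nomi is the same in each call to save_to_excel. There are (at least) two ways to fix this: either build the correct file name entirely inside the save_to_excel function, or use purrr::map2 to iterate over list_of_cars_by_cyl and nomi simultaneously.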

Both options produce the expected output (3 Excel files with the cyl number in the name, each with the corresponding sheets split by gear).
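To see why the original version cannot produce one distinct name per file, here is a minimal base R sketch. build_path is a hypothetical stand-in for the path-building part of save_to_excel, and nomi is typed in by hand rather than derived from the split list; the timestamp and folder are left out for brevity.

```r
# `nomi` as it would exist in the global environment after naming the split list
# (typed in by hand here; an assumption for illustration)
nomi <- c("4", "6", "8")

# Hypothetical, simplified stand-in for the path-building step of save_to_excel.
# Like the original, it ignores which data frame it received and reads the global `nomi`.
build_path <- function(x) {
  paste0("Cars_by_cyl_", nomi, ".xlsx")
}

# Simulate the three calls made by purrr::map()
paths_per_call <- lapply(1:3, build_path)

# Each call builds the same vector of three names, not one name per call
print(paths_per_call[[1]])
print(all(vapply(paths_per_call, identical, logical(1), paths_per_call[[1]])))
```

Because nothing in the function depends on the current list element, every call ends up with the same candidate names, which is consistent with the files appearing to be overwritten.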

Option 1

Get the cyl part of the file name from the data frame passed to save_to_excel:

save_to_excel <- function(x) {
      # list by new variable
      list_of_cars_by_gear <- x %>%
            dplyr::group_split(gear)
      # name list's elements
      names(list_of_cars_by_gear) <- list_of_cars_by_gear %>%
            purrr::map(~pull(., gear)) %>% # get the values of the grouping variable
            purrr::map(~as.character(.)) %>% # convert them to character
            purrr::map(~unique(.))

      # cyl part for current file name
      this_cyl <- unique(x$cyl)
      
      # save to .xlsx
      list_of_cars_by_gear %>%
            writexl::write_xlsx(path = paste(
              #cartelle[5], #not defined example code,
              paste0("Cars_by_cyl_", this_cyl, "_", format(Sys.time(), format = "%d%m%Y_%H%M%S"), ".xlsx"), 
              sep = "/"))
}

list_of_cars_by_cyl %>% 
  purrr::map(save_to_excel)

#$`4`
#[1] "...\Cars_by_cyl_4_25032022_220454.xlsx"
#$`6`
#[1] "...\Cars_by_cyl_6_25032022_220454.xlsx"
#$`8`
#[1] "...\Cars_by_cyl_8_25032022_220454.xlsx"
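As a variation on Option 1, the splitting and naming steps could probably be shortened with base R's split(), which returns a list already named by the grouping values. This is only a sketch under a few assumptions: it swaps dplyr::group_split for split(), writes to tempdir() instead of the undefined cartelle[5], takes a hypothetical out_dir argument, and leaves the timestamp out of the file name.

```r
library(writexl)

# Assumption: write into the session's temporary directory
out_dir <- tempdir()

save_to_excel <- function(x, out_dir) {
  # split() names each element after its gear value, so the sheets get named automatically
  sheets <- split(x, x$gear)

  # cyl part of the file name, taken from the data frame itself
  this_cyl <- unique(x$cyl)
  path <- file.path(out_dir, paste0("Cars_by_cyl_", this_cyl, ".xlsx"))

  # a named list of data frames is written as one sheet per element
  writexl::write_xlsx(sheets, path = path)
}

# one call per cyl value; each call returns the path it wrote to
paths <- lapply(split(mtcars, mtcars$cyl), save_to_excel, out_dir = out_dir)
```

Note that split() on a plain data frame keeps it as a data.frame rather than a tibble, and writexl does not write row names, so the car names in mtcars would not be carried over.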

Option 2

Add a second argument to save_to_excel and iterate over both the data frames and the names:

save_to_excel <- function(x, name) {
  # list by new variable
  list_of_cars_by_gear <- x %>%
    dplyr::group_split(gear)
  # name list's elements
  names(list_of_cars_by_gear) <- list_of_cars_by_gear %>%
    purrr::map(~pull(., gear)) %>% # get the values of the grouping variable
    purrr::map(~as.character(.)) %>% # convert them to character
    purrr::map(~unique(.))
  # save to .xlsx
  list_of_cars_by_gear %>%
    writexl::write_xlsx(path =  paste(
              #cartelle[5], #not defined example code,
              paste0("Cars_by_cyl_", name, "_", format(Sys.time(), format = "%d%m%Y_%H%M%S"), ".xlsx"), 
              sep = "/"))
}

# run the function iteratively
# over dataframe list and nomi
list_of_cars_by_cyl %>% 
  purrr::map2(., nomi, save_to_excel)

#$`4`
#[1] "...\Cars_by_cyl_4_25032022_221107.xlsx"
#$`6`
#[1] "...\Cars_by_cyl_6_25032022_221107.xlsx"
#$`8`
#[1] "...\Cars_by_cyl_8_25032022_221107.xlsx"

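purrr options

Since the question also asked about understanding the purrr logic, here are a few different ways to formulate the last step. All of them give the same result; which one is easiest to use/understand is, IMHO, a matter of personal preference.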

# refer to first, second argument with .x, .y
purrr::map2(list_of_cars_by_cyl, nomi, ~save_to_excel(x = .x, name = .y))
# refer to first, second argument with ..1, ..2
purrr::map2(list_of_cars_by_cyl, nomi, ~save_to_excel(x = ..1, name = ..2))
# define anonymous function with appropriate number of arguments
purrr::map2(list_of_cars_by_cyl, nomi, function(x, y) {save_to_excel(x, y)})
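One more possibility, since the list is already named: purrr::imap passes each element together with its name, so a separate nomi vector would not be needed. The sketch below is self-contained and uses describe_file, a hypothetical stand-in for save_to_excel that only builds a file name instead of writing anything; the list is created with base split() for brevity.

```r
library(purrr)

# named list of data frames, one per cyl value
cars_by_cyl <- split(mtcars, mtcars$cyl)

# Hypothetical stand-in for save_to_excel(x, name): returns a file name only
describe_file <- function(x, name) {
  paste0("Cars_by_cyl_", name, "_", nrow(x), "_rows.xlsx")
}

# imap() supplies the element as .x and its name as .y
file_names <- purrr::imap(cars_by_cyl, ~describe_file(x = .x, name = .y))
```

If the return values are not needed and the function is called only for its side effect of writing files, purrr::walk2 or purrr::iwalk may read more naturally than map2 or imap.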