使用 R 将数据帧列表保存到一个 excel 中,其中包含许多工作表

Save a list of dataframes to one excel with many sheets with R

给定由以下代码生成的数据帧列表 dfs

df <- structure(list(id = c("M0000607", "M0000609", "M0000612"), `2021-08(actual)` = c(12.6, 
19.2, 8.3), `2021-09(actual)` = c(10.3, 17.3, 6.4), `2021-10(actual)` = c(8.9, 
15.7, 5.3), `2021-11(actual)` = c(7.3, 14.8, 3.1), `2021-12(actual)` = c(6.1, 
14.2, 3.5), `2021-08(pred)` = c(11.65443222, 14.31674997, 7.084180415
), `2021-09(pred)` = c(12.29810914, 17.7143733, 6.057927385), 
    `2021-10(pred)` = c(9.619846116, 15.54553601, 6.525992602
    ), `2021-11(pred)` = c(8.352097939, 13.97318204, 3.164682627
    ), `2021-12(pred)` = c(6.113631596, 14.16243166, 3.288372517
    ), `2021-08(error)` = c(2.082307066, 1.146759554, 0.687406723
    ), `2021-09(error)` = c(1.631350383, 2.753457736, 2.952737781
    ), `2021-10(error)` = c(0.945567783, 4.883250027, 1.215819585
    ), `2021-11(error)` = c(1.998109138, 0.414373304, 0.342072615
    ), `2021-12(error)` = c(0.719846116, 0.154463985, 1.225992602
    )), class = "data.frame", row.names = c(NA, -3L))

year_months <- c('2021-12', '2021-11', '2021-10')  
curr <- lubridate::ym(year_months)
prev <- curr - months(2L)
dfs <- mapply(function(x, y) {
  df[c(
    "id", 
    format(seq.Date(y, x, by = "month"), "%Y-%m(actual)"), 
    format(x, "%Y-%m(pred)"), 
    format(x, "%Y-%m(error)")
  )]
}, curr, prev, SIMPLIFY = FALSE)
print(dfs)

输出:

[[1]]
        id 2021-10(actual) 2021-11(actual) 2021-12(actual) 2021-12(pred) 2021-12(error)
1 M0000607             8.9             7.3             6.1      6.113632      0.7198461
2 M0000609            15.7            14.8            14.2     14.162432      0.1544640
3 M0000612             5.3             3.1             3.5      3.288373      1.2259926

[[2]]
        id 2021-09(actual) 2021-10(actual) 2021-11(actual) 2021-11(pred) 2021-11(error)
1 M0000607            10.3             8.9             7.3      8.352098      1.9981091
2 M0000609            17.3            15.7            14.8     13.973182      0.4143733
3 M0000612             6.4             5.3             3.1      3.164683      0.3420726

[[3]]
        id 2021-08(actual) 2021-09(actual) 2021-10(actual) 2021-10(pred) 2021-10(error)
1 M0000607            12.6            10.3             8.9      9.619846      0.9455678
2 M0000609            19.2            17.3            15.7     15.545536      4.8832500
3 M0000612             8.3             6.4             5.3      6.525993      1.2158196

如果 dfs 有大量元素,我如何使用 apply 族函数或 purrr::map 将它们保存到一个 excel 和许多 [=42] =]s?

PS:sheet 将由最后一列名称的 %Y-%m 命名。

保存多个数据帧而不应用函数:

library(openxlsx)
# define sheet names for each data frame
dataset_names <- list('Sheet1' = dfs[1], 'Sheet2' = dfs[2], 'Sheet3' = dfs[3])
# export each data frame to separate sheets in same Excel file
openxlsx::write.xlsx(dataset_names, file = 'mydata.xlsx') 

获取最后一列的名称:

rev(colnames(dfs[[1]]))[1]

输出:

"2021-12(error)"

对于此示例数据,我将有一个 excel 文件 mydata.xlsx,其中 sheet 个名称:2021-10, 2021-11, 2021-12.

你可以这样做:

library(tidyverse)
names(dfs) <- lapply(dfs, function(x) x |> select(last_col()) |> names())

openxlsx::write.xlsx(dfs, file = 'mydata.xlsx') 

更新:如果你想要通用的“Sheet X”名称,你可以这样做:

names(dfs) <- paste0("Sheet", 1:length(dfs))

更新 2:删除“(错误)”部分:

names(dfs) <- str_remove(lapply(dfs, function(x) x |> select(last_col()) |> names()), "\(error\)")