R: compile Excel files by passing them as a list in dplyr
I have two Excel files that I want to compile into a single data frame in R.
First, create the Excel files needed to reproduce the problem:
df1<- cbind.data.frame(var1= rnorm(10,4,1), var2= rnorm(10,5,1), var3= rnorm(10,7,4)) # create df1
df2<- cbind.data.frame(var1= rnorm(10,4,1), var2= rnorm(10,5,1), var3= rnorm(10,7,4)) # create df2
wb1<- openxlsx::createWorkbook() # create empty workbook1
wb2<- openxlsx::createWorkbook() # create empty workbook2
openxlsx::addWorksheet(wb1, "df1") # add sheet 1 to wb1
openxlsx::addWorksheet(wb1, "df2") # add sheet 2 to wb1
openxlsx::addWorksheet(wb2, "df1") # add sheet 1 to wb2
openxlsx::addWorksheet(wb2, "df2") # add sheet 2 to wb2
openxlsx::writeData(wb1, "df1", df1) # write df1
openxlsx::writeData(wb1, "df2", df2) # write df2
openxlsx::writeData(wb2, "df1", df1) # write df1
openxlsx::writeData(wb2, "df2", df2) # write df2
openxlsx::saveWorkbook(wb1, 'wb1.xlsx') # save wb1
openxlsx::saveWorkbook(wb2, 'wb2.xlsx') # save wb2
The code below compiles a single, specified Excel file into a data frame, but I would like to take all the files and compile them programmatically:
library(openxlsx)   # getSheetNames(), read.xlsx()
library(tidyverse)  # %>%, set_names(), map()
dfCompiled <- 'wb1.xlsx' %>% # rename DF and input your file name here
  getSheetNames() %>%
  set_names() %>%
  map(read.xlsx, xlsxFile = 'wb1.xlsx', # file name here
      colNames = TRUE) %>%
  as.data.frame()
Print the data frame to verify that it worked:
> dfCompiled
  df1.var1 df1.var2    df1.var3 df2.var1 df2.var2   df2.var3
1 3.356598 4.441104  7.95931350 3.968744 3.349242  2.1997116
2 3.151004 4.822166  0.39571905 4.679021 6.230923 12.8589661
3 3.581085 6.367498 -0.06415929 5.810634 4.207270  9.9430692
...
What is the best way to run the following list through these statements so that all the sheets are compiled into one data frame?
filelist<- list("wb1.xlsx", "wb2.xlsx" )
Something like:
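library(openxlsx)   # getSheetNames(), read.xlsx()
library(tidyverse)
# Read every sheet of one workbook and bind the sheets side by side
df_compile <- function(file){
  getSheetNames(file) %>%
    set_names() %>%
    map(read.xlsx, xlsxFile = file,
        colNames = TRUE) %>%
    as.data.frame()
}
filelist <- list("wb1.xlsx", "wb2.xlsx")
# Assuming same column names per xlsx file
map(filelist, df_compile) %>%
  bind_rows() # stack the per-file data frames by row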
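If you would rather keep track of where each row came from, one alternative is a long layout: stack the sheets by row instead of binding them side by side, and record the workbook and sheet in their own columns. The following is only a sketch under some assumptions: every sheet has the same columns, and the names `demo_dir`, `make_df`, `read_workbook_long`, `compiled_long`, and the `file` and `sheet` columns are made up for illustration. It writes its own demo workbooks to a temporary folder so it can be run as-is, and it uses `list.files()` to find the workbooks rather than typing the list by hand.

```r
library(openxlsx)
library(purrr)
library(dplyr)

# Demo data: two workbooks with two sheets each, written to a scratch
# folder (hypothetical location; point `demo_dir` at your own folder).
demo_dir <- tempfile("xlsx_demo_")
dir.create(demo_dir)

make_df <- function() {
  data.frame(var1 = rnorm(10, 4, 1),
             var2 = rnorm(10, 5, 1),
             var3 = rnorm(10, 7, 4))
}

for (wb_name in c("wb1.xlsx", "wb2.xlsx")) {
  wb <- createWorkbook()
  for (sheet_name in c("df1", "df2")) {
    addWorksheet(wb, sheet_name)
    writeData(wb, sheet_name, make_df())
  }
  saveWorkbook(wb, file.path(demo_dir, wb_name), overwrite = TRUE)
}

# Hypothetical helper: read every sheet of one workbook and stack the
# sheets by row, keeping the sheet name in a `sheet` column.
read_workbook_long <- function(file) {
  getSheetNames(file) %>%
    set_names() %>%
    map(function(s) read.xlsx(file, sheet = s, colNames = TRUE)) %>%
    bind_rows(.id = "sheet")
}

# Discover the workbooks in the folder instead of listing them by hand.
files <- list.files(demo_dir, pattern = "\\.xlsx$", full.names = TRUE)

# Name each path by its file name so bind_rows() can record it in `file`.
compiled_long <- files %>%
  set_names(basename(files)) %>%
  map(read_workbook_long) %>%
  bind_rows(.id = "file")

head(compiled_long)
```

Compared with `df_compile` above, this gives one row per original row with columns `file`, `sheet`, `var1`, `var2`, `var3`, which tends to be easier to filter or group on than the wide `df1.var1`, `df2.var1`, ... layout. Which shape is better depends on what you do with the data afterwards.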