How to extract the date in a file name and sort it to find the latest file?
I currently have several files in a folder. They contain daily stock updates, and the file names look like this:
Onhand Harian 12 Juli 2019.xlsx
Onhand Harian 13 Juli 2019.xlsx
Onhand Harian 14 Juli 2019.xlsx... and so on.
I only want to read the latest Excel file, based on the date in the file name. How can I do this? Thanks in advance.
If all your files follow the same naming pattern, you can do this:
# List all the file names in the folder
file_names <- list.files("/path/to/folder/", full.names = TRUE)
# Remove all unwanted characters and keep only the date,
# then convert the date string to an actual Date object.
# Note: "%B" matches full month names in the current locale.
dates <- as.Date(sub("Onhand Harian ", "",
                     sub("\\.xlsx$", "", basename(file_names))), "%d %B %Y")
# Sort them and take the latest file
file_to_read <- file_names[order(dates, decreasing = TRUE)[1]]
Also, if your files are generated daily, you could use file.info to select them based on their creation or modification time. The details are in the linked post.
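The file.info idea could be sketched roughly as follows. This is only an illustration, not the code from the linked post: the helper name latest_by_mtime is made up here, and it assumes the modification time of each file reflects the day it was produced, which stops being true if files are copied or re-saved later.

```r
# Hypothetical helper (not from the original answer): return the path of the
# most recently modified .xlsx file in a folder, or NA if there is none.
latest_by_mtime <- function(folder) {
  files <- list.files(folder, pattern = "\\.xlsx$", full.names = TRUE)
  if (length(files) == 0) {
    return(NA_character_)
  }
  # file.info() returns one row per file; mtime is the last modification time
  info <- file.info(files)
  files[order(info$mtime, decreasing = TRUE)[1]]
}

# Usage, with the same placeholder folder as above:
# file_to_read <- latest_by_mtime("/path/to/folder/")
```

Unlike the approach based on file names, this does not depend on the locale or on the naming pattern, but it trusts the file system timestamps rather than the date written in the name.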
I would do something like this:
library(stringr)
library(tidyverse)

x <- c("Onhand Harian 12 Juli 2019.xlsx",
       "Onhand Harian 13 Juli 2019.xlsx",
       "Onhand Harian 14 Juli 2019.xlsx")

# Map month names to month numbers. These are the German spellings;
# adjust them to the language used in your file names.
lookup <- set_names(seq_len(12),
                    c("Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
                      "August", "September", "Oktober", "November", "Dezember"))

enframe(x, name = NULL, value = "txt") %>%
  # Pull out "day month year"; month names have 3 to 9 characters (September is longest)
  mutate(txt_extract = str_extract(txt, "\\d{1,2} \\D{3,9} \\d{4}")) %>%
  separate(txt_extract, c("d", "m", "y"), remove = FALSE) %>%
  # Zero-pad the month and day so they can be parsed as %m and %d
  mutate(m = sprintf("%02d", lookup[m]),
         d = sprintf("%02d", as.integer(d))) %>%
  mutate(date = as.Date(str_c(y, m, d), format = "%Y%m%d")) %>%
  filter(date == max(date)) %>%
  pull(txt)
# "Onhand Harian 14 Juli 2019.xlsx"