Obtaining timestamp from filename with PowerShell
I need to group files by the date in their file names.
Example:
- input (folder)
-- random_folder_name_1 (folder)
--- 01-Apr-19, 10_33_37_Sample_1.pdf
-- random_folder_name_2 (folder)
--- some_other_file.pdf
--- 04-Apr-19, 14_33_37_Sample_15.pdf
...
All files follow the template: %datestamp%, %timestamp%_%keyword%
I need to sort them into:
- output (folder)
-- %datestamp% (folder)
--- %keyword%.pdf
I have implemented walking through the input folder and searching for pdf files, but I am stuck on obtaining the date stamp.
$origin_folder = "input"
$destination_folder = "output"
$origin = Join-Path -Path $(Get-Location) -ChildPath "$origin_folder"
$destination = Join-Path -Path $(Get-Location) -ChildPath "$destination_folder"
$files = Get-ChildItem -Path $origin -Recurse -Filter *.pdf
# RegEx for date stamp as day-3_letters_of_month-year
$regex = "\d{2}-\D{3}-\d{2}"
foreach ($file in $files) {
    $source_file = $file.FullName
    $datestamp = [regex]::Matches($file.BaseName, $regex)
    Write-Output "$datestamp"
}
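For some reason $datestamp is an empty string. What is wrong here?
Also, how can I subtract a regex match from the file name?
Say, from the file name %datestamp%, %timestamp%_%keyword%.pdf
subtract %datestamp%, %timestamp%_
to get %keyword%.pdf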
Final script: working version
$origin_folder = "input"
$destination_folder = "output"
$origin = Join-Path -Path $(Get-Location) -ChildPath "$origin_folder"
$destination = Join-Path -Path $(Get-Location) -ChildPath "$destination_folder"
# Get all files in subfolders
$files = Get-ChildItem -Path $origin -Recurse -Filter *.pdf
# Date Regular Expression
# '2 digits of day'-'3 symbols of month'-'2 digits of year'
# Equals to template 'dd-MMM-yy'
$date_regex = "\d{2}\-\w{1,3}\-\d{2}"
# Ballast Regular Expressions
# Equals to template 'dd-MMM-yy, hh_mm_ss_'
$ballast_regex = "\d{2}\-\w{1,3}\-\d{2}, \d{2}_\d{2}_\d{2}_"
# Walk through all found files
foreach ($file in $files) {
    # Get the full address of file which needs to be copied
    $source_file = $file.FullName
    # Get the datestamp from filename; skip files that do not follow the template
    $match = [regex]::Match($file.BaseName, $date_regex)
    if (-not $match.Success) { continue }
    # Convert into usable format with digits only in filename
    # (InvariantCulture so that 'Apr' parses regardless of the system locale)
    $datestamp = [datetime]::ParseExact($match.Value, 'dd-MMM-yy', [cultureinfo]::InvariantCulture).ToString('yyyy-MM-dd')
    # Take the name of sample from filename
    $keyword = $file.Name -replace $ballast_regex
    # Path of the folder named after the date stamp
    $destination_subfolder = Join-Path -Path $destination -ChildPath $datestamp
    # Create the folder based on datestamp if it doesn't exist
    if (!(Test-Path -Path $destination_subfolder)) {
        # Create folder silently
        # To make it "as usual" : remove " | Out-Null" from the end
        New-Item -Path $destination_subfolder -ItemType Directory -Force | Out-Null
    }
    # Path of file where it will be copied, but with changed name to sample name only
    $destination_file = Join-Path -Path $destination_subfolder -ChildPath $keyword
    # Copy actual file
    Copy-Item $source_file -Destination $destination_file
}
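The two string operations at the heart of the script can be tried on a single sample name without touching any files. This is only a sketch: the name is the example from the question, and it uses [regex]::Match(...).Value together with InvariantCulture so that the result does not depend on the system locale.

```powershell
# Sample file name taken from the question; no files are read or copied here.
$name = '01-Apr-19, 10_33_37_Sample_1.pdf'

# Same patterns as in the script.
$date_regex    = '\d{2}\-\w{1,3}\-\d{2}'
$ballast_regex = '\d{2}\-\w{1,3}\-\d{2}, \d{2}_\d{2}_\d{2}_'

# Match() returns a single Match object; .Value is the matched text.
$raw = [regex]::Match($name, $date_regex).Value

# Parse and format with InvariantCulture so 'Apr' and the calendar are locale-independent.
$culture   = [System.Globalization.CultureInfo]::InvariantCulture
$datestamp = [datetime]::ParseExact($raw, 'dd-MMM-yy', $culture).ToString('yyyy-MM-dd', $culture)

# -replace with no replacement string removes the matched prefix.
$keyword = $name -replace $ballast_regex

"$datestamp -> $keyword"   # 2019-04-01 -> Sample_1.pdf
```

Omitting the second operand of -replace is what "subtracting" the regex amounts to: the matched text is replaced with an empty string.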
I changed your regex from
$regex = "\d{2}-\D{3}-\d{4}"
to this:
$regex = "\d{2}\-\w{1,3}\-\d{2,4}"
Now it gets the date correctly.
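Judging only from the two patterns quoted in this answer, a likely explanation is that the first one demands a four-digit year while the file names carry two digits. A quick check against the sample name from the question (the variable names are illustrative):

```powershell
# Base name of the sample file from the question.
$name = '01-Apr-19, 10_33_37_Sample_1'

$old = '\d{2}-\D{3}-\d{4}'          # requires a four-digit year
$new = '\d{2}\-\w{1,3}\-\d{2,4}'    # accepts two to four digits

$oldHit = $name -match $old         # False: "19" is followed by a comma, not two more digits
$newHit = $name -match $new         # True
$found  = $Matches[0]               # 01-Apr-19 (set by the successful match)

"$oldHit $newHit $found"
```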
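So, the fact that it is a date does not appear to matter. You are not trying to parse it, you just want its raw text. So here you go: I worked up a RegEx that gets it, along with the sample name from the end of the file name.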
'01-Apr-19, 10_33_37_Sample_1.pdf'|?{$_ -match '^(.+?), \d\d_\d\d_\d\d_(.+)\....$'}|%{$Matches[1],$Matches[2]}
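The one-liner is dense, so here is the same idea spelled out with named capture groups. The group names date and keyword are illustrative choices, not part of the original answer, and the pattern keeps the original assumption of a three-character extension.

```powershell
$name = '01-Apr-19, 10_33_37_Sample_1.pdf'

# (?<date>.+?)    lazily captures everything before the first ", hh_mm_ss_"
# (?<keyword>.+)  captures the rest, up to the dot and a three-character extension
$pattern = '^(?<date>.+?), \d\d_\d\d_\d\d_(?<keyword>.+)\....$'

if ($name -match $pattern) {
    $date    = $Matches['date']       # 01-Apr-19
    $keyword = $Matches['keyword']    # Sample_1
    "$date / $keyword"
}
```

Because the date is kept as raw text, the folder would be named 01-Apr-19 rather than 2019-04-01; which form is preferable depends on whether the folders need to sort chronologically.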