Splitting file into smaller files, working script, but need some tweaks
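I have a script here that looks for a delimiter in a text file with several reports in it. The script saves each individual report as its own text document. The tweaks I am trying to achieve are:
In the middle of the data of each page there is - SPEC #: RX:<string>.
I want that string to be saved as the file name.
It currently saves from the delimiter down to the next one. This skips the first report and grabs every report after it. I want it to go from the delimiter UP to the next delimiter, but I haven't figured out how to do that yet.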
$InPC = "C:\Users\path"
Get-ChildItem -Path $InPC -Filter *.txt | ForEach-Object -Process {
    $basename = $_.BaseName
    $m = ( ( Get-Content $_.FullName | Where { $_ | Select-String "END OF REPORT" -Quiet } | Measure-Object | ForEach-Object { $_.Count } ) -ge 2)
    $a = 1
    if ($m) {
        Get-Content $_.FullName | % {
            If ($_ -match "END OF REPORT") {
                $OutputFile = "$InPC$basename _$a.txt"
                $a++
            }
            Add-Content $OutputFile $_
        }
        Remove-Item $_.FullName
    }
}
This works as described above: it outputs files with END OF REPORT at the top, and the first report in the file is left out because there is no END OF REPORT above it.
Edited code:
$InPC = 'C:\Path'
ForEach($File in Get-ChildItem -Path $InPC -Filter *.txt){
    $RepNum=0
    # Read the whole file as one string via its full path, then split at the marker
    ForEach($Report in (([IO.File]::ReadAllText($File.FullName) -split 'END OF REPORT\r?\n?' -ne ''))){
        if ($Report -match 'SPEC #: RX:(?<ReportFile>.*?)\.'){
            $ReportFile=$Matches.ReportFile
        }
        $OutputFile = "{0}\{1}_{2}_{3}.txt" -f $InPC,$File.BaseName,$ReportFile,++$RepNum
        $Report | Add-Content $OutputFile
    }
    # Remove-Item $File.FullName
}
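I suggest using regular expressions to
- read in the file with the -Raw parameter,
- split the file into sections at the marker END OF REPORT, and
- extract the string with the named capture group in 'SPEC #: RX:(?<ReportFile>.*?)\.'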
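The three steps above can be sketched as follows for PowerShell 3.0 or later, where Get-Content supports -Raw. This is only an illustration: Split-Reports is a hypothetical helper name, and reports.txt in the usage comment is a made-up file name.

```powershell
# Hypothetical helper: splits report text at the marker and extracts each report's string.
function Split-Reports {
    param([string]$Text)

    # Split at the marker (plus an optional line break) and drop empty or whitespace-only chunks.
    $Text -split 'END OF REPORT\r?\n?' | Where-Object { $_.Trim() } | ForEach-Object {
        # Named capture group: everything after 'RX:' up to the first dot.
        $name = if ($_ -match 'SPEC #: RX:(?<ReportFile>.*?)\.') { $Matches.ReportFile } else { $null }
        [pscustomobject]@{ Name = $name; Text = $_ }
    }
}

# Usage (assumed path): read the whole file as a single string, then split it.
# $reports = Split-Reports (Get-Content -Raw 'C:\Users\path\reports.txt')
```

Each returned object carries the extracted string in Name (or $null when the pattern does not match) and the report body in Text, so writing the files can be a separate step.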
Edit: a version that works in PowerShell v2
## Q:\Test19\SO_57911471.ps1
$InPC = 'C:\Users\path' # 'Q:\Test19\' #
ForEach($File in Get-ChildItem -Path $InPC -Filter *.txt){
    $RepNum=0
    ForEach($Report in (((Get-Content $File.FullName) -join "`n") -split 'END OF REPORT\r?\n?' -ne '')){
        if ($Report -match 'SPEC #: RX:(?<ReportFile>.*?)\.'){
            $ReportFile=$Matches.ReportFile
        }
        $OutputFile = "{0}\{1}_{2}_{3}.txt" -f $InPC,$File.BaseName,$ReportFile,++$RepNum
        $Report | Add-Content $OutputFile
    }
    # Remove-Item $File.FullName
}
With this sample text:
## Q:\Test19\SO_57911471.txt
I have a script here that looks for a delimiter in a text file with several reports in it.
In the middle of the data of each page there is -
SPEC #: RX:string1.
I want that string to be saved as the filename.
END OF REPORT
I have a script here that looks for a delimiter in a text file with several reports in it.
In the middle of the data of each page there is -
SPEC #: RX:string2.
I want that string to be saved as the filename.
END OF REPORT
the script produces:
> Get-ChildItem *string* -name
SO_57911471_string1_1.txt
SO_57911471_string2_2.txt
The added report number ($RepNum) is just a precaution in case the string can't be matched.
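To make that precaution a bit more explicit, a sketch like the one below could build the file name. It assumes a placeholder such as 'unknown' is acceptable when the pattern does not match, so that a failed match does not silently reuse the previous report's string. Get-ReportFileName and the placeholder are hypothetical and not part of the script above.

```powershell
# Hypothetical helper: builds "<BaseName>_<string>_<Number>.txt" for one report.
function Get-ReportFileName {
    param(
        [string]$BaseName,   # source file name without extension
        [string]$Report,     # text of one report
        [int]$Number         # running report number
    )

    # Start from a placeholder so a failed match never carries over an earlier value.
    $id = 'unknown'
    if ($Report -match 'SPEC #: RX:(?<ReportFile>.*?)\.') {
        $id = $Matches.ReportFile.Trim()
    }

    # Replace characters that the file system does not allow in file names.
    $id = $id.Split([IO.Path]::GetInvalidFileNameChars()) -join '_'

    '{0}_{1}_{2}.txt' -f $BaseName, $id, $Number
}

# Usage inside the inner loop (sketch):
# $OutputFile = Join-Path $InPC (Get-ReportFileName $File.BaseName $Report (++$RepNum))
```

The running number still keeps names unique when two reports share the same string or when none is found.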