Splitting file into smaller files, working script, but need some tweaks

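I have a script here that looks for a delimiter in a text file containing several reports. The script saves each individual report as its own text document. The tweaks I am trying to achieve are: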

In the middle of the data of each page there is - SPEC #: RX:<string>. I want that string to be saved as the file name.

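It currently saves from a delimiter down to the next one. This skips the first report and grabs every report after it. I want it to go from the delimiter UP to the next delimiter, but I haven't figured out how to do that yet.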

$InPC = "C:\Users\path"
Get-ChildItem -Path $InPC -Filter *.txt | ForEach-Object -Process {
    $basename = $_.BaseName
    # Only process files that contain at least two END OF REPORT markers
    $m = ( ( Get-Content $_.FullName | Where { $_ | Select-String "END OF REPORT" -Quiet } | Measure-Object | ForEach-Object { $_.Count } ) -ge 2)
    $a = 1
    if ($m) {
        Get-Content $_.FullName | % {
            # A new output file starts at each delimiter line
            If ($_ -match "END OF REPORT") {
                $OutputFile = "$InPC$basename _$a.txt"
                $a++
            }
            Add-Content $OutputFile $_
        }
        Remove-Item $_.FullName
    }
}

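This works, but as mentioned above, it outputs files with END OF REPORT at the top, and the first report in the file is left out because there is no END OF REPORT above it.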

Edited code:

$InPC = 'C:\Path'

ForEach($File in Get-ChildItem -Path $InPC -Filter *.txt){
    $RepNum=0
    # Read the whole file as one string, then split it at each END OF REPORT marker
    ForEach($Report in ([IO.File]::ReadAllText($File.FullName) -split 'END OF REPORT\r?\n?' -ne '')){
        if ($Report -match 'SPEC #: RX:(?<ReportFile>.*?)\.'){
            $ReportFile=$Matches.ReportFile
        }
        $OutputFile = "{0}\{1}_{2}_{3}.txt" -f  $InPC,$File.BaseName,$ReportFile,++$RepNum
        $Report | Add-Content $OutputFile
    }
    # Remove-Item $File.FullName
}

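I suggest using regular expressions to

  • read in the file with the -raw parameter and
  • split the file into sections at the marker END OF REPORT
  • use 'SPEC #: RX:(?<ReportFile>.*?)\.' with a named capture group to extract the string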
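
The three steps above can be sketched roughly like this for PowerShell v3 or later, where Get-Content has the -Raw switch. The function name Split-ReportText and the inline sample text are hypothetical, made up only for illustration; in a real run the text would come from Get-Content $File.FullName -Raw.

```powershell
# Hypothetical helper (not part of the original script): splits raw text at
# the END OF REPORT marker and pulls the SPEC string out of each section.
function Split-ReportText {
    param([string]$Text)

    # -split treats the pattern as a regex; -ne '' drops empty sections
    foreach ($Report in ($Text -split 'END OF REPORT\r?\n?' -ne '')) {
        $Name = $null   # stays $null when the section has no SPEC line
        if ($Report -match 'SPEC #: RX:(?<ReportFile>.*?)\.') {
            $Name = $Matches.ReportFile
        }
        New-Object PSObject -Property @{ Name = $Name; Content = $Report }
    }
}

# Made-up sample standing in for: Get-Content $File.FullName -Raw
$Sample = "SPEC #: RX:string1.`nline A`nEND OF REPORT`nSPEC #: RX:string2.`nline B`nEND OF REPORT`n"
$Parts = @(Split-ReportText -Text $Sample)
$Parts | ForEach-Object { $_.Name }
```

Each returned object carries the extracted name and the report text, so writing the files becomes a separate step that is easy to test without touching the disk.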

Edit: a version that works with PowerShell v2 (the -Raw parameter was only added in v3)

## Q:\Test19\SO_57911471.ps1
$InPC = 'C:\Users\path' # 'Q:\Test19\' # 

ForEach($File in Get-ChildItem -Path $InPC -Filter *.txt){
    $RepNum=0
    ForEach($Report in (((Get-Content $File.FullName) -join "`n") -split 'END OF REPORT\r?\n?' -ne '')){

        if ($Report -match 'SPEC #: RX:(?<ReportFile>.*?)\.'){
            $ReportFile=$Matches.ReportFile
        }
        $OutputFile = "{0}\{1}_{2}_{3}.txt" -f  $InPC,$File.BaseName,$ReportFile,++$RepNum
        $Report | Add-Content $OutputFile
    }
    # Remove-Item $File.FullName
}

With this sample text:

## Q:\Test19\SO_57911471.txt
I have a script here that looks for a delimiter in a text file with several reports in it.  
In the middle of the data of each page there is - 
SPEC #: RX:string1.  
I want that string to be saved as the filename.
END OF REPORT

I have a script here that looks for a delimiter in a text file with several reports in it.  
In the middle of the data of each page there is - 
SPEC #: RX:string2.  
I want that string to be saved as the filename.
END OF REPORT

this yields:

> Get-ChildItem *string* -name
SO_57911471_string1_1.txt
SO_57911471_string2_2.txt

The appended ReportNum is just a precaution in case the string can't be grepped.
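
One thing to watch in the loop above: $ReportFile is only assigned when the match succeeds, so a report without a SPEC line would reuse the string from the previous report. A minimal sketch of one way around that, assuming a placeholder name of 'unknown' (my choice, not something from the original script) and a hypothetical helper Get-ReportFileName:

```powershell
# Hypothetical helper: builds the output file name for a single report.
function Get-ReportFileName {
    param([string]$Report, [string]$BaseName, [int]$RepNum)

    # Reset for every report so a failed match never reuses an old value.
    # 'unknown' is an assumed placeholder, not part of the original answer.
    $ReportFile = 'unknown'
    if ($Report -match 'SPEC #: RX:(?<ReportFile>.*?)\.') {
        $ReportFile = $Matches.ReportFile
    }
    '{0}_{1}_{2}.txt' -f $BaseName, $ReportFile, $RepNum
}

$First  = Get-ReportFileName -Report "SPEC #: RX:string1.`n" -BaseName 'SO_57911471' -RepNum 1
$Second = Get-ReportFileName -Report "no spec line here`n" -BaseName 'SO_57911471' -RepNum 2
$First
$Second
```

The running number still keeps the file names unique when several reports fall back to the placeholder.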