Re-assembling split file names with PowerShell
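I'm having trouble re-assembling certain file names (and discarding the rest) from a text file. The file names are split up (usually across three lines), and there is always a blank line after each file name. I only want to keep file names that begin with OPEN or FOUR. An example is: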
OPEN.492820.EXTR
A.STANDARD.38383
333

FOUR.383838.282.
STAND.848484.NOR
MAL.3939

CLOSE.3480384.ST
ANDARD.39393939.
838383
My desired output is:
OPEN.492820.EXTRA.STANDARD.38383333
FOUR.383838.282.STAND.848484.NORMAL.3939
Thanks for any advice!
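Read the file one line at a time and keep concatenating the lines until you encounter a blank line. At that point, output the concatenated string, then repeat until you reach the end of the file: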
# this variable will keep track of the partial file names
$fileName = ''
# use a switch to read the file and process each line
switch -Regex -File ('path\to\file.txt') {
    # when we see a blank line...
    '^\s*$' {
        # ... we output the assembled name if it starts with the right word
        if($fileName -cmatch '^(OPEN|FOUR)'){ $fileName }
        # and then start over
        $fileName = ''
    }
    default {
        # must be a non-blank line, concatenate it to the previous ones
        $fileName += $_
    }
}
# remember to check and output the last one
if($fileName -cmatch '^(OPEN|FOUR)'){
    $fileName
}
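If you want to try the line-by-line approach without a file on disk, here is a minimal sketch that wraps the same logic in a pipeline function. The function name Join-SplitFileName and the $sample array are my own assumptions for illustration; they are not part of the original answer.

```powershell
# Hypothetical helper (name chosen for illustration only): joins the lines it
# receives from the pipeline into names and emits only the names that start
# with OPEN or FOUR (case-sensitive, like -cmatch in the answer).
function Join-SplitFileName {
    param(
        [Parameter(ValueFromPipeline)]
        [AllowEmptyString()]
        [string] $Line
    )
    begin { $fileName = '' }
    process {
        if ($Line -match '^\s*$') {
            # blank line: the current name is complete
            if ($fileName -cmatch '^(OPEN|FOUR)') { $fileName }
            $fileName = ''
        }
        else {
            # non-blank line: append it to the partial name
            $fileName += $Line
        }
    }
    end {
        # the last name may not be followed by a blank line
        if ($fileName -cmatch '^(OPEN|FOUR)') { $fileName }
    }
}

# Sample input from the question, with an empty string for each blank line
$sample = @(
    'OPEN.492820.EXTR', 'A.STANDARD.38383', '333', '',
    'FOUR.383838.282.', 'STAND.848484.NOR', 'MAL.3939', '',
    'CLOSE.3480384.ST', 'ANDARD.39393939.', '838383'
)
$result = $sample | Join-SplitFileName
$result
```

With a real file, the same function should work as Get-Content 'path\to\file.txt' | Join-SplitFileName, since Get-Content emits one string per line.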
The following worked for me; you can give it a try.
- See https://regex101.com/r/JuzXOb/1 for an explanation of the regex.
$source = 'fullpath/to/inputfile.txt'
$destination = 'fullpath/to/resultfile.txt'
[regex]::Matches(
    (Get-Content $source -Raw),
    '(?msi)^(OPEN|FOUR)(.*?|\s*?)+([\r\n]$|\z)'
).Value.ForEach({ -join($_ -split '\r?\n').ForEach('Trim') }) |
    Out-File $destination
For testing:
$txt = @'
OPEN.492820.EXTR
A.STANDARD.38383
333

FOUR.383838.282.
STAND.848484.NOR
MAL.3939

CLOSE.3480384.ST
ANDARD.39393939.
838383

OPEN.492820.EXTR
A.EXAMPLE123

FOUR.383838.282.
STAND.848484.123
ZXC
'@
[regex]::Matches(
    $txt,
    '(?msi)^(OPEN|FOUR)(.*?|\s*?)+([\r\n]$|\z)'
).Value.ForEach({ -join($_ -split '\r?\n').ForEach('Trim') })
Output:
OPEN.492820.EXTRA.STANDARD.38383333
FOUR.383838.282.STAND.848484.NORMAL.3939
OPEN.492820.EXTRA.EXAMPLE123
FOUR.383838.282.STAND.848484.123ZXC
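If the single regex feels hard to maintain, a simpler variation may be enough: split the raw text on blank lines first, then join and filter each block. This is only a sketch of that alternative, not the method used above. The function name Get-JoinedName is hypothetical, and it assumes that names are separated by at least one blank line and that the prefix check should be case-sensitive.

```powershell
# Hypothetical helper (name chosen for illustration only): splits raw text
# into blank-line-separated blocks and joins the lines of each block.
function Get-JoinedName {
    param([string] $Text)
    # two or more consecutive line breaks (optionally followed by spaces or
    # tabs) mean there is at least one blank line between two blocks
    $Text -split '(?:\r?\n[ \t]*){2,}' |
        ForEach-Object {
            # trim every line of the block and glue the pieces together
            ($_ -split '\r?\n' | ForEach-Object { $_.Trim() }) -join ''
        } |
        Where-Object { $_ -cmatch '^(OPEN|FOUR)' }   # keep OPEN/FOUR names only
}

# Sample input from the question; '' stands for a blank line
$text = @(
    'OPEN.492820.EXTR', 'A.STANDARD.38383', '333', '',
    'FOUR.383838.282.', 'STAND.848484.NOR', 'MAL.3939', '',
    'CLOSE.3480384.ST', 'ANDARD.39393939.', '838383'
) -join "`n"
$names = Get-JoinedName -Text $text
$names
```

For a file, something like Get-JoinedName -Text (Get-Content $source -Raw) should behave the same way, because -Raw returns the whole file as a single string.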