PowerShell:要插入到 .CSV 中的模式匹配文本文件内容
PowerShell: Pattern Matching Text File Contents to Insert into .CSV
我一直在努力成功地分解文本文件的内容并将它们插入到具有以下规则的 .csv 中:
- 包含“>”的行应插入到 .csv 第 1 列
- 应将包含所有大写字母的行插入 .csv 的第 2 列,并且应连接每个大写字母块(删除其 `r 或 `n)
- '>' 和 '*' 应在存在的地方删除
另外,我可以使用以下方法使第 1 列运行良好:
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
if ($line -match '^>') {
[pscustomobject]@{
'Part1' = (Select-String '^>' -InputObject $line) -replace '>', ''
}
}
}
$data | Out-File 'newfile.csv'
并且对第 2 列使用类似的方法取得的成功有限(我似乎无法 -join
与 `r 或 `n 一起工作):
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
if ($line -match '^[A-Z].*') {
[pscustomobject]@{
'Part2' = (Select-String '^[A-Z].*' -InputObject $line) -replace '*', ''
}
}
}
$data | Out-File 'newfile.csv'
但我不知道如何让两者在同一个代码块中工作以遍历由“>”分隔的每个部分and/or“*”。
以下是数据样本,供参考。
>9392290|2983921
FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIA
GDFOUYUIOAGHIHUAGSD
>lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1
FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAW
DFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNUPAPNUNPUFSAPNUSS
FSADUHHULGWAUNUNWEANNIOEAWNUNIIIINNBSDNJLKNJKLAERGJKLHHJLKGS
DFSAQSAHUSDFAHOUHGROUGRWE*
>jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31
ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPI
GWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIOAGPIOJSGNJHIOWEA
AUHNHIOEANPIASPNIOICBNIOASGIOEGWPIOWEPPPPSAJPOJKGPWEAIOJJPIO
FAWEIOPHGAHNIOPGWEOPPOEAWSPIOOPUIGSUIOGUIOPWAGIEOUIWEAOGUIOP
GEIOJHIOJPWEPJIOWGEIOPHGANIONIOGEWANIOEGWOPIHNNPIOEGWIJOWEAG
GEPUIEWUIOSZBHJENWNBENUEBMIPEWVMIEMUIAZWIPNBWEPEWIOJJKEAWPIA
GWEPHIOEWNPOEWANNNPIOGWREIJUOGUHIOSNJJJJJJJJKVMVIOIPEGIOEAUW
EGWIOJNENIOPIOWINPEAWNPOI*
我建议使用 -split
操作:
(Get-Content -Raw samplefile.txt) -split '(?m)^>(.+)' -ne '' |
ForEach-Object -Begin { $i = 0 } -Process {
if (++$i % 2) { # 1st, 3rd, ... result, i.e. the ">"-prefixed lines
$part1 = $_ # Save for later.
} else { # 2nd, 4th, ... result, i.e. the all-uppercase lines
[pscustomobject] @{ # Construct and output a custom object.
Part1 = $part1
Part2 = $_ -replace '\r?\n|\*$' # Remove newlines and trailing "*"
}
}
} # pipe to Export-Csv as needed.
To-display 输出:
Part1 Part2
----- -----
9392290|2983921 FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIAGDFOUYUIOAGHIHUAGSD
lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1 FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAWDFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNU…
jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31 ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPIGWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIO…
我一直在努力成功地分解文本文件的内容并将它们插入到具有以下规则的 .csv 中:
- 包含“>”的行应插入到 .csv 第 1 列
- 应将包含所有大写字母的行插入 .csv 的第 2 列,并且应连接每个大写字母块(删除其 `r 或 `n)
- '>' 和 '*' 应在存在的地方删除
另外,我可以使用以下方法使第 1 列运行良好:
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
if ($line -match '^>') {
[pscustomobject]@{
'Part1' = (Select-String '^>' -InputObject $line) -replace '>', ''
}
}
}
$data | Out-File 'newfile.csv'
并且对第 2 列使用类似的方法取得的成功有限(我似乎无法 -join
与 `r 或 `n 一起工作):
$file = (Get-Content 'samplefile.txt')
$data = foreach ($line in $file) {
if ($line -match '^[A-Z].*') {
[pscustomobject]@{
'Part2' = (Select-String '^[A-Z].*' -InputObject $line) -replace '*', ''
}
}
}
$data | Out-File 'newfile.csv'
但我不知道如何让两者在同一个代码块中工作以遍历由“>”分隔的每个部分and/or“*”。
以下是数据样本,供参考。
>9392290|2983921 FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIA GDFOUYUIOAGHIHUAGSD >lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1 FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAW DFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNUPAPNUNPUFSAPNUSS FSADUHHULGWAUNUNWEANNIOEAWNUNIIIINNBSDNJLKNJKLAERGJKLHHJLKGS DFSAQSAHUSDFAHOUHGROUGRWE* >jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31 ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPI GWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIOAGPIOJSGNJHIOWEA AUHNHIOEANPIASPNIOICBNIOASGIOEGWPIOWEPPPPSAJPOJKGPWEAIOJJPIO FAWEIOPHGAHNIOPGWEOPPOEAWSPIOOPUIGSUIOGUIOPWAGIEOUIWEAOGUIOP GEIOJHIOJPWEPJIOWGEIOPHGANIONIOGEWANIOEGWOPIHNNPIOEGWIJOWEAG GEPUIEWUIOSZBHJENWNBENUEBMIPEWVMIEMUIAZWIPNBWEPEWIOJJKEAWPIA GWEPHIOEWNPOEWANNNPIOGWREIJUOGUHIOSNJJJJJJJJKVMVIOIPEGIOEAUW EGWIOJNENIOPIOWINPEAWNPOI*
我建议使用 -split
操作:
(Get-Content -Raw samplefile.txt) -split '(?m)^>(.+)' -ne '' |
ForEach-Object -Begin { $i = 0 } -Process {
if (++$i % 2) { # 1st, 3rd, ... result, i.e. the ">"-prefixed lines
$part1 = $_ # Save for later.
} else { # 2nd, 4th, ... result, i.e. the all-uppercase lines
[pscustomobject] @{ # Construct and output a custom object.
Part1 = $part1
Part2 = $_ -replace '\r?\n|\*$' # Remove newlines and trailing "*"
}
}
} # pipe to Export-Csv as needed.
To-display 输出:
Part1 Part2
----- -----
9392290|2983921 FYUOIQWEFYUOIAGSNJJJHKEWAHJKTHJEWUYIYGUIOIOIUYAFUIOWUEYOUYIAGDFOUYUIOAGHIHUAGSD
lsm.VI.superconfig_5640.1|lsm.model.superconfig_5640.1 FDASJKLHJKLGAHJKDFGHJKAGJKHUIGAHIULGRUOUHWWUGUIOHZIOJSHIJMAWDFSANJKLNJLWEQUIOGFDSOIYUBHPOGANUPPUNABNPUNU…
jfi.ZJ.superconfig_99.31|jfi.model.superconfig_99.31 ASDFUIOHPOASPNADPUNPNUSADFNUPPUOHZSABUHBAHPUDASPHAWHPOEWGHPIGWANUEGWUNPNPEANUPUNPEAWUPOGDFPOAGIJJIEOAWIO…