Powershell Compare-Object Output Separate Files for each SideIndicator

(This is probably something fairly simple that I'm missing, but I can't seem to figure it out and haven't found an answer in my searches.)

I need to compare two CSV files that have the same columns and output the row differences as shown below (the final output being Unicode text):

Suppose I have the following sample data:

File A:
Column1,Column2,Column3
Tommy,4133,20180204
Suzie,5200,20210112
Tammy,221,20201010

File B:
Column1,Column2,Column3
Tommy,4133,20180204
Nicky,5200,20190520

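Here is my current code. It borrows the hash-enabled Compare-Object2 from this site, because the built-in Compare-Object is too slow. FYI, I use Get-Content rather than Import-Csv because we are comparing whole rows, which makes it about 50 times faster. The $MyHeader variable just preserves the header column values of the original file.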

Compare-Object2 (Get-Content $FileA) (Get-Content $FileB) -PassThru |
Select-Object @{l=[string]$MyHeader;e={$_.InputObject}},
              @{n='Row Label'; e={ @{'=>' = 'Bad' ; '<=' = 'Good'}[$_.SideIndicator]}},
              @{n='Placeholder'; e={@{'*'='0'}['*']}} |
Sort-Object 'Row Label' -Descending | Export-Csv "$FinalCSV" -NoType;

#Removing " char to create CSV with original and added columns together
Set-Content "$FinalCSV" ((Get-Content "$FinalCSV") -replace '"');

#Convert csv to tab delimited
Import-Csv "$FinalCSV" | Export-Csv "$FinalTXT"  -NoTypeInformation -Delimiter "`t";

#Remove " char and convert to unicode
Set-Content -Encoding UNICODE "$FinalTXT" ((Get-Content "$FinalTXT") -replace '"')

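This works great (I know some of the steps at the end are redundant, but hey, it's the best I could do, so definitely feel free to fix those parts!) and creates a single output file containing both the Good and the Bad rows. Two files of 400,000 rows each take about 40 seconds.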

Result File:
Column1  Column2  Column3   Row Label  Placeholder
Suzie    5200     20210112  Good       0
Tammy    221      20201010  Good       0
Nicky    5200     20190520  Bad        0
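
For context on why a hash-based comparison is fast, here is a minimal sketch of the general idea. It is not the Compare-Object2 function itself (that code is not shown here); the variable names and the sample lines are assumptions, and unlike Compare-Object the HashSet lookup is case-sensitive by default.

```powershell
# Hypothetical sample lines standing in for (Get-Content $FileA) and
# (Get-Content $FileB), with the header row left out.
$LinesA = 'Tommy,4133,20180204', 'Suzie,5200,20210112', 'Tammy,221,20201010'
$LinesB = 'Tommy,4133,20180204', 'Nicky,5200,20190520'

# Build one hash set per file so each membership test is a constant-time lookup.
$SetA = [System.Collections.Generic.HashSet[string]]::new([string[]]$LinesA)
$SetB = [System.Collections.Generic.HashSet[string]]::new([string[]]$LinesB)

# Lines only in A correspond to '<=' and lines only in B to '=>'.
$OnlyInA = $LinesA.Where({ -not $SetB.Contains($_) })
$OnlyInB = $LinesB.Where({ -not $SetA.Contains($_) })

$OnlyInA
$OnlyInB
```

Each file is walked once against the other file's set, so the work grows with the number of lines rather than with the product of the two line counts.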

The problem is that I now need to create these as separate files: one Good file and one Bad file. So the new required output would be:

ResultFileGood:
Column1  Column2  Column3   Row Label  Placeholder
Suzie    5200     20210112  Good       0
Tammy    221      20201010  Good       0

ResultFileBad:
Column1  Column2  Column3   Row Label  Placeholder
Nicky    5200     20190520  Bad        0

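And I just know there has to be a way to do this without running the comparison twice, using Where-Object or some kind of loop. I just can't figure it out, so I'm coming to the experts.

Thanks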

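Edit: Thanks to postanote, a workable alternative is to output the combined file and then split it, which is definitely faster than running the whole comparison routine twice. I would still like to see whether there is a way to do this directly in the compare/export step without the intermediary file, but this is a viable option and it is the one I am currently using.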

$FinalHeader = get-content "$FinalTXT" | Select -First 1
# The result file is tab-delimited, so match the tabs around the label explicitly
$BadOutput = Select-String -Path $FinalTXT -Pattern '\tBad\t0$'
$GoodOutput = Select-String -Path $FinalTXT -Pattern '\tGood\t0$'
@($FinalHeader,$BadOutput.Line) | Out-File "$FinalBadTXT" -Encoding UNICODE;
@($FinalHeader,$GoodOutput.Line) | Out-File "$FinalGoodTXT" -Encoding UNICODE;
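
A possible variation on the split above reads the combined file only once and routes each line with a switch statement. This is a sketch under assumptions: the sample file written to the temp folder and the output file names are made up for illustration, and the regex patterns assume the label and placeholder are the last two tab-separated fields.

```powershell
# Hypothetical stand-in for the combined tab-delimited result file.
$TempDir  = [System.IO.Path]::GetTempPath()
$FinalTXT = Join-Path $TempDir 'FinalCombined.txt'
@(
    "Column1`tColumn2`tColumn3`tRow Label`tPlaceholder"
    "Suzie`t5200`t20210112`tGood`t0"
    "Tammy`t221`t20201010`tGood`t0"
    "Nicky`t5200`t20190520`tBad`t0"
) | Set-Content -Path $FinalTXT -Encoding Unicode

$Lines       = Get-Content -Path $FinalTXT      # the only read of the file
$FinalHeader = $Lines[0]
$GoodLines   = [System.Collections.Generic.List[string]]::new()
$BadLines    = [System.Collections.Generic.List[string]]::new()

# Route every data line to the matching list in a single pass.
switch -Regex ($Lines | Select-Object -Skip 1) {
    '\tGood\t0$' { $GoodLines.Add($_) }
    '\tBad\t0$'  { $BadLines.Add($_) }
}

# Write each list with the header on top (output names are assumptions).
@($FinalHeader; $GoodLines) | Out-File (Join-Path $TempDir 'FinalGood.txt') -Encoding Unicode
@($FinalHeader; $BadLines)  | Out-File (Join-Path $TempDir 'FinalBad.txt')  -Encoding Unicode
```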

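Continuing from my comment.

You have a lot going on there; i.e., some proxy functions, etc.

Mixing these items the way you are, you end up with something like this... (Very simplified, of course, and since your input was not shown, we are forced to guess and come up with one.)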

psEdit -filenames 'D:\temp\book1.txt'
# Results
<#
Site,Dept,Office,Floor
Main,aaa,bbb,ccc
Main0,aaa,bbb,ccc
Branch1,ddd,eee,fff
Branch2,ggg,hhh,iii
#>

psEdit -filenames 'D:\temp\book3.txt'
# Results
<#
Site,Dept,Office,Floor
Main,aaa,bbb,ccc
Branch1,ddd,eee,fff
Branch2,ggg,hhh,iii
Branch3,jjj,kkk,lll
Branch4,mmm,nnn,ooo
#>

Update:

Removed all the previous stuff, since it was not to your liking...

;-}

Compare-Object2 -ReferenceObject (Get-Content -Path 'D:\temp\book1.txt') -DifferenceObject (Get-Content -Path 'D:\temp\book3.txt') | 
Export-Csv -Path 'D:\Temp\CompareObject.csv' -NoTypeInformation -Force

(Select-String -Path 'D:\Temp\CompareObject.csv' -Pattern '\<=') -replace '.*CompareObject.*:\"|\"\,.*' | 
ConvertFrom-Csv -Header Site, Dept, Office, Floor | 
Export-Csv -Path 'D:\temp\ReferenceObject.csv' -NoTypeInformation -Force

(Select-String -Path 'D:\Temp\CompareObject.csv' -Pattern '\=>') -replace '.*CompareObject.*:\"|\"\,.*' | 
ConvertFrom-Csv -Header Site, Dept, Office, Floor | 
Export-Csv -Path 'D:\temp\DifferenceObject.csv' -NoTypeInformation -Force

$FileList = 'ReferenceObject.csv', 'DifferenceObject.csv'

$FileList | 
ForEach-Object {
    "`n********* Getting content $PSItem *********`n"
    Import-Csv -Path "D:\temp\$PSItem"
}
# Results
<#
********* Getting content ReferenceObject.csv *********


Site    Dept Office Floor
----    ---- ------ -----
Main0   aaa  bbb    ccc  

********* Getting content DifferenceObject.csv *********

Branch3 jjj  kkk    lll  
Branch4 mmm  nnn    ooo 
#>

So, as for your last comment:


While that method still uses the intermediary file; I admit I completely wasn't thinking about the simple approach of just exporting the combined file and then just splitting that.

OK then, without the 'intermediary file'.

($ComparedObjects = Compare-Object2 -ReferenceObject (Get-Content -Path 'D:\temp\book1.txt') -DifferenceObject (Get-Content -Path 'D:\temp\book3.txt'))
# Results
<#
InputObject         SideIndicator
-----------         -------------
Main0,aaa,bbb,ccc   <=           
Branch3,jjj,kkk,lll =>           
Branch4,mmm,nnn,ooo => 
#>

($ComparedObjects -match '<=').InputObject | 
ConvertFrom-Csv -Header Site, Dept, Office, Floor 
# Results
<#
Site  Dept Office Floor
----  ---- ------ -----
Main0 aaa  bbb    ccc  
#>

($ComparedObjects -match '=>').InputObject | 
ConvertFrom-Csv -Header Site, Dept, Office, Floor 
# Results
<#
Site    Dept Office Floor
----    ---- ------ -----
Branch3 jjj  kkk    lll  
Branch4 mmm  nnn    ooo 
#>
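
Note that -match against the whole object matches on its string form, so a data row that happens to contain '<=' or '=>' would be picked up as well. As a sketch of an alternative (the sample objects below are assumptions shaped like the results above), the SideIndicator property can be tested directly, and the .Where() method in 'Split' mode returns both sides from one pass:

```powershell
# Hypothetical comparison results, shaped like the Compare-Object2 output above.
$ComparedObjects = @(
    [pscustomobject]@{ InputObject = 'Main0,aaa,bbb,ccc';   SideIndicator = '<=' }
    [pscustomobject]@{ InputObject = 'Branch3,jjj,kkk,lll'; SideIndicator = '=>' }
    [pscustomobject]@{ InputObject = 'Branch4,mmm,nnn,ooo'; SideIndicator = '=>' }
)

# 'Split' mode yields two collections: the items that satisfy the
# condition first, then everything else.
$OnlyInReference, $OnlyInDifference =
    $ComparedObjects.Where({ $_.SideIndicator -eq '<=' }, 'Split')

$OnlyInReference.InputObject  | ConvertFrom-Csv -Header Site, Dept, Office, Floor
$OnlyInDifference.InputObject | ConvertFrom-Csv -Header Site, Dept, Office, Floor
```

The .Where() method requires PowerShell 4.0 or later; on older versions, two Where-Object filters on SideIndicator would do the same at the cost of a second pass.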

Then export to CSV.

($ComparedObjects -match '<=').InputObject | 
ConvertFrom-Csv -Header Site, Dept, Office, Floor | 
Export-Csv -Path 'D:\temp\ReferenceObject.csv' -NoTypeInformation -Force

($ComparedObjects -match '=>').InputObject | 
ConvertFrom-Csv -Header Site, Dept, Office, Floor | 
Export-Csv -Path 'D:\temp\DifferenceObject.csv' -NoTypeInformation -Force

Read them back as needed.

$FileList = 'ReferenceObject.csv', 'DifferenceObject.csv'

$FileList | 
ForEach-Object {
    "`n********* Getting content $PSItem *********`n"
    Import-Csv -Path "D:\temp\$PSItem"
}
# Results
<#
********* Getting content ReferenceObject.csv *********


Site    Dept Office Floor
----    ---- ------ -----
Main0   aaa  bbb    ccc  

********* Getting content DifferenceObject.csv *********

Branch3 jjj  kkk    lll  
Branch4 mmm  nnn    ooo  
#>

Update

Per your comment:


'the problem is the final output need: the Unicode Tab-delimited text with the additional columns.'


(($ComparedObjects -match '<=').InputObject) -replace ',', "`t" | 
ConvertFrom-Csv -Delimiter "`t" -Header Site, Dept, Office, Floor  | 
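# -Delimiter is needed on export too; otherwise the file is written comma-delimited
Export-Csv -Path 'D:\temp\ReferenceObject.csv' -Delimiter "`t" -Encoding Unicode -NoTypeInformation -Force
Import-Csv -Path 'D:\temp\ReferenceObject.csv' -Delimiter "`t"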
# Results
<#
Site  Dept Office Floor
----  ---- ------ -----
Main0 aaa  bbb    ccc  
#>


(($ComparedObjects -match '=>').InputObject) -replace ',', "`t" | 
ConvertFrom-Csv -Delimiter "`t" -Header Site, Dept, Office, Floor  | 
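# -Delimiter is needed on export too; otherwise the file is written comma-delimited
Export-Csv -Path 'D:\temp\DifferenceObject.csv' -Delimiter "`t" -Encoding Unicode -NoTypeInformation -Force
Import-Csv -Path 'D:\temp\DifferenceObject.csv' -Delimiter "`t"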
# Results
<#
Site    Dept Office Floor
----    ---- ------ -----
Branch3 jjj  kkk    lll  
Branch4 mmm  nnn    ooo  
#>

Or, for the additional columns, you could do this...

$ComparedObjects -match '<=' | 
Select-Object -Property @{
    Name       = 'Site'
    Expression = {($PSItem.InputObject -split ',')[0]}
},
@{
    Name       = 'Dept'
    Expression = {($PSItem.InputObject -split ',')[1]}
},
@{
    Name       = 'Office'
    Expression = {($PSItem.InputObject -split ',')[2]}
},
@{
    Name       = 'Floor'
    Expression = {($PSItem.InputObject -split ',')[3]}
},
@{
    Name       = 'Label'
    Expression = {'Good'}
}, 
@{
    Name       = 'Placeholder'
    Expression = {0}
} |  
Export-Csv -Path 'D:\temp\ReferenceObject.csv' -Encoding Unicode -NoTypeInformation -Force
# Strip the quotes, switch to tabs, and keep the Unicode encoding when rewriting
(Get-Content -Path 'D:\temp\ReferenceObject.csv') -replace '"','' -replace ',', "`t" | 
Set-Content -PassThru 'D:\temp\ReferenceObject.csv' -Encoding Unicode
Import-Csv -Path 'D:\temp\ReferenceObject.csv' -Delimiter "`t" | 
Format-Table -AutoSize
# Results
<#
Site  Dept Office Floor Label Placeholder
----  ---- ------ ----- ----- -----------
Main0 aaa  bbb    ccc   Good  0 
#>


$ComparedObjects -match '=>' | 
Select-Object -Property @{
    Name       = 'Site'
    Expression = {($PSItem.InputObject -split ',')[0]}
},
@{
    Name       = 'Dept'
    Expression = {($PSItem.InputObject -split ',')[1]}
},
@{
    Name       = 'Office'
    Expression = {($PSItem.InputObject -split ',')[2]}
},
@{
    Name       = 'Floor'
    Expression = {($PSItem.InputObject -split ',')[3]}
},
@{
    Name       = 'Label'
    Expression = {'Bad'}   # '=>' rows exist only in the difference file
}, 
@{
    Name       = 'Placeholder'
    Expression = {0}
} | 
Export-Csv -Path 'D:\temp\DifferenceObject.csv' -Encoding Unicode -NoTypeInformation -Force
# Strip the quotes, switch to tabs, and keep the Unicode encoding when rewriting
(Get-Content -Path 'D:\temp\DifferenceObject.csv') -replace '"','' -replace ',', "`t" | 
Set-Content -PassThru 'D:\temp\DifferenceObject.csv' -Encoding Unicode
Import-Csv -Path 'D:\temp\DifferenceObject.csv' -Delimiter "`t" | 
Format-Table -AutoSize
# Results
<#
Site    Dept Office Floor Label Placeholder
----    ---- ------ ----- ----- -----------
Branch3 jjj  kkk    lll   Bad   0          
Branch4 mmm  nnn    ooo   Bad   0 
#>
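
As a further sketch, the tab-delimited Unicode files with the two extra columns could be written directly from the comparison results, with no CSV export and no quote stripping in between. The sample objects, the temp output folder, and the Result*.txt file names are assumptions for illustration, and the plain -split on ',' assumes no field contains a quoted comma.

```powershell
# Hypothetical comparison results, shaped like the Compare-Object2 output above.
$ComparedObjects = @(
    [pscustomobject]@{ InputObject = 'Main0,aaa,bbb,ccc';   SideIndicator = '<=' }
    [pscustomobject]@{ InputObject = 'Branch3,jjj,kkk,lll'; SideIndicator = '=>' }
    [pscustomobject]@{ InputObject = 'Branch4,mmm,nnn,ooo'; SideIndicator = '=>' }
)

$Header = 'Site,Dept,Office,Floor'            # header line of the source files
$Labels = @{ '<=' = 'Good'; '=>' = 'Bad' }    # label for each SideIndicator
$OutDir = [System.IO.Path]::GetTempPath()     # assumed output folder

# Header row: the original columns plus the two extra ones, tab-separated.
$HeaderLine = (($Header -split ',') + 'Row Label', 'Placeholder') -join "`t"

# One group per SideIndicator, so each output file is written exactly once.
foreach ($Group in ($ComparedObjects | Group-Object -Property SideIndicator)) {
    $Label = $Labels[$Group.Name]
    $Rows = foreach ($Item in $Group.Group) {
        # Naive split: assumes no quoted fields containing commas.
        (($Item.InputObject -split ',') + $Label, '0') -join "`t"
    }
    $Path = Join-Path $OutDir "Result$Label.txt"
    @($HeaderLine; $Rows) | Set-Content -Path $Path -Encoding Unicode
}
```

Because the rows are built as plain strings, nothing gets quoted in the first place, and the comparison itself still runs only once.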