Windows PowerShell: How to compare two CSV files and output the rows that are in either file but not in both

I have a file first.csv

name,surname,height,city,county,state,zipCode
John,Doe,120,jefferson,Riverside,NJ,8075
Jack,Yan,220,Phila,Riverside,PA,9119
Jill,Fan,120,jefferson,Riverside,NJ,8075
Steve,Tan,220,Phila,Riverside,PA,9119
Alpha,Fan,120,jefferson,Riverside,NJ,8075

second.csv

name,surname,height,city,county,state,zipCode
John,Doe,120,jefferson,Riverside,NJ,8075
Jack,Yan,220,Phila,Riverside,PA,9119
Jill,Fan,120,jefferson,Riverside,NJ,8075
Steve,Tan,220,Phila,Riverside,PA,9119
Bravo,Tan,220,Phila,Riverside,PA,9119

I want to compare the rows of first.csv and second.csv and output the rows that are in either first.csv or second.csv, but not in both.

So output.csv should contain

Alpha,Fan,120,jefferson,Riverside,NJ,8075
Bravo,Tan,220,Phila,Riverside,PA,9119

There are many similar questions, but their output is not what I am looking for.

Thanks

$filea = Import-Csv C:\Powershell\TestCSVs\group1.csv
$fileb = Import-Csv C:\Powershell\TestCSVs\group2.csv

Compare-Object $filea $fileb -Property name, surname, height, city, county, state, zipCode | Select-Object name, surname, height, city, county, state, zipCode | Export-Csv C:\Powershell\TestCSVs\out.csv -NoTypeInformation

I am using all of the fields to compare and sort here, but you can instead specify only a unique value to match rows on.

Output

"name","surname","height","city","county","state","zipCode"     
"Bravo","Tan","220","Phila","Riverside","PA","9119"             
"Alpha","Fan","120","jefferson","Riverside","NJ","8075"
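
If you match on only some of the columns, a plain `-Property` comparison returns just those columns. A minimal sketch of one way around that, assuming `name` and `surname` together identify a row (an assumption, not something stated in the question): add `-PassThru` so the original row objects come back, each tagged with a `SideIndicator` property. The inline sample data is hypothetical and stands in for the two `Import-Csv` calls.

```powershell
# Hypothetical inline data standing in for Import-Csv on the two files
$first = @'
name,surname,height,city,county,state,zipCode
John,Doe,120,jefferson,Riverside,NJ,8075
Alpha,Fan,120,jefferson,Riverside,NJ,8075
'@ | ConvertFrom-Csv

$second = @'
name,surname,height,city,county,state,zipCode
John,Doe,120,jefferson,Riverside,NJ,8075
Bravo,Tan,220,Phila,Riverside,PA,9119
'@ | ConvertFrom-Csv

# Compare on name and surname only (assumed to be the unique key).
# -PassThru returns the original row objects, so every column is kept,
# and each object gets a SideIndicator note property.
$diff = Compare-Object -ReferenceObject $first -DifferenceObject $second -Property name, surname -PassThru

# SideIndicator '<=' means the row is only in $first, '=>' only in $second
$diff | Select-Object name, surname, height, city, county, state, zipCode, SideIndicator
```

Dropping `SideIndicator` from the `Select-Object` list and piping to `Export-Csv -NoTypeInformation` gives a file in the same shape as the output above.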

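Getting the symmetric difference (everything that is not related) from two lists is actually a quite common use case when comparing objects. Therefore, I have added this feature (#30) to the Join-Object script/Join-Object Module (see also: In Powershell, what's the best way to join two tables into one?).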

For this specific question:

PS C:\> Import-Csv .\First.csv |OuterJoin (Import-Csv .\Second.csv) |Format-Table

name  surname height city      county    state zipCode
----  ------- ------ ----      ------    ----- -------
Alpha Fan     120    jefferson Riverside NJ    8075
Bravo Tan     220    Phila     Riverside PA    9119
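
For comparison, the same symmetric difference can be sketched without any module by treating each CSV line as a string and using the .NET `HashSet[string]` method `SymmetricExceptWith`. This is only an illustration, not how Join-Object works internally; it assumes PowerShell 5 or later (for `::new`), assumes rows are compared as exact, case-sensitive text, and uses hypothetical inline lines in place of `Get-Content .\First.csv` and `Get-Content .\Second.csv`.

```powershell
# Hypothetical sample lines standing in for Get-Content on the two files
$firstLines = @(
    'name,surname,height,city,county,state,zipCode'
    'John,Doe,120,jefferson,Riverside,NJ,8075'
    'Alpha,Fan,120,jefferson,Riverside,NJ,8075'
)
$secondLines = @(
    'name,surname,height,city,county,state,zipCode'
    'John,Doe,120,jefferson,Riverside,NJ,8075'
    'Bravo,Tan,220,Phila,Riverside,PA,9119'
)

# Remember the header: it is present in both files, so the
# symmetric difference below removes it along with the shared rows.
$header = $firstLines[0]

# Load the first file's lines into a set, then keep only the lines
# that occur in exactly one of the two inputs.
$set = [System.Collections.Generic.HashSet[string]]::new([string[]]$firstLines)
$set.SymmetricExceptWith([string[]]$secondLines)

# Put the header back in front of the remaining rows
$result = @($header) + @($set)
$result
```

Because the lines are compared as raw text, a row that differs only in quoting, whitespace, or letter case counts as different, whereas `Compare-Object` on imported objects ignores case by default.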