Compare multiple elements in an object against multiple elements in another object of a different array
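Say [hypothetically] I have two .CSV files that I'm comparing to see which of my current members are original members. I wrote a nested ForEach-Object that compares each $name and $memberNumber from every object against all of the objects in the other file. It works fine, but it takes a long time, especially since each CSV has thousands of objects. Is there another way to approach this?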
Original_Members.csv
Name,Member_Number
Alice,1234
Jim,4567
Current_Members.csv
Name,Member_Number
Alice,4599
Jim,4567
Import-Csv "$home\Desktop\current_members.csv" |
    ForEach-Object {
        $name   = $_.Name
        $memNum = $_.Member_Number
        # Default to "No"; flip to "Yes" if any original row matches both fields
        $ogMember = "No"
        # Re-reading and scanning the original file for every current member is the slow part
        Import-Csv "$home\Desktop\original_members.csv" |
            ForEach-Object {
                If ($_.Name -eq $name -and $_.Member_Number -eq $memNum) {
                    $ogMember = "Yes"
                }
            }
        [pscustomobject]@{
            "Name"             = $name
            "Member Number"    = $memNum
            "Original Member?" = $ogMember
        }
    } |
    Select-Object "Name","Member Number","Original Member?" |
    Export-Csv "$home\Desktop\OG_Compare_$(Get-Date -UFormat "%d%b%Y").csv" -Append -NoTypeInformation
Assuming your two files look like this:
Original_Members.csv
Name, Member_Number
Alice, 1234
Jim, 4567
Current_Members.csv
Name, Member_Number
Alice, 4599
Jim, 4567
You can store the original member names in a System.Collections.Generic.HashSet<T> for constant-time lookups, instead of doing a linear search for each name. We can use System.Linq.Enumerable.ToHashSet to create a hash set from the string[] of names.
Then we can use Where-Object to filter the current names by checking whether the hash set contains the name with System.Collections.Generic.HashSet<T>.Contains(T), which is an O(1) method.
$originalMembers = Import-Csv -Path .\Original_Members.csv
$currentMembers = Import-Csv -Path .\Current_Members.csv

$originalMembersLookup = [Linq.Enumerable]::ToHashSet(
    [string[]]$originalMembers.Name,
    [StringComparer]::CurrentCultureIgnoreCase
)

$currentMembers |
    Where-Object {$originalMembersLookup.Contains($_.Name)}
This outputs the current members who were also original members:
Name Member_Number
---- -------------
Alice 4599
Jim 4567
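If ToHashSet is not available in your session (to my knowledge it was added in .NET Framework 4.7.2 and .NET Core 2.0), the same lookup can be built with the HashSet<T> constructor that takes a collection and a comparer. The following is only a minimal sketch of the name-only filter above: the sample rows are inlined with ConvertFrom-Csv so it runs without the files, and "Bob" and the $stillMembers variable are hypothetical additions for illustration.

```powershell
# Sample rows inlined with ConvertFrom-Csv (an assumption, so the sketch runs without the files).
$originalMembers = @'
Name,Member_Number
Alice,1234
Jim,4567
'@ | ConvertFrom-Csv

# "Bob" is a hypothetical extra row that should be filtered out.
$currentMembers = @'
Name,Member_Number
Alice,4599
Jim,4567
Bob,7890
'@ | ConvertFrom-Csv

# HashSet<T>(IEnumerable<T>, IEqualityComparer<T>) constructor instead of ToHashSet.
$originalMembersLookup = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]$originalMembers.Name,
    [StringComparer]::CurrentCultureIgnoreCase
)

# Keep only the current members whose name is in the lookup.
$stillMembers = @($currentMembers | Where-Object { $originalMembersLookup.Contains($_.Name) })
$stillMembers
```

The hash set is built once, so each current member costs one Contains call rather than a scan over every original row.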
Update
As requested in the comments, if we want to check both Name and Member_Number, we can concatenate the two strings and use the result for the lookup:
$originalMembers = Import-Csv -Path .\Original_Members.csv
$currentMembers = Import-Csv -Path .\Current_Members.csv

$originalMembersLookup = [Linq.Enumerable]::ToHashSet(
    [string[]]($originalMembers |
        ForEach-Object {
            $_.Name + $_.Member_Number
        }),
    [StringComparer]::CurrentCultureIgnoreCase
)

$currentMembers |
    Where-Object {$originalMembersLookup.Contains($_.Name + $_.Member_Number)}
Which now only returns:
Name Member_Number
---- -------------
Jim 4567
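One caveat with plain concatenation is that different pairs can produce the same key, for example "Jim45" + "67" and "Jim" + "4567". A possible variation, sketched below under stated assumptions, joins the two fields with a delimiter and emits a Yes/No column like the one in the question instead of filtering. The Get-MemberKey helper, the "|" delimiter (assumed never to appear in either field), the inlined sample rows, and the $report variable are all hypothetical and not part of the original answer.

```powershell
# Sample rows inlined with ConvertFrom-Csv (an assumption, so the sketch runs without the files).
$originalMembers = @'
Name,Member_Number
Alice,1234
Jim,4567
'@ | ConvertFrom-Csv

# "Jim45,67" is a hypothetical row that would collide with "Jim,4567" under plain concatenation.
$currentMembers = @'
Name,Member_Number
Alice,4599
Jim,4567
Jim45,67
'@ | ConvertFrom-Csv

# Hypothetical helper: builds "Name|Member_Number"; assumes "|" never occurs in either field.
function Get-MemberKey {
    param($Member)
    '{0}|{1}' -f $Member.Name, $Member.Member_Number
}

$originalMembersLookup = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]($originalMembers | ForEach-Object { Get-MemberKey $_ }),
    [StringComparer]::CurrentCultureIgnoreCase
)

# Annotate every current member rather than dropping the non-matches.
$report = @($currentMembers | ForEach-Object {
    [pscustomobject]@{
        'Name'             = $_.Name
        'Member Number'    = $_.Member_Number
        'Original Member?' = if ($originalMembersLookup.Contains((Get-MemberKey $_))) { 'Yes' } else { 'No' }
    }
})
$report
```

From here, $report could be piped to Export-Csv -NoTypeInformation as in the question.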