Powershell

Question

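As a follow-up suggested by Doug on my previous question about anonymizing a file, I need to save all the hashtable values in a single file, "tmp.txt", for further processing. Example: after processing an input file that contains a string like this: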

<requestId>qwerty-qwer12-qwer56</requestId>

the tmp.txt file contains:

qwerty-qwer12-qwer56 : RequestId-1

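That is perfect. The problem is that when many strings are processed, the tmp.txt file contains more pairs than it should. In my tmp.txt example below, I should see "RequestId-x" 4 times, but there are actually 6. Also, when there are 2 or more "matches" on the same line, only the first one is updated/replaced. Any idea where these extra lines come from? And why doesn't the script keep checking through to the end of the same line?
Here is my test code: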

$log = "C:\log.txt"
$tmp = "C:\tmp.txt"
Clear-Content $log
Clear-Content $tmp

@'
<requestId>qwerty-qwer12-qwer56</requestId>qwertykeyId>Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5</ABC reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId>
<requestId>zxcvbn-zxcv12-zxcv56</requestId>
<requestId>qwerty-qwer12-qwer56</requestId>abcde reportId>plmkjh8765FGH4rt6As</msg:reportId>
<requestId>1234qw-12qw12-12qw56</requestId>
keyId>Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5</
keyId>Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5</
keyId>Zdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdZdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdLkJpQw</
reportId>plmkjh8765FGH4rt6As</msg:reportId>
reportId>plmkjh8765FGH4rt6As</msg:reportId>
reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId>
'@ | Set-Content $log -Encoding UTF8

$requestId = @{
    Count   = 1
    Matches = @()
}
$keyId  = @{
    Count   = 1
    Matches = @()
}
$reportId  = @{
    Count   = 1
    Matches = @()
}

$output = switch -Regex -File $log {
    '(\w{6}-\w{6}-\w{6})' {
        if(!$requestId.matches.($matches.1))
        {
            $req = $requestId.matches += @{$matches.1 = "RequestId-$($requestId.count)"}
            $requestId.count++
            $req.keys | %{ Add-Content $tmp "$_ : $($req.$_)" }
        }
        $_ -replace $matches.1,$requestId.matches.($matches.1)               
    }
    'keyId>(\w{70})</' {
        if(!$keyId.matches.($matches.1))
        {
            $kid = $keyId.matches += @{$matches.1 = "keyId-$($keyId.count)"} 
            $keyId.count++
            $kid.keys | %{ Add-Content $tmp "$_ : $($kid.$_)" }
        }
        $_ -replace $matches.1,$keyId.matches.($matches.1)        
    }
    'reportId>(\w{19})</msg:reportId>' {
        if(!$reportId.matches.($matches.1))
        {
            $repid = $reportId.matches += @{$matches.1 = "Report-$($reportId.count)"}
            $reportId.count++
            $repid.keys | %{ Add-Content $tmp "$_ : $($repid.$_)" }
        }
        $_ -replace $matches.1,$reportId.matches.($matches.1)
    } 
    default {$_}
}

$output | Set-Content $log -Encoding UTF8

Get-Content $log
Get-Content $tmp

Answer 1

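If you don't care about the order in which they were found, and I assume you don't if you don't want duplicates, then simply export them all at the end. I would still keep them in "object" form so that you can easily import/export them. CSV would be an ideal candidate for this data.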

$requestId,$keyid,$reportid | Foreach-Object {
    foreach($key in $_.matches.keys)
    {
        [PSCustomObject]@{
            Original    = $key
            Replacement = $_.matches.$key
        }
    }
}

The data output to the console in this example:

Original                                                               Replacement
--------                                                               -----------
qwerty-qwer12-qwer56                                                   RequestId-1
zxcvbn-zxcv12-zxcv56                                                   RequestId-2
1234qw-12qw12-12qw56                                                   RequestId-3
Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5 keyId-1    
Zdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdZdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdLkJpQw keyId-2    
poGd56Hnm9q3Dfer6Jh                                                    Report-1   
plmkjh8765FGH4rt6As                                                    Report-2

Simply pipe it to Export-Csv:

$requestId,$keyid,$reportid | Foreach-Object {
    foreach($key in $_.matches.keys)
    {
        [PSCustomObject]@{
            Original    = $key
            Replacement = $_.matches.$key
        }
    }
} | Export-Csv $tmp -NoTypeInformation
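
To read the file back in later, something like the following should work. This is only a minimal sketch: it assumes the CSV has the Original and Replacement columns produced by the export above, and the file path, the sample rows, and the $lookup variable name are hypothetical stand-ins so the snippet runs on its own.

```powershell
# Hypothetical path and sample rows, standing in for the real exported file.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'tmp-mappings.csv'

@(
    [PSCustomObject]@{ Original = 'qwerty-qwer12-qwer56'; Replacement = 'RequestId-1' }
    [PSCustomObject]@{ Original = 'zxcvbn-zxcv12-zxcv56'; Replacement = 'RequestId-2' }
) | Export-Csv $csvPath -NoTypeInformation

# Rebuild a single lookup hashtable: one key per Original value.
$lookup = @{}
Import-Csv $csvPath | ForEach-Object {
    $lookup[$_.Original] = $_.Replacement
}

# Look up the replacement for a given original value.
$lookup['qwerty-qwer12-qwer56']
```

Because each Original value appears only once in the export, using it as the hashtable key gives a direct original-to-replacement lookup without any duplicate entries.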

Powershell - Store hash table in file and read its content

hashtable