Need to get the string between two patterns in PowerShell

I need to match a pattern such as "Commodity Name" and get the string on the next line, between the patterns "<dd>" and "</dd>".

Sample input file:

C:\Users\rpm\Desktop\sample.txt:133:    <dt>Commodity Name</dt>
C:\Users\rpm\Desktop\sample.txt:134:    <dd>Grocery</dd>
C:\Users\rpm\Desktop\sample.txt:136:    <dt>IP address</dt>
C:\Users\rpm\Desktop\sample.txt:137:    <dd>XXX.XXX.XXX.XXX port 8000</dd>
C:\Users\rpm\Desktop\sample.txt:144:    <dt>Commodity Serial #</dt>
C:\Users\rpm\Desktop\sample.txt:145:    <dd>0055500000</dd>
C:\Users\rpm\Desktop\sample.txt:147:    <dt>Client IP</dt>
C:\Users\rpm\Desktop\sample.txt:148:    <dd>xxx.xxx.xxx.xxx</dd>
C:\Users\rpm\Desktop\sample.txt:150:    <dt>Client Logged In As</dt>
C:\Users\rpm\Desktop\sample.txt:151:    <dd>rpm123</dd>
C:\Users\rpm\Desktop\sample.txt:153:    <dt>User is member of</dt>
C:\Users\rpm\Desktop\sample.txt:154:    <dd>BP-RPM\COMD_CSO_ITM-AVAI_Def,BP-RPM\user</dd>

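I need to match the <dt> labels shown above (Commodity Name, IP address, Commodity Serial #, Client IP, Client Logged In As, User is member of) and, for each one, get the value between the <dd> and </dd> tags on the line after the match.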

Expected output:

Grocery | XXX.XXX.XXX.XXX port 8000 | 0055500000 | xxx.xxx.xxx.xxx | rpm123 | BP-RPM\COMD_CSO_ITM-AVAI_Def,BP-RPM\user

I would start by creating an array that defines your keywords:

$keywords = @(
    '<dt>Commodity Name</dt>'
    '<dt>IP address</dt>'
    '<dt>Commodity Serial #</dt>'
    '<dt>Client IP</dt>'
    '<dt>Client Logged In As</dt>'
    '<dt>User is member of</dt>'
)

Now you can join the keywords with | into a single alternation pattern and pass it to the Select-String cmdlet:

$file = 'C:\Users\rpm\Desktop\sample.txt'
$content = Get-Content $file
$content | Select-String -Pattern ($keywords -join '|')
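To make it easier to see what Select-String returns here, the following is a minimal sketch that runs against a few in-memory lines instead of the real file. The $sample, $labels, $pattern and $found names are made up for illustration. It also escapes each keyword with [regex]::Escape, which the keywords above do not strictly need, but which is a reasonable precaution if a label ever contains regex metacharacters.

```powershell
# Hypothetical in-memory stand-in for the contents of sample.txt
$sample = @(
    '<dt>Commodity Name</dt>'
    '<dd>Grocery</dd>'
    '<dt>IP address</dt>'
    '<dd>XXX.XXX.XXX.XXX port 8000</dd>'
)

# Escape each keyword so that regex metacharacters are matched literally,
# then join the escaped keywords into one alternation pattern
$labels  = @('<dt>Commodity Name</dt>', '<dt>IP address</dt>')
$pattern = ($labels | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Each result is a MatchInfo object carrying the 1-based LineNumber and the matched Line
$found = $sample | Select-String -Pattern $pattern
$found | ForEach-Object { '{0}: {1}' -f $_.LineNumber, $_.Line }
```

Only the two <dt> lines are reported, with line numbers 1 and 3.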

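This gives you the line number of every line that matches a keyword. Because LineNumber is 1-based and the $content array is 0-based, $content[$_.LineNumber] is the line right after the match. Now you can iterate over the results, access that next line by index, and strip the <dd> prefix and </dd> suffix: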

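$content | Select-String -Pattern ($keywords -join '|') |
    ForEach-Object {
        # 1-based LineNumber used as a 0-based index selects the line after the match
        [regex]::Match($content[$_.LineNumber], '<dd>(.+)</dd>').Groups[1].Value
    }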
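The indexing step can be tried in isolation. The sketch below uses a hypothetical three-line $lines array in which the last label has no following <dd> line, and adds two guards that are not part of the answer's code: a bounds check before indexing, and a Success check so that a non-matching next line yields nothing instead of an empty string.

```powershell
# Hypothetical sample: the last <dt> line has no <dd> line after it
$lines = @(
    '    <dt>Client IP</dt>'
    '    <dd>xxx.xxx.xxx.xxx</dd>'
    '    <dt>Client Logged In As</dt>'
)

$values = $lines | Select-String -Pattern '<dt>Client' | ForEach-Object {
    # LineNumber is 1-based, so it is also the 0-based index of the next line
    if ($_.LineNumber -lt $lines.Count) {
        $m = [regex]::Match($lines[$_.LineNumber], '<dd>(.+)</dd>')
        # Emit the captured text only when the next line really is a <dd> line
        if ($m.Success) { $m.Groups[1].Value }
    }
}
$values
```

Here only xxx.xxx.xxx.xxx is emitted, because the second match is on the last line and is skipped by the bounds check.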

Regular expression: <dd>(.+)</dd> matches the tag pair and captures the text between the tags in group 1.

Output:

Grocery
XXX.XXX.XXX.XXX port 8000
0055500000
xxx.xxx.xxx.xxx
rpm123
BP-RPM\COMD_CSO_ITM-AVAI_Def,BP-RPM\user

Finally, you have to join the results with | to get the desired output. Here is the whole script:

$keywords = @(
    '<dt>Commodity Name</dt>'
    '<dt>IP address</dt>'
    '<dt>Commodity Serial #</dt>'
    '<dt>Client IP</dt>'
    '<dt>Client Logged In As</dt>'
    '<dt>User is member of</dt>'
)

$file = 'C:\Users\rpm\Desktop\sample.txt'
$content = Get-Content $file

($content | Select-String -Pattern ($keywords -join '|') | 
    ForEach-Object {
        [regex]::Match($content[$_.LineNumber], '<dd>(.+)</dd>').Groups[1].Value
    }) -join ' | '

Output:

Grocery | XXX.XXX.XXX.XXX port 8000 | 0055500000 | xxx.xxx.xxx.xxx | rpm123 | BP-RPM\COMD_CSO_ITM-AVAI_Def,BP-RPM\user
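If the <dd> line is not guaranteed to follow its <dt> line immediately (for example, when a blank line sits between them), a different technique may be more robust: read the file as one string and match each <dt>/<dd> pair with a single regex. This is only a sketch of that alternative, not the approach used above. The $text here-string stands in for Get-Content $file -Raw, and $wanted and $joined are illustrative names.

```powershell
# Hypothetical stand-in for: $text = Get-Content $file -Raw
$text = @'
    <dt>Commodity Name</dt>

    <dd>Grocery</dd>
    <dt>Client IP</dt>
    <dd>xxx.xxx.xxx.xxx</dd>
    <dt>Unwanted label</dt>
    <dd>ignored</dd>
'@

# Labels whose values should be kept
$wanted = 'Commodity Name', 'Client IP'

# Group 1 is the <dt> label, group 2 the <dd> value; \s* skips the whitespace
# and line breaks between the two tags
$joined = ([regex]::Matches($text, '<dt>(.+?)</dt>\s*<dd>(.+?)</dd>') |
    Where-Object { $wanted -contains $_.Groups[1].Value } |
    ForEach-Object { $_.Groups[2].Value }) -join ' | '
$joined
```

With this sample the result is Grocery | xxx.xxx.xxx.xxx, and the pair with the unwanted label is dropped.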