Restructure text file using PowerShell

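I have a text file structured as shown below, and I need to extract the hostname, stratum=x and offset=y values into a structured format such as CSV. I intend to use the output to write to the Windows event log if the values meet certain thresholds. My thinking is that creating an object per hostname and adding the stratum and offset values as members would let me achieve this, but my PowerShell skills are failing me here.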



    ___________________________________________________________________________

    02/04/2020 08:11:00 : Started [TEST] Command Scrape

    Text I don't care about


    ___________________________________________________________________________

    Hostname_1 (192.168.1.254):

    assID=0 status=0544 leap_none, sync_local_proto, 4 events, event_peer/strat_chg,
    version="ntpd 4.2.2p1@1.1570-o Tue May 19 13:57:55 UTC 2009 (1)",
    processor="x86_64", system="Linux/2.6.18-164.el5", leap=00, stratum=4,
    precision=-10, rootdelay=0.000, rootdispersion=11.974, peer=59475,
    refid=LOCAL(0),
    reftime=d495c32c.0e71eaf2  Mon, Jan  7 2013 13:57:00.056, poll=10,
    clock=d495c32c.cebd43bd  Mon, Jan  7 2013 13:57:00.807, state=4,
    offset=0.123, frequency=0.000, jitter=0.977, noise=0.977,
    stability=0.000, tai=0

    ___________________________________________________________________________

    Hostname_2 (10.10.1.1):

    assID=0 status=0544 leap_none, sync_local_proto, 4 events, event_peer/strat_chg,
    version="ntpd 4.2.2p1@1.1570-o Tue May 19 13:57:55 UTC 2009 (1)",
    processor="x86_64", system="Linux/2.6.18-164.el5", leap=00, stratum=4,
    precision=-10, rootdelay=0.000, rootdispersion=11.974, peer=59475,
    refid=LOCAL(0),
    reftime=d495c32c.0e71eaf2  Mon, Jan  7 2013 13:57:00.056, poll=10,
    clock=d495c32c.cebd43bd  Mon, Jan  7 2013 13:57:00.807, state=4,
    offset=2.456, frequency=0.000, jitter=0.977, noise=0.977,
    stability=0.000, tai=0

    ___________________________________________________________________________

    Hostname_3 (10.10.1.2):
    ...

I found that I can create the CSV if I manually reformat the data into key/value pairs (as shown below), then use ConvertFrom-StringData and export to CSV:

        (Get-Content 'file.txt' -Raw) -split '####' |
            ForEach-Object {
                $results = ConvertFrom-StringData -StringData ($PSItem -replace '\n-\s+')
                New-Object PSObject -Property $results | Select-Object Hostname, Stratum, Offset
            } | Export-Csv 'file.csv' -NoTypeInformation

    Hostname=Hostname_1
    stratum=3
    offset=-10.345
    ####
    Hostname=Hostname_2
    stratum=4
    offset=-8.345

which becomes the following CSV:

    "Hostname","Stratum","offset"
    "Hostname_1","3","-10.345"
    "Hostname_2","4","-8.345"

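For reference, the key/value approach can be tried without any files. The following is a minimal sketch, assuming the reformatted data is held in a here-string; the variable names `$text`, `$pairs` and `$rows` are made up for illustration, and the casts to `[int]` and `[double]` are an addition so the values can later be compared numerically.

```powershell
# Sample input in a here-string so the sketch runs without a file on disk.
$text = @'
Hostname=Hostname_1
stratum=3
offset=-10.345
####
Hostname=Hostname_2
stratum=4
offset=-8.345
'@

$rows = $text -split '####' | ForEach-Object {
    # ConvertFrom-StringData turns "key=value" lines into a hashtable.
    $pairs = ConvertFrom-StringData -StringData $_.Trim()
    [PSCustomObject]@{
        Hostname = $pairs['Hostname']
        Stratum  = [int]$pairs['stratum']      # cast so it compares as a number
        Offset   = [double]$pairs['offset']    # cast so it compares as a number
    }
}

# Emit CSV text to the pipeline; Export-Csv would write a file instead.
$rows | ConvertTo-Csv -NoTypeInformation
```

This still depends on the data already being in key/value form, which is the manual step the question is trying to avoid.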
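You can do this with the code below.

In your example, the text blocks are separated by a series of underscores. If that is different in real life, change the -split '_{2,}' accordingly.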

    $regex = '(?s)^\s*([\w_\d]+)\s+\(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\):.*stratum=(\d+).*offset=([\d.]+)'
    $result = (Get-Content 'D:\file.txt' -Raw) -split '_{2,}' | Where-Object {$_ -match $regex} | ForEach-Object {
        [PsCustomObject]@{
            'Hostname' = $matches[1]
            'Stratum'  = $matches[2]
            'Offset'   = $matches[3]
        }
    }

    # output to console
    $result

    # output to CSV file
    $result | Export-Csv -Path 'D:\file.csv' -NoTypeInformation

Output on screen:

    Hostname   Stratum Offset
    --------   ------- ------
    Hostname_1 4       0.123
    Hostname_2 4       2.456

Output as CSV:

    "Hostname","Stratum","Offset"
    "Hostname_1","4","0.123"
    "Hostname_2","4","2.456"

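The question also mentions writing to the Windows event log when values cross a threshold. A rough sketch of that step follows, assuming objects shaped like the output above. The thresholds, the event source name `NtpCheck` and the event ID are hypothetical, and the `Write-EventLog` call is left as a comment because it needs Windows PowerShell 5.1 and a source registered beforehand with `New-EventLog`.

```powershell
# Stand-in for $result from the answer; every property is a string there.
$result = @(
    [PSCustomObject]@{ Hostname = 'Hostname_1'; Stratum = '4'; Offset = '0.123' }
    [PSCustomObject]@{ Hostname = 'Hostname_2'; Stratum = '4'; Offset = '2.456' }
)

# Hypothetical thresholds, chosen only for illustration.
$maxStratum = 5
$maxOffset  = 1.0

$violations = @($result | Where-Object {
    # Cast before comparing; with a string on the left, -gt compares as text.
    [int]$_.Stratum -gt $maxStratum -or
    [math]::Abs([double]$_.Offset) -gt $maxOffset
})

foreach ($v in $violations) {
    $message = "NTP check: $($v.Hostname) stratum=$($v.Stratum) offset=$($v.Offset)"
    $message
    # Possible logging call (Windows PowerShell 5.1, source registered first):
    # Write-EventLog -LogName Application -Source 'NtpCheck' -EventId 1001 -EntryType Warning -Message $message
}
```

The casts matter because the regex captures are strings, so a comparison such as '10' -gt '9' would otherwise be evaluated as text.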
Regex details:

    ^                 Assert position at the beginning of the string
    \s                Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
       *              Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    (                 Match the regular expression below and capture its match into backreference number 1
       [\w_\d]        Match a single character present in the list below
                      A word character (letters, digits, etc.)
                      The character “_”
                      A single digit 0..9
          +           Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )                
    \s                Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
       +              Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    \(                Match the character “(” literally
    \d                Match a single digit 0..9
       {1,3}          Between one and 3 times, as many times as possible, giving back as needed (greedy)
    \.                Match the character “.” literally
    \d                Match a single digit 0..9
       {1,3}          Between one and 3 times, as many times as possible, giving back as needed (greedy)
    \.                Match the character “.” literally
    \d                Match a single digit 0..9
       {1,3}          Between one and 3 times, as many times as possible, giving back as needed (greedy)
    \.                Match the character “.” literally
    \d                Match a single digit 0..9
       {1,3}          Between one and 3 times, as many times as possible, giving back as needed (greedy)
    \)                Match the character “)” literally
    :                 Match the character “:” literally
    .                 Match any single character
       *              Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    stratum=          Match the characters “stratum=” literally
    (                 Match the regular expression below and capture its match into backreference number 2
       \d             Match a single digit 0..9
          +           Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )                
    .                 Match any single character
       *              Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    offset=           Match the characters “offset=” literally
    (                 Match the regular expression below and capture its match into backreference number 3
       [\d.]          Match a single character present in the list below
                      A single digit 0..9
                      The character “.”
          +           Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
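
The same breakdown can be kept next to the pattern itself by using the free-spacing option `(?x)`, which makes the regex engine ignore whitespace in the pattern and treat `#` as the start of a comment. The sketch below is meant to be equivalent to the single-line pattern in the answer; `$regexCommented` and `$block` are illustrative names, and `$block` is a shortened stand-in for one section of the file.

```powershell
# Same pattern as the single-line regex, spread out with (?x).
$regexCommented = @'
(?sx)                 # s: "." also matches newlines; x: ignore pattern whitespace
^\s*
([\w_\d]+)            # group 1: hostname
\s+
\(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\):   # IPv4 address in parentheses, then ":"
.*stratum=(\d+)       # group 2: stratum
.*offset=([\d.]+)     # group 3: offset
'@

# Shortened stand-in for one block of the input file.
$block = @'

Hostname_1 (192.168.1.254):

processor="x86_64", system="Linux/2.6.18-164.el5", leap=00, stratum=4,
offset=0.123, frequency=0.000, jitter=0.977, noise=0.977,
'@

if ($block -match $regexCommented) {
    # -match fills the automatic $Matches table with the capture groups.
    $hostName = $Matches[1]
    $stratum  = $Matches[2]
    $offset   = $Matches[3]
    $hostName, $stratum, $offset
}
```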

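Brilliant, thanks very much. I made the following tweak to accommodate non-word characters in the hostname and offset, and to exclude the trailing comma from the offset value.

    $regex = '(?s)^\s*([\w_\d]+)\s+\(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\):.*stratum=(\d+).*offset=([\d.]+)'

is now: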

(?s)^\s*([\w|\W|\d]+)\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):.*stratum=(\d+).*offset=([\W|\d.][^,]
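
Since the pattern above is cut off, the following is not a reconstruction of it, only one possible way to reach the same goals (hostnames with non-word characters, signed offsets, no trailing comma). Note that inside a character class `|` is a literal pipe, and `\w` together with `\W` matches any character at all, so this sketch uses narrower classes instead. The name `$regexVariant` and the host `ntp-host.example` are made up for illustration.

```powershell
# Hypothetical variant:
#   hostname = any run of characters that are not whitespace or "("
#   offset   = optional minus sign, digits, optional fractional part
$regexVariant = '(?s)^\s*([^\s(]+)\s+\(\d{1,3}(?:\.\d{1,3}){3}\):.*stratum=(\d+).*offset=(-?\d+(?:\.\d+)?)'

# Shortened stand-in for one block, with a hyphenated host and a negative offset.
$block = @'

ntp-host.example (10.10.1.1):

leap=00, stratum=4,
offset=-8.345, frequency=0.000,
'@

if ($block -match $regexVariant) {
    $parsed = [PSCustomObject]@{
        Hostname = $Matches[1]
        Stratum  = $Matches[2]
        Offset   = $Matches[3]
    }
    $parsed
}
```

Because the offset group only accepts a sign, digits and a single decimal point, the comma after the value is never captured.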