Restructure text file using PowerShell
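I have a text file structured as shown below, and I need to extract the hostname, Stratum=x, and Offset=y values into a structured format such as CSV. I intend to use the output to write to the Windows event log when the values meet certain thresholds. My thinking is that creating an object per hostname and adding the stratum and offset values as members would let me achieve this, but my PowerShell skills are letting me down here.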
___________________________________________________________________________
02/04/2020 08:11:00 : Started [TEST] Command Scrape
Text I don't care about
___________________________________________________________________________
Hostname_1 (192.168.1.254):
assID=0 status=0544 leap_none, sync_local_proto, 4 events, event_peer/strat_chg,
version="ntpd 4.2.2p1@1.1570-o Tue May 19 13:57:55 UTC 2009 (1)",
processor="x86_64", system="Linux/2.6.18-164.el5", leap=00, stratum=4,
precision=-10, rootdelay=0.000, rootdispersion=11.974, peer=59475,
refid=LOCAL(0),
reftime=d495c32c.0e71eaf2 Mon, Jan 7 2013 13:57:00.056, poll=10,
clock=d495c32c.cebd43bd Mon, Jan 7 2013 13:57:00.807, state=4,
offset=0.123, frequency=0.000, jitter=0.977, noise=0.977,
stability=0.000, tai=0
___________________________________________________________________________
Hostname_2 (10.10.1.1):
assID=0 status=0544 leap_none, sync_local_proto, 4 events, event_peer/strat_chg,
version="ntpd 4.2.2p1@1.1570-o Tue May 19 13:57:55 UTC 2009 (1)",
processor="x86_64", system="Linux/2.6.18-164.el5", leap=00, stratum=4,
precision=-10, rootdelay=0.000, rootdispersion=11.974, peer=59475,
refid=LOCAL(0),
reftime=d495c32c.0e71eaf2 Mon, Jan 7 2013 13:57:00.056, poll=10,
clock=d495c32c.cebd43bd Mon, Jan 7 2013 13:57:00.807, state=4,
offset=2.456, frequency=0.000, jitter=0.977, noise=0.977,
stability=0.000, tai=0
___________________________________________________________________________
Hostname_3 (10.10.1.2):
...
I've found that I can create the CSV if I manually reformat the data into key-value pairs (as shown below), using ConvertFrom-StringData and exporting to CSV:
(Get-Content 'file.txt' -Raw) -split '####' |
    ForEach-Object {
        $results = ConvertFrom-StringData -StringData ($PSItem -replace '\n-\s+')
        New-Object PSObject -Property $results | Select-Object Hostname, Stratum, Offset
    } | Export-Csv 'file.csv' -NoTypeInformation
The manually reformatted file.txt:
Hostname=Hostname_1
stratum=3
offset=-10.345
####
Hostname=Hostname_2
stratum=4
offset=-8.345
This becomes the following CSV:
"Hostname","Stratum","offset"
"Hostname_1","3","-10.345"
"Hostname_2","4","-8.345"
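You can do that with the code below.
In your example, the text blocks are separated by a series of underscores. If that is different in real life, change the -split '_{2,}' accordingly.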
$regex = '(?s)^\s*([\w_\d]+)\s+\(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\):.*stratum=(\d+).*offset=([\d.]+)'
$result = (Get-Content 'D:\file.txt' -Raw) -split '_{2,}' | Where-Object {$_ -match $regex} | ForEach-Object {
    [PsCustomObject]@{
        'Hostname' = $matches[1]
        'Stratum'  = $matches[2]
        'Offset'   = $matches[3]
    }
}
# output to console
$result
# output to csv file
$result | Export-Csv -Path 'D:\file.csv' -NoTypeInformation
Output on screen:
Hostname Stratum Offset
-------- ------- ------
Hostname_1 4 0.123
Hostname_2 4 2.456
Output as CSV:
"Hostname","Stratum","Offset"
"Hostname_1","4","0.123"
"Hostname_2","4","2.456"
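Since the question mentions writing to the Windows event log when values cross a threshold, here is a minimal sketch of that follow-up step. The threshold values, the event source name 'NtpCheck', and the event ID are all hypothetical. The captured values are strings, so they are cast to numbers before comparing. The sample rows simply mirror the output shown above so the snippet runs on its own.

```powershell
# Sample rows shaped like the $result objects above (captured values are strings)
$result = @(
    [PsCustomObject]@{ Hostname = 'Hostname_1'; Stratum = '4'; Offset = '0.123' }
    [PsCustomObject]@{ Hostname = 'Hostname_2'; Stratum = '4'; Offset = '2.456' }
)

# Hypothetical thresholds; choose values that suit your environment
$maxStratum = 5
$maxOffset  = 1.0

# Keep only the hosts that exceed either threshold
$alerts = $result | Where-Object {
    $offset = [math]::Abs([double]$_.Offset)
    ([int]$_.Stratum -gt $maxStratum) -or ($offset -gt $maxOffset)
}

foreach ($a in $alerts) {
    $message = "NTP check: $($a.Hostname) stratum=$($a.Stratum) offset=$($a.Offset)"
    $message
    # Assumption: Windows PowerShell 5.1, and the hypothetical source 'NtpCheck'
    # has already been registered from an elevated session with
    #   New-EventLog -LogName Application -Source 'NtpCheck'
    # Write-EventLog -LogName Application -Source 'NtpCheck' -EntryType Warning -EventId 1001 -Message $message
}
```

The Write-EventLog call is left commented out so the filtering can be tried without touching the event log; uncomment it once the source exists.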
Regex details:
^ Assert position at the beginning of the string
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
( Match the regular expression below and capture its match into backreference number 1
[\w_\d] Match a single character present in the list below
A word character (letters, digits, etc.)
The character “_”
A single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\( Match the character “(” literally
\d Match a single digit 0..9
{1,3} Between one and 3 times, as many times as possible, giving back as needed (greedy)
\. Match the character “.” literally
\d Match a single digit 0..9
{1,3} Between one and 3 times, as many times as possible, giving back as needed (greedy)
\. Match the character “.” literally
\d Match a single digit 0..9
{1,3} Between one and 3 times, as many times as possible, giving back as needed (greedy)
\. Match the character “.” literally
\d Match a single digit 0..9
{1,3} Between one and 3 times, as many times as possible, giving back as needed (greedy)
\) Match the character “)” literally
: Match the character “:” literally
. Match any single character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
stratum= Match the characters “stratum=” literally
( Match the regular expression below and capture its match into backreference number 2
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
. Match any single character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
offset= Match the characters “offset=” literally
( Match the regular expression below and capture its match into backreference number 3
[\d.] Match a single character present in the list below
A single digit 0..9
The character “.”
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
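Brilliant, thanks very much. I made the following adjustments to allow for non-word characters in the hostname and offset, and to exclude the trailing comma from the offset value.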
$regex = '(?s)^\s*([\w_\d]+)\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):.*stratum=(\d+).*offset=([\d.]+)'
It is now:
(?s)^\s*([\w|\W|\d]+)\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):.*stratum=(\d+).*offset=([\W|\d.][^,]
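As one possible alternative for the same goal (hostnames with non-word characters, negative offsets, no trailing comma), the sketch below uses named groups. It is not the commenter's pattern: \S+ is assumed to be an acceptable definition of a hostname, and the sample block and its host name are made up for illustration.

```powershell
# Hypothetical sample block: hostname with a hyphen and a dot, negative offset
$block = @'
ntp-host.example (10.10.1.1):
leap=00, stratum=3,
offset=-10.345, frequency=0.000,
'@

# \S+ accepts any non-whitespace hostname; -? allows a leading minus sign;
# the offset group matches digits and an optional fraction, so the comma is excluded
$regex = '(?s)^\s*(?<Hostname>\S+)\s+\(\d{1,3}(?:\.\d{1,3}){3}\):.*?stratum=(?<Stratum>\d+).*?offset=(?<Offset>-?\d+(?:\.\d+)?)'

if ($block -match $regex) {
    $row = [PsCustomObject]@{
        Hostname = $Matches['Hostname']
        Stratum  = $Matches['Stratum']
        Offset   = $Matches['Offset']
    }
    $row
}
```

Named groups keep the object construction readable if the pattern later gains or loses a capture, since the keys no longer depend on group position.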