Python regex parsing of syslog
I have a syslog file in this format.
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Info: MODULE: Startup MESSAGE: Application Version: 8.44.0
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Info: MODULE: Startup MESSAGE: Run on system: host
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Info: MODULE: Startup MESSAGE: Running as user: SYSTEM
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Info: MODULE: Startup MESSAGE: User has admin rights: yes
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Info: MODULE: Startup MESSAGE: Start Time: 2016-03-07 13:44:55
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Info: MODULE: Startup MESSAGE: IP Address: 10.10.10.10
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Info: MODULE: Startup MESSAGE: CPU Count: 1
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Info: MODULE: Startup MESSAGE: System Type: Server
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Info: MODULE: Startup MESSAGE: System Uptime: 18.10 days
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: MODULE: InitHead MESSAGE: => Reading signature and hash files ...
Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Notice: MODULE: Init MESSAGE: file-type-signatures.cfg initialized with 80 values.
Mar 7 13:44:56 host.domain.example.net/10.10.10.10 Application: Notice: MODULE: Init MESSAGE: signatures/filename-characteristics.dat initialized with 2778 values.
Mar 7 13:44:56 host.domain.example.net/10.10.10.10 Application: Notice: MODULE: Init MESSAGE: signatures/keywords.dat initialized with 63 values.
Some logs ...
Mar 7 17:42:08 host.domain.example.net/10.10.10.10 Application: Results: MODULE: Report MESSAGE: Results: 0 Alarms, 0 Warnings, 131 Notices, 2 Errors
Mar 7 17:42:08 host.domain.example.net/10.10.10.10 Application: End: MODULE: Report MESSAGE: Begin Time: 2016-03-07 13:44:55
Mar 7 17:42:08 host.domain.example.net/10.10.10.10 Application: End: MODULE: Report MESSAGE: End Time: 2016-03-07 17:42:07
Mar 7 17:42:08 host.domain.example.net/10.10.10.10 Application: End: MODULE: Report MESSAGE: Scan took 3 hours 57 mins 11 secs
How can I extract "Application Version", "Run on system", "User has admin rights", "Start Time", "IP Address", "CPU Count", "System Type", "System Uptime", "End Time", "Alarms", "Warnings", "Notices" and "Errors" using Python?
I'm new to Python, so I don't really know how to approach this, but I did manage to write a function called finder():
import re

def finder(fname, pattern):
    # Return the first line of the file that matches the pattern.
    with open(fname, "r") as hand:
        for line in hand:
            line = line.rstrip()
            if re.search(pattern, line):
                return line
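To get the line with the IP address, I call it like this:
finder("file path","MESSAGE: IP Address")
This gives me the whole line. I need help getting only the IP address part, and the corresponding information from the other lines.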
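Please check the links below before going through the code; they should help you a lot.
- re module - the module used. The link gives good explanations and examples.
- Python Regex Tester - here you can test your regular expressions and the regex-related functions available in Python. I used it to test the regex I use below.
Code with inline comments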
import re

# The information we need to collect.
info_list = ["Application Version", "Run on system", "User has admin rights", "Start Time", "IP Address", "CPU Count", "System Type", "System Uptime", "End Time", "Results", "Begin Time"]

with open("out.txt", "r") as fo:
    for line in fo:
        for srch_pat in info_list:
            # First check whether the information we need is present in the line.
            if srch_pat in line:
                # This gets the exact information, e.g. the version number for Application Version.
                regex = re.compile(r'MESSAGE:\s+%s:\s+(.*)' % re.escape(srch_pat))
                m = regex.search(line)
                if m is None:
                    # The text appears in the line, but not as a "MESSAGE: <name>: <value>" field.
                    continue
                if srch_pat == "Results":
                    # For the results line, this regex extracts the four counters.
                    result_regex = re.search(r'(\d+)\s+Alarms,\s+(\d+)\s+Warnings,\s+(\d+)\s+Notices,\s+(\d+)\s+Errors', m.group(1))
                    print('Alarms -', result_regex.group(1))
                    print('Warnings -', result_regex.group(2))
                    print('Notices -', result_regex.group(3))
                    print('Errors -', result_regex.group(4))
                else:
                    print(srch_pat, '-', m.group(1))
Output
C:\Users\dinesh_pundkar\Desktop>python a.py
Application Version - 8.44.0
Run on system - host
User has admin rights - yes
Start Time - 2016-03-07 13:44:55
IP Address - 10.10.10.10
CPU Count - 1
System Type - Server
System Uptime - 18.10 days
Alarms - 0
Warnings - 0
Notices - 131
Errors - 2
Begin Time - 2016-03-07 13:44:55
End Time - 2016-03-07 17:42:07
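If you would rather keep the values than print them, the same idea can be wrapped in a function that returns a dictionary. This is only a sketch: the names parse_syslog, FIELDS, FIELD_RE and RESULTS_RE are made up for illustration, and it assumes every line of interest follows the "MESSAGE: <name>: <value>" layout shown in the question.

```python
import re

# Hypothetical names for illustration; the field list mirrors the one used above.
FIELDS = ["Application Version", "Run on system", "User has admin rights",
          "Start Time", "IP Address", "CPU Count", "System Type",
          "System Uptime", "Begin Time", "End Time"]

# A single alternation matches any "MESSAGE: <field>: <value>" line.
FIELD_RE = re.compile(
    r"MESSAGE:\s+(%s):\s+(.*)" % "|".join(re.escape(f) for f in FIELDS))

# The results line carries four counters and gets its own pattern.
RESULTS_RE = re.compile(
    r"MESSAGE:\s+Results:\s+(\d+)\s+Alarms,\s+(\d+)\s+Warnings,"
    r"\s+(\d+)\s+Notices,\s+(\d+)\s+Errors")


def parse_syslog(lines):
    """Return a dict mapping field names to the values found in lines."""
    info = {}
    for line in lines:
        line = line.rstrip()
        m = RESULTS_RE.search(line)
        if m:
            # Store the four counters as integers under their own keys.
            info.update(zip(("Alarms", "Warnings", "Notices", "Errors"),
                            map(int, m.groups())))
            continue
        m = FIELD_RE.search(line)
        if m:
            info[m.group(1)] = m.group(2)
    return info


sample = [
    "Mar 7 13:44:55 host.domain.example.net/10.10.10.10 Application: Info: "
    "MODULE: Startup MESSAGE: IP Address: 10.10.10.10",
    "Mar 7 17:42:08 host.domain.example.net/10.10.10.10 Application: Results: "
    "MODULE: Report MESSAGE: Results: 0 Alarms, 0 Warnings, 131 Notices, 2 Errors",
]
print(parse_syslog(sample))
```

To run it on the real file, pass the open file object instead of the sample list, for example parse_syslog(open("out.txt")). If a field appears more than once in the log, the last occurrence wins in this sketch.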