Grouping object-group configuration using RegEx
I have a configuration from a Cisco ASA, and I need to write a Python RegEx that captures everything in each object-group and groups it for further processing.
For example:
object-group network FTP
 description FTP Access
 network-object host BCD1
 network-object host BCD2
object-group network NTP
 description NTP Access
 network-object host ABC1
 network-object host ABC2
 network-object host ABC3
object-group service sample_service tcp
 description Ports 1 2 3
 port-object range 80 81
 port-object eq pop3
 port-object eq imap4
 port-object range 443 444
object-group service 8080 tcp
 description Servers
The end result should look like this:
Group 1: object-group network FTP
description FTP Access
network-object host BCD1
network-object host BCD2
Group 2: object-group network NTP
description NTP Access
network-object host ABC1
network-object host ABC2
etc.
As I said, I am not good at this. I tried to come up with something, but the results were poor:
(object-group\s[^!]*)object
(object-group[^!]*)
Both failed.
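You can use a regex written with the unroll-the-loop technique:
\bobject-group\b\S*(?:\s+(?!object-group\b)\S*)*
See the regex demo. It is basically the same as (?s)object-group(?:(?!\bobject-group\b).)* or (?s)object-group.*?(?=\bobject-group\b|$), but more efficient, because the lookahead is only checked after a run of whitespace rather than at every character.
Explanation:
\bobject-group\b - the literal character sequence object-group (as a whole word, thanks to the \b word boundaries)
\S* - zero or more non-whitespace characters
(?:\s+(?!object-group\b)\S*)* - zero or more sequences of:
  \s+(?!object-group\b) - one or more whitespace characters not followed by the whole word object-group
  \S* - zero or more non-whitespace characters.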
import re
p = re.compile(r'\bobject-group\b\S*(?:\s+(?!object-group\b)\S*)*')
test_str = "object-group network FTP\n description FTP Access\n network-object host BCD1\n network-object host BCD2\nobject-group network NTP\n description NTP Access\n network-object host ABC1\n network-object host ABC2\n network-object host ABC3\nobject-group service sample_service tcp\n description Ports 1 2 3\n port-object range 80 81\n port-object eq pop3\n port-object eq imap4\n port-object range 443 444\nobject-group service 8080 tcp\n description Servers"
print(re.findall(p, test_str))
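To see in what sense the three patterns are "basically the same", here is a small sketch that runs each of them over a shortened version of the sample. The names sample, PATTERNS and results are illustrative only and not part of the original answer. The two (?s) variants also consume the newline just before the next object-group, so the matches are stripped before they are compared.

```python
import re

# Shortened sample config; member lines are indented by one space.
sample = (
    "object-group network FTP\n"
    " description FTP Access\n"
    " network-object host BCD1\n"
    "object-group network NTP\n"
    " description NTP Access\n"
    " network-object host ABC1\n"
    "object-group service 8080 tcp\n"
    " description Servers"
)

# The three patterns quoted in the answer, under illustrative labels.
PATTERNS = {
    "unrolled": r"\bobject-group\b\S*(?:\s+(?!object-group\b)\S*)*",
    "tempered": r"(?s)object-group(?:(?!\bobject-group\b).)*",
    "lazy": r"(?s)object-group.*?(?=\bobject-group\b|$)",
}

# Strip each match so a trailing newline does not affect the comparison.
results = {
    name: [match.strip() for match in re.findall(pattern, sample)]
    for name, pattern in PATTERNS.items()
}

for name, groups in results.items():
    print(name, len(groups))

print(results["unrolled"] == results["tempered"] == results["lazy"])
```

On this input all three patterns should yield the same three groups; which one is faster on a large config is best measured on real data.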
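You don't need a complicated, hard-to-understand regex to do this. Just iterate over the file, breaking on lines that begin with object-group, and build a dictionary of lists.
You can do this with itertools.groupby() or a defaultdict of list. I prefer the latter, which gives you a dictionary that is handy for further processing: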
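from collections import defaultdict
from pprint import pprint

object_groups = defaultdict(list)
key = 0  # running group number, incremented at each object-group header
with open('cisco.cfg') as f:
    for line in f:
        if line.startswith('object-group'):
            key += 1
        object_groups[key].append(line.strip())

# list() so that Python 3 prints a plain list rather than dict_items([...])
pprint(list(object_groups.items()))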
Given your sample input, the output would be:
[(1,
['object-group network FTP',
'description FTP Access',
'network-object host BCD1',
'network-object host BCD2']),
(2,
['object-group network NTP',
'description NTP Access',
'network-object host ABC1',
'network-object host ABC2',
'network-object host ABC3']),
(3,
['object-group service sample_service tcp',
'description Ports 1 2 3',
'port-object range 80 81',
'port-object eq pop3',
'port-object eq imap4',
'port-object range 443 444']),
(4, ['object-group service 8080 tcp', 'description Servers'])]
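One caveat with the numbered-key loop: in a full configuration, any lines before the first object-group land under key 0, and unrelated lines after a group are appended to it. The following is only a sketch of one way to handle that, assuming member lines are indented as in the sample and that an unindented line (such as the "!" separator) ends a group. The function name group_object_groups and the hostname line are hypothetical, made up for illustration.

```python
from collections import defaultdict


def group_object_groups(lines):
    """Number each object-group block, skipping lines outside any block."""
    groups = defaultdict(list)
    key = 0
    inside = False
    for line in lines:
        if line.startswith('object-group'):
            key += 1
            inside = True
        elif not line.strip() or not line[0].isspace():
            # A blank or unindented non-header line closes the current block.
            inside = False
        if inside:
            groups[key].append(line.strip())
    return dict(groups)


# Hypothetical input: one preamble line and "!" separators around the groups.
sample = """hostname example-fw
object-group network FTP
 description FTP Access
 network-object host BCD1
!
object-group service 8080 tcp
 description Servers
!
"""

groups = group_object_groups(sample.splitlines())
print(groups)
```

Passing an open file object instead of sample.splitlines() should work the same way, since the function only iterates over lines.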
Alternatively, you can use the object group identifier as the key:
from collections import defaultdict
from pprint import pprint

object_groups = defaultdict(list)
key = None
with open('cisco.cfg') as f:
    for line in f:
        if line.startswith('object-group'):
            # key = line.strip()  # the whole line
            key = line.strip().partition(' ')[-1]  # just the object group definition
        else:
            object_groups[key].append(line.strip())

pprint(list(object_groups.items()))
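This creates a similar dictionary, but with keys such as 'network FTP', 'network NTP', 'service sample_service tcp', and so on. In this version the header line itself is not stored in the list, only the lines that follow it.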
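For comparison, the itertools.groupby() route mentioned earlier might look roughly like this. It is a sketch rather than a definitive implementation: the sample string and the names group_ids and groups are illustrative, and itertools.accumulate is used here to turn "is this a header line?" into a running group number that groupby() can key on.

```python
from itertools import accumulate, groupby

# Shortened sample config; member lines are indented by one space.
sample = """object-group network FTP
 description FTP Access
 network-object host BCD1
 network-object host BCD2
object-group network NTP
 description NTP Access
 network-object host ABC1
"""

lines = sample.splitlines()

# Running count of header lines seen so far: 1, 1, 1, 1, 2, 2, 2
group_ids = accumulate(int(line.startswith('object-group')) for line in lines)

groups = [
    [line.strip() for _, line in members]
    for group_id, members in groupby(zip(group_ids, lines), key=lambda pair: pair[0])
    if group_id  # id 0 would be any lines before the first header
]

print(groups)
```

This gives a list of lists rather than a dictionary, which is one reason the defaultdict version can be more convenient when the groups need to be looked up later.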