Parse a file organised in a certain pattern
f is a file that looks like this:
+++++192.168.1.1+++++
Port Number: 80
......
product: Apache httpd
IP Address: 192.168.1.1
+++++192.168.1.2+++++
Port Number: 80
......
product: Apache http
IP Address: 192.168.1.2
+++++192.168.1.3+++++
Port Number: 80
......
product: Apache httpd
IP Address: 192.168.1.3
+++++192.168.1.4+++++
Port Number: 3306
......
product: MySQL
IP Address: 192.168.1.4
+++++192.168.1.5+++++
Port Number: 22
......
product: Open SSH
IP Address: 192.168.1.5
+++++192.168.1.6+++++
Port Number: 80
......
product: Apache httpd
IP Address: 192.168.1.6
The expected output is:
These hosts have Apache services:
192.168.1.1
192.168.1.2
192.168.1.3
192.168.1.6
The code I tried:
for service in f:
    if "product: Apache httpd" in service:
        for host in f:
            if "IP Address: " in host:
                print(host[5:], service)
It just gives me all of the IP addresses, not the specific hosts that have Apache installed.
How can I get the expected output?
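The behaviour described above is consistent with both loops sharing a single file iterator: once the outer loop hits the first Apache line, the inner loop reads everything that is left, so every remaining "IP Address" line gets reported. A minimal sketch of that effect, using `io.StringIO` as a stand-in for the real file and a shortened, made-up two-host sample:

```python
import io

# Hypothetical two-host sample in the same layout as the file above.
sample = (
    "+++++192.168.1.1+++++\n"
    "Port Number: 80\n"
    "product: Apache httpd\n"
    "IP Address: 192.168.1.1\n"
    "+++++192.168.1.4+++++\n"
    "Port Number: 3306\n"
    "product: MySQL\n"
    "IP Address: 192.168.1.4\n"
)

f = io.StringIO(sample)  # behaves like an open text file
seen = []
for service in f:
    if "product: Apache httpd" in service:
        # The inner loop resumes from the current position and
        # consumes every remaining line of the same file object.
        for host in f:
            if "IP Address: " in host:
                seen.append(host.strip())

leftover = f.readline()  # empty string: nothing is left for the outer loop
print(seen)
```

In this sketch the MySQL host is collected as well, because the inner loop never checks which product the address belongs to.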
Maybe something like this.
I've inlined the data for ease of illustration, but it could just as well come from a file.
Also, we gather all of the data for each host first, in case you need some of the other information too, and then print out what's wanted. This means info_by_ip looks roughly like
{'192.168.1.1': {'Port Number': '80', 'product': 'Apache httpd'},
 '192.168.1.2': {'Port Number': '80', 'product': 'Apache http'},
 '192.168.1.3': {'Port Number': '80', 'product': 'Apache httpd'},
 '192.168.1.4': {'Port Number': '3306', 'product': 'MySQL'},
 '192.168.1.5': {'Port Number': '22', 'product': 'Open SSH'},
 '192.168.1.6': {'Port Number': '80', 'product': 'Apache httpd'}}
Code:
import collections
data = """
+++++192.168.1.1+++++
Port Number: 80
......
product: Apache httpd
+++++192.168.1.2+++++
Port Number: 80
......
product: Apache http
+++++192.168.1.3+++++
Port Number: 80
......
product: Apache httpd
+++++192.168.1.4+++++
Port Number: 3306
......
product: MySQL
+++++192.168.1.5+++++
Port Number: 22
......
product: Open SSH
+++++192.168.1.6+++++
Port Number: 80
......
product: Apache httpd
"""
ip = None  # Current IP address
# A defaultdict lets us conveniently add per-IP data without having to
# create the inner dicts explicitly:
info_by_ip = collections.defaultdict(dict)
for line in data.splitlines():  # replace with `for line in file:` for file purposes
    line = line.strip()  # Lines read from a file end in a newline; drop it first
    if line.startswith('+++++'):  # Seems like an IP address separator
        ip = line.strip('+')  # Remove + signs from both ends
        continue  # Skip to next line
    if ':' in line:  # If the line contains a colon,
        key, value = line.split(':', 1)  # ... split by it,
        info_by_ip[ip][key.strip()] = value.strip()  # ... and add to this IP's dict.
print('These hosts have Apache services:')
for ip, info in info_by_ip.items():
    # startswith() also matches the 'Apache http' host that the expected output lists
    if info.get('product', '').startswith('Apache'):
        print(ip)
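If the data comes from a file rather than an inline string, each line carries a trailing newline, so it helps to strip whitespace before looking at the `+` signs. The following is only a sketch of that variant: `parse_hosts` is a hypothetical helper name, `io.StringIO` stands in for a real open file, and the three-host sample is shortened from the question's data.

```python
import collections
import io


def parse_hosts(lines):
    """Group 'key: value' lines under the most recent +++++IP+++++ header."""
    info_by_ip = collections.defaultdict(dict)
    ip = None
    for raw in lines:
        line = raw.strip()  # drop the newline before stripping '+' signs
        if line.startswith('+++++'):
            ip = line.strip('+')
            continue
        if ip is not None and ':' in line:
            key, value = line.split(':', 1)
            info_by_ip[ip][key.strip()] = value.strip()
    return info_by_ip


# Stand-in for `open('some_file.txt')`; any iterable of lines works.
fake_file = io.StringIO(
    "+++++192.168.1.1+++++\n"
    "Port Number: 80\n"
    "product: Apache httpd\n"
    "+++++192.168.1.2+++++\n"
    "Port Number: 80\n"
    "product: Apache http\n"
    "+++++192.168.1.4+++++\n"
    "Port Number: 3306\n"
    "product: MySQL\n"
)

hosts = parse_hosts(fake_file)
apache_hosts = [ip for ip, info in hosts.items()
                if info.get('product', '').startswith('Apache')]
print('These hosts have Apache services:')
print('\n'.join(apache_hosts))
```

Matching on the `Apache` prefix rather than the exact string `Apache httpd` is a choice made here so that the `Apache http` host is included, as in the expected output.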
You can use +++++ as the separator and get the required IPs with the following code.
with open('ip.txt', 'r') as fileReadObj:
    rows = fileReadObj.read()
text_lines = rows.split('+++++')
for i, row in enumerate(text_lines):
    if 'Apache' in row:
        # The piece just before each host's details is that host's IP
        print(text_lines[i - 1])
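The reason indexing one position back works is that splitting on `+++++` yields an alternating list: an empty leading piece, then IP, details, IP, details, and so on. A small sketch that makes the pairing explicit with slicing and `zip`, assuming the text starts with a `+++++` header; the inline sample is a shortened, made-up version of the file:

```python
# Hypothetical three-host sample in the same layout as the file.
text = (
    "+++++192.168.1.1+++++\n"
    "Port Number: 80\n"
    "product: Apache httpd\n"
    "+++++192.168.1.4+++++\n"
    "Port Number: 3306\n"
    "product: MySQL\n"
    "+++++192.168.1.6+++++\n"
    "Port Number: 80\n"
    "product: Apache httpd\n"
)

parts = text.split('+++++')
# parts[0] is the empty text before the first header; after that the
# list alternates: IP, details, IP, details, ...
ips = parts[1::2]
bodies = parts[2::2]

apache_hosts = [ip for ip, body in zip(ips, bodies)
                if 'product: Apache' in body]
print(apache_hosts)
```

Checking for `product: Apache` rather than a bare `Apache` is a slightly stricter test, so that the word appearing in some other field would not count as a match.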
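You could also try this:
apaches = []
with open('ips.txt') as f:
    # Assumes the host sections are separated by blank lines and that each
    # section has exactly five lines, ending with the product and IP Address lines.
    # strip() keeps a trailing newline from adding an empty sixth line.
    sections = f.read().strip().split('\n\n')
for section in sections:
    _, _, _, product, ip = section.split('\n')
    _, product_type = product.split(':')
    _, address = ip.split(':')
    if product_type.strip().startswith('Apache'):
        apaches.append(address.strip())
print('These hosts have Apache services:\n%s' % '\n'.join(apaches))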
Which outputs:
These hosts have Apache services:
192.168.1.1
192.168.1.2
192.168.1.3
192.168.1.6
Explained:
with open(filename, 'r') as fobj:  # Open the file as read only
    search_string = fobj.read()  # Read file into string
print('These hosts have Apache services:\n\n')
# Split string by search term. The last piece is the text after the final
# 'Apache', so it does not belong to a match and is skipped.
for string_piece in search_string.split('Apache')[:-1]:
    # Split string to isolate IP and count up/back 2
    ip_addr = string_piece.split('+++++')[-2]
    print(ip_addr)
Condensed:
with open(filename, 'r') as fobj:
    print('These hosts have Apache services:\n\n')
    # [:-1] skips the text after the last 'Apache', which is not a match
    for string_piece in fobj.read().split('Apache')[:-1]:
        print('{}\n'.format(string_piece.split('+++++')[-2]))