Python - Convert a list from a text file containing IP addresses and dissimilar data to CSV

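I've been tasked with creating a list of NFS shares and their associated IPs from multiple text files and saving it as a CSV. The files contain the NFS share names and IPs, plus other data that I don't want in the CSV.

Example text file: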

/vol/vm-01 -sec=sys,rw=10.44.160.133:10.44.160.132:10.44.160.131:10.44.160.130,root=10.44.160.133:10.44.160.132:10.44.160.131:10.44.160.130
/vol/vol01 -sec=sys,rw=10.44.202.39:10.44.202.73,root=10.44.202.39:10.44.202.73

I used a regular expression and filtered out the IPs easily enough, but I can't find a way to merge in the volume name.


# Scrape the file for IPs using a regex
import re

with open('input.txt') as f:
    qlist = [re.findall(r'[0-9]+(?:\.[0-9]+){3}', line) for line in f]
for ips in qlist:
    print(ips)

Sample output:

['10.44.160.133', '10.44.160.132', '10.44.160.131', '10.44.160.130', '10.44.160.133', '10.44.160.132', '10.44.160.131', '10.44.160.130']
['10.44.202.39', '10.44.202.73', '10.44.202.39', '10.44.202.73']

Desired output:

['vm-01', '10.44.160.133', '10.44.160.132', '10.44.160.131', '10.44.160.130', '10.44.160.133', '10.44.160.132', '10.44.160.131', '10.44.160.130']
['vol01', '10.44.202.39', '10.44.202.73', '10.44.202.39', '10.44.202.73']

This expression returns both pieces you need, the volume names and the IPs; you can then script the rest to get the final output:

Test

import re

regex = r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|\/[^\/]+\/([^\/]+?)(?=\s+-sec)"
test_str = "/vol/vm-01 -sec=sys,rw=10.44.160.133:10.44.160.132:10.44.160.131:10.44.160.130,root=10.44.160.133:10.44.160.132:10.44.160.131:10.44.160.130 /vol/vol01 -sec=sys,rw=10.44.202.39:10.44.202.73,root=10.44.202.39:10.44.202.73"

print(re.findall(regex, test_str))

Output

[('', 'vm-01'), ('10.44.160.133', ''), ('10.44.160.132', ''), ('10.44.160.131', ''), ('10.44.160.130', ''), ('10.44.160.133', ''), ('10.44.160.132', ''), ('10.44.160.131', ''), ('10.44.160.130', ''), ('', 'vol01'), ('10.44.202.39', ''), ('10.44.202.73', ''), ('10.44.202.39', ''), ('10.44.202.73', '')]
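One way to script the rest is sketched below, assuming each volume name appears before the IPs that belong to it (as in the sample). The helper name group_matches and the shortened test string are illustrative, not part of the original answer; the regex is the same one without the unnecessary slash escapes.

```python
import re

# Same pattern as above; "/" does not need escaping in Python.
regex = r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|/[^/]+/([^/]+?)(?=\s+-sec)"

# Shortened sample input (assumption: same layout as the real file).
test_str = (
    "/vol/vm-01 -sec=sys,rw=10.44.160.133:10.44.160.132,root=10.44.160.133 "
    "/vol/vol01 -sec=sys,rw=10.44.202.39,root=10.44.202.39"
)

def group_matches(matches):
    """Turn (ip, volume) tuples from re.findall into [volume, ip, ...] rows."""
    rows = []
    for ip, volume in matches:
        if volume:
            # A volume name starts a new row.
            rows.append([volume])
        elif rows:
            # An IP is attached to the most recent volume seen.
            rows[-1].append(ip)
    return rows

rows = group_matches(re.findall(regex, test_str))
for row in rows:
    print(row)
```

IPs that occur before any volume name are dropped by this sketch; adjust the elif branch if your files can start that way.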

The expression is explained in the top-right panel of regex101.com, if you wish to explore, simplify, or modify it; there you can also watch how it matches against some sample inputs, if you like.

Here is one way to do the job:

import re

qlist = []
with open('input.txt') as f:
    for line in f:
        # search for the volume name; skip lines that have none
        m = re.search(r'/vol/(\S+)', line)
        if m is None:
            continue
        # start the row with the volume name, then add every IP on the line
        qlist.append([m.group(1)] + re.findall(r'[0-9]+(?:\.[0-9]+){3}', line))
for row in qlist:
    print(row)

Output:

['vm-01', '10.44.160.133', '10.44.160.132', '10.44.160.131', '10.44.160.130', '10.44.160.133', '10.44.160.132', '10.44.160.131', '10.44.160.130']
['vol01', '10.44.202.39', '10.44.202.73', '10.44.202.39', '10.44.202.73']
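Since the goal in the question is a CSV, the rows above could be written out with the standard csv module. The following is a minimal sketch, not part of the original answer: the names parse_lines, VOLUME_RE, and IP_RE are illustrative, the sample lines are shortened, and it writes to an in-memory buffer so it runs without input.txt.

```python
import csv
import io
import re

VOLUME_RE = re.compile(r'/vol/(\S+)')
IP_RE = re.compile(r'[0-9]+(?:\.[0-9]+){3}')

def parse_lines(lines):
    """Return one [volume, ip, ip, ...] row per line that names a volume."""
    rows = []
    for line in lines:
        m = VOLUME_RE.search(line)
        if m is None:
            continue  # no volume on this line, nothing to record
        rows.append([m.group(1)] + IP_RE.findall(line))
    return rows

# Shortened sample lines (assumption: one share per line, as in the output above).
sample = [
    "/vol/vm-01 -sec=sys,rw=10.44.160.133:10.44.160.132,root=10.44.160.133\n",
    "/vol/vol01 -sec=sys,rw=10.44.202.39,root=10.44.202.39\n",
]

# Write to an in-memory buffer; a real file object works the same way.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(parse_lines(sample))
csv_text = buffer.getvalue()
print(csv_text)
```

For real files, pass an open file object to parse_lines and replace the buffer with open('output.csv', 'w', newline=''); to cover several input files, extend one list with the rows from each before writing. Rows will have different lengths because each share has a different number of IPs.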