Fetching data from a text file in Python using the re module

Question

I have a text file named file.txt that looks like this:

0,  1,  2.     |classes
A0: 1, 2, 3
A1: 1, 2, 3
A2: 1, 2, 3
A3: 1, 2, 3, 4
| Final Pseudo Deletion Count is 0.  Optimisaiton not possible.

I only want to extract the attribute names from this file: A0, A1, A2, A3. How can I do that?

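For this particular file the names are just A0, A1, A2, A3, but I want the solution to work for files in general, where there can be A0, A1, ..., An. For example: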

0,  1,  2.     |classes
A0: 1, 2, 3
A1: 1, 2, 3
A2: 1, 2, 3
A3: 1, 2, 3, 4
A4: 1, 2, 3
A5: 1, 2, 3, 4
| Final Pseudo Deletion Count is 0.  Optimisaiton not possible.

So in this case, the output would contain A0, A1, A2, A3, A4, A5.

This is what I have tried:

f = open('filename1.txt')
attrib1 = f.readline()
    
attrib = []
for i in range(1, len(attrib1)-1):
    v_pos_colon = attrib1[i].find(':')
    attrib.append(attrib[i][0:v_pos_colon])
print(attrib)

Answer 1

You are iterating over the characters of the first line of the file, not over the lines of the file.

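find() returns -1 when the substring isn't found. So when a line contains no :, you append the slice attrib[i][0:-1], which is everything except the last character. You should first test whether the character was found.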
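To make the -1 behaviour concrete, here is a small sketch using the header line from the sample file, which contains no colon. The variable names are only illustrative.

```python
# A line without a colon, taken from the sample file's header.
line = "0,  1,  2.     |classes"

pos = line.find(':')   # find() signals "not found" with -1
print(pos)

# Slicing with that -1 silently drops only the last character
# instead of failing, so the bug is easy to miss.
print(line[0:pos])
```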

attrib = []
with open('filename1.txt') as f:
    for line in f:
        if ':' in line:
            attrib.append(line.split(':')[0])
print(attrib)
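Since the question title mentions the re module, the same idea can also be sketched with a regular expression. This is only one possible approach, and it assumes that attribute names always have the form "A" followed by digits at the start of a line, immediately followed by a colon. The names ATTR_PATTERN and attribute_names are made up for this example.

```python
import re

# Assumption: names look like A0, A1, ..., An at the start of a line,
# directly followed by a colon. Adjust the pattern if names differ.
ATTR_PATTERN = re.compile(r'^(A\d+):', re.MULTILINE)

def attribute_names(text):
    """Return every attribute name found in text, in order of appearance."""
    # With a single capture group, findall() returns just the group contents.
    return ATTR_PATTERN.findall(text)

sample = """0,  1,  2.     |classes
A0: 1, 2, 3
A1: 1, 2, 3
A2: 1, 2, 3
A3: 1, 2, 3, 4
| Final Pseudo Deletion Count is 0.  Optimisaiton not possible.
"""

print(attribute_names(sample))
```

To use it on a real file, read the contents first, for example with `with open('filename1.txt') as f: names = attribute_names(f.read())`. The split(':') version above is simpler and accepts any name before the colon; the regex version is stricter and skips lines that contain a colon but do not start with an attribute name.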


python

python-3.x

python-re