When iterating through lines in a txt file, how can I capture multiple subsequent lines after a regex triggers?

Question

I have a txt file:

This is the first line of block 1. It is always identifiable
Random
Stuff

This is the first line of block 2. It is always identifiable
Is
Always

This is the first line of block 3. It is always identifiable
In
Here!

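I want to loop through each line and use the following code to trigger on a match and capture a fixed number of the lines that follow: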

for line in lines:
    match = re.compile(r'(.*)block 2.(.*)').search(line)
    if match:
        #capture current line and the following 2 lines

After parsing the txt file, I would like to return:

This is the first line of block 2
Is
Always

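In my specific example, the first line of each block is always identifiable, and every block has the same number of lines. The content of lines 2 and onward always changes, so it cannot be matched reliably with a regex.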

Answer 1

Assuming lines is an iterator, you can simply pull the following lines from it.

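import re

# Match the identifiable first line of block 2.
# \b keeps "block 20" from matching; the line continues after "block 2".
block2 = re.compile(r'block 2\b')

res = None  # stays None if no line matches
for l in lines:
    if block2.search(l):
        # Pull the next two lines from the same iterator.
        res = [l, next(lines), next(lines)]
        break

print(res)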

If lines is not an iterator (for example, a list returned by readlines()), you only need to add lines = iter(lines) to your code.
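As a minimal sketch of the iter(lines) point, the snippet below reads the lines into a list first and then wraps the list in an iterator before looping. The io.StringIO object is only a stand-in for the real txt file, and the pattern block 2\b is an assumption chosen to fit the sample text in the question.

```python
import io
import re

# Hypothetical in-memory stand-in for the txt file from the question.
sample = io.StringIO(
    "This is the first line of block 1. It is always identifiable\n"
    "Random\n"
    "Stuff\n"
    "\n"
    "This is the first line of block 2. It is always identifiable\n"
    "Is\n"
    "Always\n"
)

lines = sample.readlines()  # a plain list: next(lines) would raise TypeError
lines = iter(lines)         # now next() can be called on it

block2 = re.compile(r"block 2\b")  # assumed pattern for the trigger line
res = None
for l in lines:
    if block2.search(l):
        # The for loop and next() share one iterator, so these are
        # the two lines right after the matching line.
        res = [l, next(lines), next(lines)]
        break

print(res)
```

Note that next() raises StopIteration if the file ends before both following lines have been read, so this form relies on every block being complete.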

Answer 2

You can call the next() function to get the next element from an iterator.

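import re

def get_block2(lines):
    lines = iter(lines)  # also accept a list, e.g. from readlines()
    pattern = re.compile(r'block 2\b')  # compile once, outside the loop
    for line in lines:
        if pattern.search(line):
            # The next two items of the iterator are the rest of the block.
            line2 = next(lines)
            line3 = next(lines)
            return line, line2, line3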
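Since every block has a fixed number of lines, the same idea can be generalized. The following is only a sketch, not part of the original answers: capture_block, its parameters, and the sample text are hypothetical names introduced for illustration. It uses itertools.islice in place of repeated next() calls, so a file that ends early yields a shorter result instead of raising StopIteration.

```python
import re
from itertools import islice

def capture_block(lines, pattern, count):
    """Return the first line matching pattern plus up to count - 1 lines after it."""
    lines = iter(lines)  # works for lists and file objects alike
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            # islice takes from the same iterator and stops quietly
            # if the input runs out before count - 1 lines are read.
            following = islice(lines, count - 1)
            return [line.rstrip("\n")] + [l.rstrip("\n") for l in following]
    return []  # no line matched

text = """This is the first line of block 1. It is always identifiable
Random
Stuff

This is the first line of block 2. It is always identifiable
Is
Always
"""

block = capture_block(text.splitlines(), r"block 2\b", 3)
print(block)
```

With a real file, the call would look like capture_block(f, r"block 2\b", 3) inside a with open(...) as f: block, since a file object is already an iterator over its lines.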

