Getting a line in a large file's content

I would like to know how to implement Aaron Digulla's answer to this question: Fastest Text search method in a large text file

import re

with open('test.txt', 'rt') as myfile:
    contents = myfile.read()
    match = re.search("abc", contents)

What is the next step, so that I can find the previous EOL and the next EOL and extract the line?

You can take the start index of the match object and pass it to str.rfind and str.find, using their start and end parameters, to locate the previous and next EOL:

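import re

with open('test.txt', 'rt') as myfile:
    contents = myfile.read()
    match = re.search("abc", contents)
    if match:
        start = match.start()
        # Last newline before the match (-1 if the match is on the first line)
        previous_EOL = contents.rfind('\n', 0, start)
        # First newline after the match (-1 if the match is on the last line)
        next_EOL = contents.find('\n', start)
        if next_EOL == -1:
            next_EOL = len(contents)
        line = contents[previous_EOL+1: next_EOL]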

For example:

import re

contents = '''
This is a sample text
Here is 'abc' in this line.
There are some other lines.'''

match = re.search("abc", contents)
start = match.start()
previous_EOL = contents.rfind('\n', 0, start)
next_EOL = contents.find('\n', start)
line = contents[previous_EOL+1: next_EOL]

print(line)

This prints:

Here is 'abc' in this line.
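
The same idea can be wrapped in a small function. This is only a sketch: the name extract_line and its parameters are made up for illustration and are not part of the original answer. It returns None when nothing matches, and it treats the end of the text as the end of the line when the match sits on a last line that has no trailing newline (in that case str.find returns -1, which would otherwise slice off the final character).

```python
import re

def extract_line(contents, pattern):
    """Return the whole line containing the first match of pattern, or None."""
    match = re.search(pattern, contents)
    if match is None:
        return None
    start = match.start()
    # rfind returns -1 when there is no earlier newline, so adding 1 gives index 0
    line_start = contents.rfind('\n', 0, start) + 1
    line_end = contents.find('\n', start)
    if line_end == -1:
        # The match is on the last line and the text has no trailing newline
        line_end = len(contents)
    return contents[line_start:line_end]

sample = "first abc\nsecond\nthird xyz"
print(extract_line(sample, "abc"))      # match on the first line
print(extract_line(sample, "xyz"))      # match on the last line, no trailing newline
print(extract_line(sample, "missing"))  # no match
```

The first and last lines need no special casing beyond the -1 check, because the slice bounds are computed from the same two searches in every case.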

Replace

match = re.search("abc", contents)

with

match = re.search("^.*abc.*$", contents, re.M)

This matches the entire line containing "abc". With the re.M flag, ^ matches at the start of each line and $ at its end.

Here is an example:

import re

s = """
Twinkle, twinkle
Little star!
How I wonder 
What you are!
"""

term = "star"
# re.escape keeps regex metacharacters in the search term from being interpreted
match = re.search(f"^.*{re.escape(term)}.*$", s, re.M)
print(match.group(0))

This gives:

Little star!
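
If every matching line is wanted rather than only the first, the same pattern can be used with re.finditer. The following is a rough sketch, not something taken from the answer above: the function name matching_lines is hypothetical, and it assumes the search term is a literal string, so it is passed through re.escape. The last two lines show what happens when re.M is left out: ^ then matches only at the very start of the text, so a term on a later line is not found.

```python
import re

def matching_lines(text, term):
    """Return every line of text that contains the literal string term."""
    # re.M lets ^ and $ match at each line boundary; '.' does not cross newlines
    pattern = re.compile(f"^.*{re.escape(term)}.*$", re.M)
    return [m.group(0) for m in pattern.finditer(text)]

s = "Twinkle, twinkle\nLittle star!\nHow I wonder\nWhat you are!\nLike a star (in the sky)\n"

print(matching_lines(s, "star"))  # two lines contain "star"
print(matching_lines(s, "(in"))   # the parenthesis is treated literally

# Without re.M the pattern is anchored to the start of the whole text
print(re.search("^.*star.*$", s))
```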