Is there a faster way to extract lines from a file?
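I have a set of files that I need to search, extracting certain lines. Right now I'm using a for
loop, but it is turning out to be very expensive in terms of time. Is there a faster way than the code below?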
import re
for file in files:
    localfile = open(file, 'r')
    for line in localfile:
        if re.search("Common English Words", line):
            words = line.split("|")[0]
            # Append words to file words.txt
            open("words.txt","a+").write(words + "\n")
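First, you are creating a new file descriptor every time you write to words.txt.
I ran some tests and found that Python's garbage collection does close open file descriptors once they become unreachable (at least in my test cases).
However, creating a file descriptor every time you want to append to the file is expensive. For future reference, opening files with a with ... as block is considered good practice.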
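To get a rough feel for that cost, here is a minimal sketch that writes the same lines both ways and times each. The helper names, the temporary paths and the sample data are all made up for illustration, and the reopening variant closes the file explicitly with `with` instead of relying on garbage collection as the question's code does. The timings will vary by machine and filesystem, so treat them as indicative only.

```python
import os
import tempfile
import time

def write_reopening(path, lines):
    # Opens (and closes) the output file once per line, similar to the question.
    for line in lines:
        with open(path, "a+") as f:
            f.write(line + "\n")

def write_single_open(path, lines):
    # Opens the output file once and reuses the same handle for every line.
    with open(path, "a+") as f:
        for line in lines:
            f.write(line + "\n")

def timed(func, path, lines):
    # Returns the wall-clock seconds taken by func(path, lines).
    start = time.perf_counter()
    func(path, lines)
    return time.perf_counter() - start

sample_lines = ["word%d" % i for i in range(2000)]  # made-up data

with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "reopen.txt")
    path_b = os.path.join(tmp, "single.txt")
    t_reopen = timed(write_reopening, path_a, sample_lines)
    t_single = timed(write_single_open, path_b, sample_lines)
    # Both approaches should produce byte-identical output.
    with open(path_a) as a, open(path_b) as b:
        same_output = a.read() == b.read()

print(f"reopen per line: {t_reopen:.4f}s")
print(f"single open:     {t_single:.4f}s")
print("same output:", same_output)
```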
TLDR:
One improvement you can make is to open the file you are writing to only once.
Here is what that looks like:
import re
with open("words.txt","a+") as words_file:
    for file in files:
        localfile = open(file, 'r')
        for line in localfile:
            if re.search("Common English Words", line):
                words = line.split("|")[0]
                # Append words to file words.txt
                words_file.write(words + "\n")
As I said, using the with ... as statement when opening files is considered best practice. We can apply that practice fully like this:
import re
with open("words.txt","a+") as words_file:
    for file in files:
        with open(file, 'r') as localfile:
            for line in localfile:
                if re.search("Common English Words", line):
                    words = line.split("|")[0]
                    # Append words to file words.txt
                    words_file.write(words + "\n")
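If you want something you can run end to end, the same loop can be wrapped in a function and tried on a small sample file. This is only a sketch: `extract_first_fields`, `NEEDLE` and the sample data are names and values I made up. It also swaps `re.search` for a plain `in` substring test, which should behave the same here only because the pattern from the question is a literal string with no regex metacharacters. Whether that is noticeably faster on your files is something you would have to measure.

```python
import os
import tempfile

NEEDLE = "Common English Words"  # literal text taken from the question

def extract_first_fields(files, out_path, needle=NEEDLE):
    """Append the first '|'-separated field of every matching line to out_path."""
    count = 0
    with open(out_path, "a+") as words_file:
        for file in files:
            with open(file, "r") as localfile:
                for line in localfile:
                    # Substring test; matches re.search only because the
                    # needle contains no regex metacharacters (an assumption).
                    if needle in line:
                        words_file.write(line.split("|")[0] + "\n")
                        count += 1
    return count

# Usage with made-up sample data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "sample.txt")
    with open(src, "w") as f:
        f.write("apple banana|Common English Words|x\n")
        f.write("zzz|Rare Words|y\n")
        f.write("cat dog|Common English Words\n")
    out = os.path.join(tmp, "words.txt")
    matched = extract_first_fields([src], out)
    with open(out) as f:
        result = f.read().splitlines()

print(matched, result)
```

Both files are opened through `with`, so they are closed even if an exception is raised partway through.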