Reading the first N lines in a file without opening it (Python)
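I have a Python script that needs to read a section of a very large text file, starting at line N and ending at line N+X.
I don't want to use "open('file')", because that would load the entire contents into memory, which would both take too long and waste too much memory.
My script runs on a Unix machine, so I currently use the native head and tail commands, i.e.: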
cmd = 'tail -n +{N} {filePath} | head -n {X}'.format(N=N, X=X, filePath=filePath)
section = subprocess.check_output(cmd, shell=True)
But it feels like there must be a smarter way to do this.
Is there a way to get lines N through N+X of a text file in Python without opening the entire file?
Thanks!
The answer to your question is located here: How to read large file, line by line in python
with open(...) as f:
    for line in f:
        <do something with line>
The with statement handles opening and closing the file, including if
an exception is raised in the inner block. The for line in f treats
the file object f as an iterable, which automatically uses buffered IO
and memory management so you don't have to worry about large files.
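Applied to the question, the same loop can stop as soon as line N+X has been read, so the rest of the file is never touched. A minimal sketch, assuming 1-based line numbers; the function name lines_between is made up here for illustration and does not come from the linked answer:

```python
def lines_between(path, n, x):
    """Return lines n through n+x (1-based, inclusive) of the file at path."""
    wanted = []
    with open(path) as f:
        # enumerate(f, start=1) pairs each line with its 1-based line number
        for lineno, line in enumerate(f, start=1):
            if lineno > n + x:
                break  # past the requested range: stop reading the file
            if lineno >= n:
                wanted.append(line.rstrip('\n'))
    return wanted
```

Only the lines inside the range are kept; every other line is discarded as soon as the loop moves on.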
Python's islice() is well suited to this:
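from itertools import islice

N = 2   # first line wanted (1-based)
X = 5   # number of further lines after line N

with open('large_file.txt') as f_input:
    # islice is 0-based and excludes the stop index, so this yields lines N..N+X
    for row in islice(f_input, N - 1, N + X):
        print(row.strip())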
This skips over all of the initial lines and returns only the lines you are interested in.
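If the range is needed in more than one place, the islice() call can be wrapped in a small helper. This is only a sketch under the same 1-based numbering; read_line_range is a hypothetical name, not part of itertools:

```python
from itertools import islice


def read_line_range(path, n, x):
    """Return lines n through n+x (1-based, inclusive) as a list of strings."""
    with open(path) as f_input:
        # islice uses 0-based positions and excludes the stop index,
        # so lines n..n+x correspond to islice(f_input, n - 1, n + x).
        return [row.rstrip('\n') for row in islice(f_input, n - 1, n + x)]
```

With n = 2 and x = 5 this returns lines 2 to 7, six lines in total. The skipped lines are still read from disk, because line boundaries can only be found by scanning, but they are dropped one at a time instead of being held in memory, and nothing after line n+x is read into the result.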