Python: Parsing multi-line records in a for loop with str.rstrip() using generators
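I need to strip the newline characters from the lines of a large file f, which I parse two lines at a time. For example, like this:
def foo(f):
    with open(f, "r") as f:
        for n, s in zip(f, f):
            # something with `n` and `s`
            pass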
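Is it possible to apply str.rstrip directly in the for loop line, or do I need to do it separately inside the loop body? This does not work, because rstrip is a string method and f is a file object:
for n, s in zip(f.rstrip(), f.rstrip()):
(Question updated to make it more concise.)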
Update:
These solutions come from Barmar's answer and PaulCornelius's comment below:
def foo2(f):
    with open(f, "r") as f:
        for n, s in zip((line.rstrip() for line in f), (line.rstrip() for line in f)):
            # Do something with `n` and `s`
            pass
and
def foo3(f):
    with open(f, "r") as f:
        g = (line.rstrip() for line in f)
        for n, s in zip(g, g):
            # Do something with `n` and `s`
            pass
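A small self-contained check of the foo3 pattern may help here. This is only a sketch: io.StringIO stands in for a real file, and the helper name paired_lines and the sample text are made up for illustration.

```python
import io


def paired_lines(f):
    """Return an iterator of (first, second) tuples of consecutive stripped lines."""
    # One shared generator: every step of zip() pulls two items from it.
    g = (line.rstrip() for line in f)
    return zip(g, g)


# Hypothetical sample data standing in for a large file.
sample = io.StringIO("name1\nseq1\nname2\nseq2\n")
pairs = list(paired_lines(sample))
print(pairs)  # [('name1', 'seq1'), ('name2', 'seq2')]
```

Note that rstrip() with no argument removes all trailing whitespace, not only the newline; rstrip("\n") is the narrower choice if trailing spaces matter. Also, if the file has an odd number of lines, zip() stops early and the last unpaired line is dropped.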
Update 2:
If you want to parse several files, you can make one generator per file (two files here):
def foo4(f1, f2):
    with open(f1, "r") as f1, open(f2, "r") as f2:
        g1, g2 = (line.rstrip() for line in f1), (line.rstrip() for line in f2)
        for n1, s1, n2, s2 in zip(g1, g1, g2, g2):
            # Do something with `n1`, `s1`, `n2` and `s2`
            pass
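To try the two-file version end to end, something like the following should work. It is a sketch under assumptions: the function name read_paired, the temporary file names, and the file contents are all invented for the example.

```python
import os
import tempfile


def read_paired(path1, path2):
    """Collect (n1, s1, n2, s2) tuples, taking two lines from each file per step."""
    with open(path1, "r") as f1, open(path2, "r") as f2:
        g1 = (line.rstrip() for line in f1)
        g2 = (line.rstrip() for line in f2)
        # Build the list inside the with block: the generators need the files open.
        return list(zip(g1, g1, g2, g2))


# Write two small hypothetical input files into a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    p1 = os.path.join(tmp, "a.txt")
    p2 = os.path.join(tmp, "b.txt")
    with open(p1, "w") as out:
        out.write("n1\ns1\nn2\ns2\n")
    with open(p2, "w") as out:
        out.write("N1\nS1\nN2\nS2\n")
    rows = read_paired(p1, p2)

print(rows)  # [('n1', 's1', 'N1', 'S1'), ('n2', 's2', 'N2', 'S2')]
```

zip() stops as soon as any of its inputs runs out, so if one file is shorter, the remaining records of the longer one are silently ignored.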
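You can use a list comprehension:
for n1, n2 in zip([line.rstrip() for line in f1], [line.rstrip() for line in f1]):
But this will not process the file two lines at a time. Each list comprehension reads the entire file, so n1 and n2 will never be a pair of consecutive lines. In fact, since both comprehensions iterate over the same file object, the first one exhausts it and the second list ends up empty, so the loop body never runs.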
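A quick way to see what the list-comprehension version does is to run it against an in-memory file. In this sketch, io.StringIO stands in for the open file and the sample text is made up. Both comprehensions read from the same file object, so the first one takes every line and the second gets nothing.

```python
import io

# Hypothetical four-line input standing in for the open file f1.
f1 = io.StringIO("a\nb\nc\nd\n")

first = [line.rstrip() for line in f1]   # eager: reads through to end of file
second = [line.rstrip() for line in f1]  # the file is already exhausted
pairs = list(zip(first, second))         # zip() stops at the shorter input

print(first)   # ['a', 'b', 'c', 'd']
print(second)  # []
print(pairs)   # []
```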
I think you can get what you want with generator expressions, because they are lazy:
for n1, n2 in zip((line.rstrip() for line in f1), (line.rstrip() for line in f1)):
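The laziness can be made visible with a small experiment. This is only an illustration: logged_lines is a hypothetical wrapper that records each line as it is actually read, and io.StringIO replaces the real file.

```python
import io


def logged_lines(f, log):
    """Yield lines from f, recording each one in log at the moment it is read."""
    for line in f:
        log.append(line.rstrip())
        yield line


log = []
src = logged_lines(io.StringIO("a\nb\nc\nd\n"), log)

# Two separate generator expressions over the same underlying iterator.
pairs = zip((line.rstrip() for line in src), (line.rstrip() for line in src))

read_before = list(log)   # nothing has been read yet
first_pair = next(pairs)  # pulls exactly two lines, one per generator

print(read_before)  # []
print(first_pair)   # ('a', 'b')
print(log)          # ['a', 'b']
```

Because each generator expression only fetches a line when zip() asks for one, the two of them take turns on the shared source, which is what produces consecutive lines in each pair.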