How can I sum integers from multiple text files into a new text file using Python?

Question

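I am currently trying to take ten different text files (file2_0.txt, file2_1.txt, file2_2.txt, ...), each containing one column and one hundred million rows of random integers, and add them together row by row. I want to sum each row across all ten files and produce a new text file (total_file.txt) that contains the sum for each row. Below is an example of what I am trying to do, using two files added together to create total_file.txt.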

file2_0.txt

file2_1.txt

total_file.txt

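Because these files are quite large, I am not trying to read them into memory and am using concurrency instead. I found example code in another Stack Overflow question, which I tried out before processing all of the files at once. The problem I am having is that the output (total_file.txt) contains only the numbers from the second text file (file2_1.txt), with nothing added to them. I am not sure why. I am new to Stack Overflow and to coding in general, and I wanted to ask about this on the linked post, but I read online that this is not good practice. Below is the code I have been working with.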

import shutil
#Files to add
filenames = ['file2_0.txt', 'file2_1.txt']
sums = []

with open('file2_0.txt') as file:
    for row in file:
        sums.append(row.split())
#Create output file
with open('total_file.txt', 'wb') as wfd:
    for file in filenames:
        with open(file) as open_file:
            for i, row in enumerate(open_file):
                sums[i] = sums[i]+row.split()

    with open(file, 'rb') as fd:
        shutil.copyfileobj(fd, wfd)

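Just for background, I am working with these large files to test processing speed. Once I understand what I am doing wrong, I will move on to parallel processing, specifically multithreading, to compare various processing speeds. Please let me know if you need any further information from me.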

Answer 1

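I would use generators so that you don't have to load all of the files into memory at once (in case they are large).

Then pull the next value from each generator, sum them, write the result, and carry on. When you reach the end of a file, you will get a StopIteration exception and be done.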

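def read_file(file):
  # Yield the file one line at a time so it is never fully loaded into memory.
  with open(file, "r") as inFile:
    for row in inFile:
      yield row

file_list = ["file1.txt", "file2.txt", ..., "file10.txt"]
# One generator per input file.
file_generators = [read_file(path) for path in file_list]

with open("totals.txt", "w+") as outFile:
  while True:
    try:
      # Pull the next line from every file, convert to int, and write the sum.
      outFile.write(f"{sum([int(next(gen)) for gen in file_generators])}\n")
    except StopIteration:
      # Raised as soon as any file runs out of lines.
      break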
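
The same row-by-row idea can also be sketched with the built-in zip() over open file objects, which avoids the explicit StopIteration handling. This is only a minimal sketch, assuming every line holds exactly one integer and all files have the same number of lines (zip() stops at the shortest file). The function name sum_files_by_line is hypothetical and not part of the original code.

```python
from contextlib import ExitStack


def sum_files_by_line(input_paths, output_path):
    """Write the line-by-line sum of the integers in input_paths to output_path.

    Hypothetical helper for illustration; assumes one integer per line.
    """
    with ExitStack() as stack:
        # Open every input file; ExitStack closes all of them on exit.
        inputs = [stack.enter_context(open(path)) for path in input_paths]
        out = stack.enter_context(open(output_path, "w"))
        # zip() reads one line from each file per step, so only one row
        # per file is held in memory at any time.
        for rows in zip(*inputs):
            out.write(f"{sum(int(row) for row in rows)}\n")
```

For the files in the question, it could be called as sum_files_by_line([f"file2_{i}.txt" for i in range(10)], "total_file.txt"). As for the code in the question: sums[i] + row.split() concatenates lists of strings rather than adding integers, sums is never written out, and the only data that reaches total_file.txt comes from shutil.copyfileobj, which runs after the loop and therefore copies just the last file.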


python

sum

text-files