Python 3 - CSV writer loop won't close after CSV reader reaches end of file

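I'm trying to write a program that splits a large CSV file into smaller files. My function works well, except that it never closes the last file, which means it never finishes writing to that file. Here's what I've got: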

import csv

# length of original file = 1000 rows
length_of_new_file = 100  # rows


def file_splitter(file_name, desired_length):
    with open("{}".format(file_name), 'r') as original_file:
        header = original_file.readline()
        file_reader = csv.reader(original_file,dialect='excel')
        file_count = 0
        new_name = 'split_file_test'
        loop = 0
        while file_reader:
            with open("{}{}.csv".format(new_name, file_count), 'w', newline='') as new_file:
                new_file.write(header)
                csv_writer = csv.writer(new_file, delimiter=',')
                for line in file_reader:
                    if loop == (desired_length-1):
                        csv_writer.writerow(line)
                        new_file.close()
                        file_count += 1
                        loop = 0
                        break
                    else:
                        csv_writer.writerow(line)
                        loop += 1


test_file = 'zlotsacontacts.csv'

file_splitter(test_file, length_of_new_file)

I've tried adding new_file.close(), but no matter where I put it, the last file never seems to close. I've also tried different conditions on the outermost while loop, such as:

while file_reader != '':

while file_reader not None:

But as far as I can tell, the CSV module doesn't recognize None values. I'm not sure what I can do to end this loop!

A with open block closes the file automatically when the block finishes.

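The while loop runs forever because the only condition it checks is while file_reader.

file_reader is an existing object, so it always evaluates as true, even after every row has been read.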
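To make the point concrete, here is a minimal sketch using an in-memory file (io.StringIO and the sample data are stand-ins, not part of the original question). It shows that the reader object stays truthy after it has been exhausted, so it cannot serve as a loop condition on its own:

```python
import csv
import io

# Hypothetical in-memory CSV standing in for the real file.
sample = io.StringIO("name,email\nann,ann@example.com\n")
reader = csv.reader(sample)

rows = list(reader)              # consume every row, header included
still_truthy = bool(reader)      # the exhausted reader is still a truthy object
leftover = next(reader, None)    # None signals that no rows remain

print(len(rows), still_truthy, leftover)
```

Checking next(reader, None) for None is one way to detect the end of the data; looping over the reader with for is another.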

A better approach is a loop that depends on the number of files.

Something like:

while file_count < number_of_files:
     ...

Or, as a complete example:

num_files = 5

count = 0

while count < num_files:
    print(count)
    count += 1

This way the while loop ends once it has gone through all the files, and the last file finally gets closed.

If you need to find out how many rows are in the file, you can count them like this:

import csv

with open('lines.csv') as lines:
    l = csv.reader(lines) # will read in larger files much better
    row_count = sum(1 for row in l) - 1 # -1 to not count the header row, if it exists.
print(row_count)
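Once you have the row count, the number of output files follows from a ceiling division. This is only a sketch: count_output_files is a hypothetical helper name, and it assumes row_count excludes the header row.

```python
import math


def count_output_files(row_count, desired_length):
    """Return how many files are needed to hold row_count data rows."""
    # Round up so a final, partially filled file is counted too.
    return math.ceil(row_count / desired_length)


print(count_output_files(1000, 100))  # 10
print(count_output_files(1001, 100))  # 11
```

The result can then be used as number_of_files in the while file_count < number_of_files loop.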

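I should have spent more time thinking this through. By making 'for line' the outermost loop, I can check whether an output file needs to be opened (and close it once it is full), which fixed the infinite loop: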

def file_splitter(submitted_file, desired_length):
    with open(submitted_file, 'r') as original_file:
        header = original_file.readline()
        file_reader = csv.reader(original_file, dialect='excel')
        file_count = 0
        new_name = 'a_file_test'
        loop = 0
        new_file = None
        csv_writer = None
        for line in file_reader:
            if new_file is None or loop == 0:
                new_file = open('{0}{1}.csv'.format(new_name, file_count), 'w', newline='')
                new_file.write(header)
                csv_writer = csv.writer(new_file, delimiter=',')
            csv_writer.writerow(line)
            loop += 1
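            if loop == desired_length:  # this output file is full
                new_file.close()
                file_count += 1
                loop = 0
        # Close the last, partially filled file if one is still open.
        if new_file is not None and not new_file.closed:
            new_file.close()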
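
As an alternative, here is a rough sketch of the same idea using itertools.islice to pull a fixed number of rows at a time, so each output file is opened and closed by its own with block. The name split_csv, its parameters, and the returned list of paths are my own choices for illustration, not something from the code above:

```python
import csv
from itertools import islice


def split_csv(source_path, rows_per_file, prefix="split_file_test"):
    """Split source_path into files of at most rows_per_file data rows each."""
    created = []
    with open(source_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:  # empty input: nothing to split
            return created
        while True:
            # islice stops early when the reader runs out of rows.
            chunk = list(islice(reader, rows_per_file))
            if not chunk:
                break
            out_path = "{}{}.csv".format(prefix, len(created))
            with open(out_path, "w", newline="") as dst:
                writer = csv.writer(dst)
                writer.writerow(header)
                writer.writerows(chunk)
            created.append(out_path)
    return created
```

Here the loop ends when islice returns an empty chunk, so there is no need to track whether a file is still open.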