Function only prints values for four files from a directory of 5K files (Python)

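I am running code that takes each row of a CSV and looks for an exact match of the entity in every file of a directory. The problem is that the code stops after printing the matched values for four files, while the directory contains 5K files. I think the problem is with my break or continue statement. Can someone help me fix this? The code so far: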

import csv
import os
import re


path = r'C:\Users\Lenovo\.spyder-py3\5KFILES'

with open(r'C:\Users\Lenovo\.spyder-py3\codes_file.csv', newline='', encoding='utf-8') as myFile:
    reader = csv.reader(myFile)
    for filenames in os.listdir(path):
        with open(os.path.join(path, filenames), encoding = 'utf-8') as my:
            content = my.read().lower()
            #print(content)
            for row in reader:
                if len(row[1]) >= 4:
                    #v = re.search(r'(?<!\w){}(?!\w)'.format(re.escape(row[1])), content, re.I)
                    v = re.search(r'\b' + re.escape(row[1]) + r'\b', content, re.IGNORECASE)
                    if v: 
                        print(filenames,v.group(0))
                        break

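reader is created before the for loop, and it is an iterator. Each time execution reaches the for row in reader line, iteration resumes from where it last stopped. Once the end of reader has been reached, every later for loop over it is empty, so no further files can produce a match.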

You can see what happens in this short example:

l = [0, 1, 2, 3, 4, 5]
iterator = iter(l)

for i in range(0, 16, 2):
    print('i:', i, "- starting the 'for j ...' loop")
    for j in iterator:
        print('iterator:', j)
        if j == i:
            break

i: 0 - starting the 'for j ...' loop
iterator: 0
i: 2 - starting the 'for j ...' loop
iterator: 1
iterator: 2
i: 4 - starting the 'for j ...' loop
iterator: 3
iterator: 4
i: 6 - starting the 'for j ...' loop
iterator: 5
i: 8 - starting the 'for j ...' loop
i: 10 - starting the 'for j ...' loop
i: 12 - starting the 'for j ...' loop
i: 14 - starting the 'for j ...' loop

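Each time the inner for loop runs, it keeps iterating over iterator from where it previously stopped. Once the iterator is exhausted, the for j ... loop is empty.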
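The same behaviour can be checked on csv.reader itself. This is a minimal sketch in which an io.StringIO object with made-up rows stands in for the opened CSV file; both are assumptions for illustration only.

```python
import csv
import io

# io.StringIO stands in for the opened CSV file (illustrative data only).
fake_file = io.StringIO("1,alpha\n2,beta\n3,gamma\n")
reader = csv.reader(fake_file)

first_pass = [row for row in reader]   # consumes every row
second_pass = [row for row in reader]  # reader is already exhausted

print(len(first_pass), len(second_pass))
```

The second pass gets no rows, which is what happens to every file after the reader has run out in the question's code.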

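You should restart it on each pass. Note that a new csv.reader continues from the file's current position, so rewind the file first:

myFile.seek(0)
for row in csv.reader(myFile):
    ....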
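One detail worth checking when restarting the reader: a fresh csv.reader reads from wherever the underlying file currently is, so the file has to be rewound with seek(0). A small sketch, again using io.StringIO with made-up rows as a stand-in for the real file:

```python
import csv
import io

# Illustrative stand-in for the opened CSV file.
fake_file = io.StringIO("1,alpha\n2,beta\n")

rows_first = list(csv.reader(fake_file))      # reads to the end of the file
rows_no_rewind = list(csv.reader(fake_file))  # new reader, but the file is still at its end

fake_file.seek(0)                             # rewind before creating the next reader
rows_after_rewind = list(csv.reader(fake_file))

print(len(rows_first), len(rows_no_rewind), len(rows_after_rewind))
```

Re-reading the CSV for each of 5K files also repeats the same parsing work, which is one reason the list approach may be preferable.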

Or build a list once:

reader = list(csv.reader(myFile))

....

for row in reader:
    ....
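Putting the list approach together with the question's code might look like the sketch below. The function name find_first_matches and the min_len parameter are hypothetical, introduced only for illustration; the second-column lookup, the four-character minimum, and the word-boundary regex follow the question. Patterns are compiled once up front, which is a design choice of this sketch rather than something the original does.

```python
import csv
import os
import re


def find_first_matches(csv_path, directory, min_len=4):
    """Return (filename, matched_text) for the first CSV term found in each file."""
    # Read all rows into a list once so they can be scanned again for every file.
    with open(csv_path, newline='', encoding='utf-8') as csv_file:
        rows = list(csv.reader(csv_file))

    # Build one pattern per usable term (second column, at least min_len characters).
    patterns = [
        re.compile(r'\b' + re.escape(row[1]) + r'\b', re.IGNORECASE)
        for row in rows
        if len(row) > 1 and len(row[1]) >= min_len
    ]

    results = []
    for filename in sorted(os.listdir(directory)):
        with open(os.path.join(directory, filename), encoding='utf-8') as f:
            content = f.read().lower()
        for pattern in patterns:
            match = pattern.search(content)
            if match:
                results.append((filename, match.group(0)))
                break  # leaves only the inner loop; the next file sees every pattern again
    return results
```

Calling find_first_matches(r'C:\Users\Lenovo\.spyder-py3\codes_file.csv', r'C:\Users\Lenovo\.spyder-py3\5KFILES') and printing each pair would correspond to the print call in the question. The break is harmless here because the list of patterns is restarted from the beginning for each file.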