The main thread does not wait for all the ThreadPoolExecutor threads to finish

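The threads created by ThreadPoolExecutor return after the first iteration of the for loop, and the main thread does not wait for the whole for loop to complete. On further inspection I noticed that if I replace re.sub with some dummy prints to stdout, the loop runs to completion. What is the problem with using re.sub() in a thread?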

import concurrent.futures
import threading


def process_file(file):
    with open(file, 'rb+') as in:
        mm = mmap.mmap(in.fileno(),0)
        for i in range(len(global_list)):
            mm = re.sub(global_list[i], global_ch_list[i],mm)

    with open(file, 'wb+') as out:
        out.write(mm)


def process_all_files(files):
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
        executor.map(process_file, files)


process_all_files(files)

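Your code contains various errors, but they are being silenced. To see them, you need to consume the iterator returned by Executor.map. Quoting the manual: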

If a func call raises an exception, then that exception will be raised when its value is retrieved from the iterator.

For example:

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    results = executor.map(process_file, files)
    print(list(results))
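
If it helps to see which input failed rather than stopping at the first exception, one possible variation is to use submit() with as_completed() and inspect each future individually. The following is only a minimal sketch: flaky_worker and run_all are hypothetical names, and the worker is a stand-in for process_file that fails in one plausible way (a str pattern passed to re.sub together with bytes data, which raises TypeError). Whether this is the actual error in the question's code depends on what global_list contains, which the question does not show.

```python
import concurrent.futures
import re


def flaky_worker(data):
    # Hypothetical stand-in for process_file: applying a str pattern to
    # bytes input makes re.sub raise TypeError inside the worker thread.
    return re.sub("a", "b", data)


def run_all(items):
    """Run flaky_worker on every item; collect results and errors separately."""
    results, errors = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
        # submit() returns one Future per item, so each failure can be
        # attributed to the input that caused it.
        futures = {executor.submit(flaky_worker, item): item for item in items}
        for future in concurrent.futures.as_completed(futures):
            item = futures[future]
            try:
                # result() re-raises any exception raised in the worker.
                results[item] = future.result()
            except Exception as exc:
                errors[item] = exc
    return results, errors


results, errors = run_all(["banana", b"cabana"])
for item, exc in errors.items():
    print(f"{item!r} failed: {exc!r}")
```

With this shape, a failure in one task does not hide the outcome of the others, and the exception object is available for logging or re-raising once all tasks have finished.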