Why is I/O not overlapped with computation in a multithreaded Python application?

Question

I wrote a simple Python script to test the overlap between an I/O-bound thread and a CPU-bound thread. Here is the code:

from datetime import datetime
import threading
import shutil
import os


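def cpuJob(start, end):
    # CPU-bound work: add up the integers from start to end in pure Python.
    counter = start
    total = 0  # named "total" so the built-in sum() is not shadowed
    while counter <= end:
        total += counter
        counter += 1
    return total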


def ioJob(from_path, to_path):
    if os.path.exists(to_path):
        shutil.rmtree(to_path)
    shutil.copytree(from_path, to_path)

startTime=datetime.now()

Max=120000000
threadCount=2

if threadCount==1:
    t1 = threading.Thread(target=cpuJob, args=(1,Max))
    # t1 = threading.Thread(target=ioJob, args=(r"d:\1", r"d:\2"))
    t1.start()
    t1.join()
else:
    # Raw strings: in a normal string literal "\1" is an octal escape, not a backslash and a digit.
    t1 = threading.Thread(target=ioJob, args=(r"d:\1", r"d:\2"))
    t2 = threading.Thread(target=cpuJob, args=(1,Max))
    t1.start()
    t2.start()
    t1.join()
    t2.join()

endTime=datetime.now()

diffTime = endTime - startTime

print("Execution time for " , threadCount , " threads is: " , diffTime)

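If I run the threads separately (threadCount==1), each one takes about 12-13 seconds to finish on my Windows laptop. But when I run them together (threadCount==2), it takes about 20-22 seconds. As far as I know, Python releases the GIL before performing any blocking I/O operation. If the GIL is released before the I/O, why do I get such performance?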

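Edit 1: As suggested in the comments, I checked the code of shutil. It seems that the GIL is not released in this package's implementation. Why is that? The code of a shell utilities package should lie outside the Python runtime implementation, shouldn't it?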

Answer 1

... why I get such performance ?

See https://docs.python.org/3/library/threading.html:

CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing or concurrent.futures.ProcessPoolExecutor. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.

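Your code runs in a non-preemptive framework, and it never gives up control until it exits, so no other thread gets scheduled until then. You used some threading machinery, but you might as well have written a two-line sequential function that calls io_job() followed by cpu_job().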

What you are looking for is multiprocessing.
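A minimal sketch of that suggestion, assuming the same two jobs as in the question: the CPU-bound loop runs in a child process, which has its own interpreter and its own GIL, while the parent process copies the directory tree. The names cpu_job, io_job and run_overlapped and the temporary demo tree are illustrative assumptions, not code from the question.

```python
import multiprocessing
import os
import shutil
import tempfile


def cpu_job(start, end):
    """CPU-bound: sum the integers in [start, end] with a pure-Python loop."""
    total = 0
    for value in range(start, end + 1):
        total += value
    return total


def io_job(from_path, to_path):
    """I/O-bound: copy a directory tree, replacing any existing destination."""
    if os.path.exists(to_path):
        shutil.rmtree(to_path)
    shutil.copytree(from_path, to_path)


def run_overlapped(from_path, to_path, upper):
    """Run cpu_job in a child process while io_job runs in this process."""
    with multiprocessing.Pool(processes=1) as pool:
        pending = pool.apply_async(cpu_job, (1, upper))  # child has its own GIL
        io_job(from_path, to_path)                       # parent copies meanwhile
        return pending.get()                             # wait for the sum


if __name__ == "__main__":
    # Hypothetical demo data: a small temporary tree instead of the real folders.
    with tempfile.TemporaryDirectory() as base:
        src = os.path.join(base, "src")
        dst = os.path.join(base, "dst")
        os.makedirs(src)
        with open(os.path.join(src, "a.txt"), "w") as handle:
            handle.write("hello")
        print(run_overlapped(src, dst, 1_000_000))
```

The `if __name__ == "__main__":` guard matters on Windows, where child processes are started by re-importing the main module. Whether the total run time actually drops depends on the machine, the disk and the size of the tree.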

Also, if you really want to copy file trees with a tool such as rsync, consider using gmake -jN or GNU parallel (sudo apt install parallel). Here is an example command:

$ find . -name '*.txt' -type f | parallel gzip -v9

Both make and /usr/bin/parallel let you specify the number of concurrent workers, and both keep pulling new tasks from the queue whenever a worker finishes one.

Answer 2

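According to /usr/lib/python3.6/shutil.py on my machine, functions such as rmtree and copytree appear to be implemented in Python code (for example, _rmtree_unsafe). The underlying APIs behind rmtree and the others are calls such as os.listdir and os.unlink.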
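To make that structure concrete, here is a simplified sketch of a tree removal written only with os-level calls. It is an illustration of the general shape, not the actual shutil source; the name remove_tree_sketch is hypothetical and error handling is omitted. The point is that every short system call is surrounded by interpreter work (loops, path joins, recursion) that needs the GIL.

```python
import os
import tempfile


def remove_tree_sketch(path):
    """Delete a directory tree using only os-level calls.

    Simplified illustration, NOT the real shutil code: no error handling.
    """
    for name in os.listdir(path):          # system call
        full = os.path.join(path, name)    # Python-level work between calls
        if os.path.isdir(full) and not os.path.islink(full):
            remove_tree_sketch(full)       # recursion happens in Python
        else:
            os.unlink(full)                # system call
    os.rmdir(path)                         # system call


if __name__ == "__main__":
    # Build a tiny temporary tree, remove it, and report whether it still exists.
    base = tempfile.mkdtemp()
    os.makedirs(os.path.join(base, "sub"))
    with open(os.path.join(base, "sub", "f.txt"), "w") as handle:
        handle.write("x")
    remove_tree_sketch(base)
    print(os.path.exists(base))
```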

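Because of the Python GIL, only one thread can run Python code at a time. So your cpuJob and ioJob cannot run simultaneously (in parallel) when you try to run them as "threads", because both of them are pure Python code.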
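One way to check this on your own machine is to time two pure-Python jobs run back to back and then run as two threads. The sketch below does that; the helper names and the workload size are assumptions chosen for illustration. On a standard GIL build of CPython the two timings would be expected to come out roughly similar, but the exact numbers depend on the interpreter version and the hardware.

```python
import threading
import time


def cpu_job(start, end, results, key):
    """Pure-Python summation; stores its result in results[key]."""
    total = 0
    for value in range(start, end + 1):
        total += value
    results[key] = total


def run_sequential(upper):
    """Run the job twice, one after the other; return results and elapsed seconds."""
    results = {}
    begin = time.perf_counter()
    cpu_job(1, upper, results, "a")
    cpu_job(1, upper, results, "b")
    return results, time.perf_counter() - begin


def run_threaded(upper):
    """Run the job in two threads; return results and elapsed seconds."""
    results = {}
    threads = [
        threading.Thread(target=cpu_job, args=(1, upper, results, key))
        for key in ("a", "b")
    ]
    begin = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.perf_counter() - begin


if __name__ == "__main__":
    upper = 2_000_000  # hypothetical workload, far smaller than in the question
    _, sequential_time = run_sequential(upper)
    _, threaded_time = run_threaded(upper)
    print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```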

python-multithreading

python-3.x