Chunksize irrelevant for multiprocessing / pool.map in Python?
I am trying to utilize Python's multiprocessing Pool. Regardless of how I set the chunksize (under Windows 7 and Ubuntu; the latter, with 4 cores, is shown below), the number of parallel processes seems to stay the same.
from multiprocessing import Pool
from multiprocessing import cpu_count
import multiprocessing
import time

def f(x):
    print("ready to sleep", x, multiprocessing.current_process())
    time.sleep(20)
    print("slept with:", x, multiprocessing.current_process())

if __name__ == '__main__':
    processes = cpu_count()
    print('-' * 20)
    print('Utilizing %d cores' % processes)
    print('-' * 20)
    pool = Pool(processes)
    myList = []
    runner = 0
    while runner < 40:
        myList.append(runner)
        runner += 1
    print("len(myList):", len(myList))
    # chunksize = int(len(myList) / processes)
    # chunksize = processes
    chunksize = 1
    print("chunksize:", chunksize)
    pool.map(f, myList, 1)
The behavior is the same whether I use chunksize = int(len(myList) / processes), chunksize = processes, or 1 (as in the example above). Could it be that the chunksize is set to the number of cores automatically?

Example for chunksize = 1:
--------------------
Utilizing 4 cores
--------------------
len(myList): 40
chunksize: 10
ready to sleep 0 <ForkProcess(ForkPoolWorker-1, started daemon)>
ready to sleep 1 <ForkProcess(ForkPoolWorker-2, started daemon)>
ready to sleep 2 <ForkProcess(ForkPoolWorker-3, started daemon)>
ready to sleep 3 <ForkProcess(ForkPoolWorker-4, started daemon)>
slept with: 0 <ForkProcess(ForkPoolWorker-1, started daemon)>
ready to sleep 4 <ForkProcess(ForkPoolWorker-1, started daemon)>
slept with: 1 <ForkProcess(ForkPoolWorker-2, started daemon)>
ready to sleep 5 <ForkProcess(ForkPoolWorker-2, started daemon)>
slept with: 2 <ForkProcess(ForkPoolWorker-3, started daemon)>
ready to sleep 6 <ForkProcess(ForkPoolWorker-3, started daemon)>
slept with: 3 <ForkProcess(ForkPoolWorker-4, started daemon)>
ready to sleep 7 <ForkProcess(ForkPoolWorker-4, started daemon)>
slept with: 4 <ForkProcess(ForkPoolWorker-1, started daemon)>
ready to sleep 8 <ForkProcess(ForkPoolWorker-1, started daemon)>
slept with: 5 <ForkProcess(ForkPoolWorker-2, started daemon)>
ready to sleep 9 <ForkProcess(ForkPoolWorker-2, started daemon)>
slept with: 6 <ForkProcess(ForkPoolWorker-3, started daemon)>
ready to sleep 10 <ForkProcess(ForkPoolWorker-3, started daemon)>
slept with: 7 <ForkProcess(ForkPoolWorker-4, started daemon)>
ready to sleep 11 <ForkProcess(ForkPoolWorker-4, started daemon)>
slept with: 8 <ForkProcess(ForkPoolWorker-1, started daemon)>
Chunksize doesn't influence how many cores are used; that is set by the processes parameter of Pool. Chunksize sets how many items of the iterable you pass to Pool.map are distributed to a single worker process at once, in what Pool calls a "task" (figure below shows Python 3.7.1).
If you set chunksize=1, a worker process is fed a new item, in a new task, only after it has finished the one it received before. For chunksize > 1, a worker gets a whole batch of items at once within a task; when it is finished, it gets the next batch, if any items are left.
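To make the batching visible, here is a minimal sketch (the worker count, item count, and sleep duration are arbitrary choices for illustration) that tags each item with the name of the worker process that handled it; with chunksize=3, each worker's name shows up in contiguous runs of three:

from multiprocessing import Pool, current_process
import time

def tag(x):
    time.sleep(0.1)  # simulate some per-item work
    return x, current_process().name

if __name__ == '__main__':
    with Pool(4) as pool:
        # Each worker receives a contiguous batch of 3 items per task
        for item, worker in pool.map(tag, range(12), chunksize=3):
            print(item, worker)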
Distributing items one by one with chunksize=1 increases scheduling flexibility, but it decreases overall throughput, because drip-feeding requires more inter-process communication (IPC).
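As a rough illustration of that IPC cost, a sketch like the following (the item count and worker count are arbitrary) times Pool.map over a cheap function with different chunksize values; since the per-item work is negligible, chunksize=1 typically comes out slowest:

from multiprocessing import Pool
import time

def cheap(x):
    return x * x  # negligible work, so IPC overhead dominates

if __name__ == '__main__':
    items = list(range(100_000))
    with Pool(4) as pool:
        for cs in (1, 100, len(items) // 4):
            start = time.perf_counter()
            pool.map(cheap, items, chunksize=cs)
            print("chunksize=%d: %.3fs" % (cs, time.perf_counter() - start))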
In my in-depth analysis of Pool's chunksize algorithm, I define the unit of work for processing one item of the iterable as a taskel, to avoid naming conflicts with Pool's usage of the word "task". A task (as a unit of work) consists of chunksize taskels.
You would set chunksize=1 if you cannot predict how long a taskel will take to finish, for example in an optimization problem where processing time varies greatly across taskels. Drip-feeding here prevents a worker process from sitting on a pile of untouched items while crunching on one heavy taskel, which would keep the other items in its task from being distributed to idling worker processes.
Otherwise, if all your taskels need the same time to finish, you can set chunksize=len(iterable) // processes, so that tasks are distributed only once across all workers. Note that this will produce one more task than there are processes (processes + 1) whenever len(iterable) / processes has a remainder. This has the potential to severely impact your overall computation time; read more about it in the previously linked answer.
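To make the remainder effect concrete, this small sketch (the helper function and its test values are just for illustration) counts how many tasks such a chunksize produces:

import math

def task_count(len_iterable, processes):
    # Number of tasks Pool.map creates for chunksize = len_iterable // processes
    chunksize = len_iterable // processes
    return math.ceil(len_iterable / chunksize)

print(task_count(40, 4))  # chunksize 10, 40 / 10 -> 4 tasks (one per worker)
print(task_count(41, 4))  # chunksize 10, ceil(41 / 10) -> 5 tasks (processes + 1)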
FYI, this is the part of the source code where Pool internally calculates the chunksize if it is not set:
# Python 3.6, line 378 in `multiprocessing.pool.py`
if chunksize is None:
    chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
    if extra:
        chunksize += 1
if len(iterable) == 0:
    chunksize = 0
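Applied to the example from the question (len(myList) == 40 and 4 worker processes), this default works out to divmod(40, 16) == (2, 8), rounded up to 3; a quick sketch replicating the calculation:

def default_chunksize(len_iterable, n_workers):
    # Mirrors Pool's internal default from the snippet above
    chunksize, extra = divmod(len_iterable, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize if len_iterable else 0

print(default_chunksize(40, 4))  # -> 3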