More processes than the multiprocessing pool size: when do the extra processes run?
Say I have a pool of size 5 and a worker that I want to call 10 times. Assume the worker is very CPU-intensive and takes a few minutes to finish its task.
import multiprocessing

pool = multiprocessing.Pool(processes=5)
for i in range(10):
    pool.apply_async(pool_worker)
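After the fifth iteration the pool is full. What happens to the remaining worker calls? Are they queued until the earlier workers finish?
The documentation is not very explicit about this. In general: as soon as a process finishes and becomes free, it picks up the next available task (so yes, they are queued). If you try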
from multiprocessing import Pool
from time import sleep

def sleeping(i):
    print(f"{i} started")
    sleep(5)
    print(f"{i} ended")

if __name__ == "__main__":
    with Pool(processes=5) as p:
        results = [p.apply_async(sleeping, args=(i,)) for i in range(10)]
        # call get() inside the with block: leaving the block terminates the pool
        results = [result.get() for result in results]
then you will get output like this:
0 started
1 started
2 started
3 started
4 started
3 ended
0 ended
5 started
6 started
1 ended
7 started
2 ended
8 started
4 ended
9 started
5 ended
6 ended
7 ended
8 ended
9 ended
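To check that the extra calls are queued onto the same five processes rather than given processes of their own, one can record which process handled each task. This is a minimal sketch, not part of the original answer: the worker `tag_with_pid` and the helper `tasks_per_process` are hypothetical names introduced for illustration. With 5 workers and 10 tasks, at most 5 distinct PIDs should show up.

```python
import os
from collections import defaultdict
from multiprocessing import Pool
from time import sleep


def tag_with_pid(i):
    """Hypothetical worker: pretend to work, then report which process ran task i."""
    sleep(0.2)
    return i, os.getpid()


def tasks_per_process(pairs):
    """Group task numbers by the PID of the process that ran them."""
    grouped = defaultdict(list)
    for task, pid in pairs:
        grouped[pid].append(task)
    return dict(grouped)


if __name__ == "__main__":
    with Pool(processes=5) as p:
        pending = [p.apply_async(tag_with_pid, args=(i,)) for i in range(10)]
        # collect the results while the pool is still alive
        pairs = [r.get() for r in pending]
    for pid, tasks in tasks_per_process(pairs).items():
        print(f"process {pid} ran tasks {tasks}")
```

The PIDs and the way tasks are split between them will vary from run to run; what matters is that the same processes come back for more work.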
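Depending on the framework, it can also happen that a process terminates once it has finished its workload, a new one is started, and the new process takes over the next available task. From the docs: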
Note Worker processes within a Pool typically live for the complete duration of the Pool’s work queue. A frequent pattern found in other systems (such as Apache, mod_wsgi, etc) to free resources held by workers is to allow a worker within a pool to complete only a set amount of work before being exiting, being cleaned up and a new process spawned to replace the old one. The maxtasksperchild argument to the Pool exposes this ability to the end user.
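The effect of `maxtasksperchild` described in the note can be observed by counting worker PIDs. The following is a rough sketch under assumptions: `report_pid` and `count_distinct` are hypothetical helpers that do not appear in the original post. With `maxtasksperchild=1`, each worker should exit after a single task and be replaced, so one would expect more than 5 distinct PIDs for 10 tasks (usually 10, unless the operating system happens to reuse a PID).

```python
import os
from multiprocessing import Pool


def report_pid(i):
    """Hypothetical worker: return the PID of the process that ran task i."""
    return os.getpid()


def count_distinct(pids):
    """Number of different processes that appear in the list of PIDs."""
    return len(set(pids))


if __name__ == "__main__":
    # maxtasksperchild=1: a worker process exits after one task and is replaced
    with Pool(processes=5, maxtasksperchild=1) as p:
        pending = [p.apply_async(report_pid, args=(i,)) for i in range(10)]
        pids = [r.get() for r in pending]
    print(f"{count_distinct(pids)} distinct worker processes handled {len(pids)} tasks")
```

Leaving `maxtasksperchild` at its default of `None` keeps the workers alive for the lifetime of the pool, which is the queuing behaviour shown in the output above.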