Multiprocessing in Python 3 parallel processing and waiting on jobs

Question

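I have a piece of code that queries a database and returns a set of IDs. For each ID, I need to run a related query to get a dataset. I would like to run the queries in parallel to speed up processing. Once all the processes have run, I build a block of text and write it to a file, then move on to the next ID.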

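How do I ensure that all the processes start at the same time, and then wait for all of them to complete before moving on to the page =... and writeFile operations?
If I run it as is, I get the following error: Process object is not iterable (line 9).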

Here is what I have so far:

from helpers import *
import multiprocessing

idSet = getIDset(10) 

for id in idSet:

    ds1 = multiprocessing.Process(target = getDS1(id))
    ds1list1, ds1Item1, ds1Item2 = (ds1)

    ds2 = multiprocessing.Process(target = getDS2(id))
    ds3 = multiprocessing.Process(target = getDS3(id))
    ds4 = multiprocessing.Process(target = getDS4(id))
    ds5 = multiprocessing.Process(target = getDS5(id))

    movefiles = multiprocessing.Process(moveFiles(srcPath = r'Z://', src = ds1Item2 , dstPath=r'E:/new_data_dump//'))

 ## is there a better way to get them to start in unison than this?
    ds1.start()
    ds2.start()
    ds3.start()
    ds4.start()
    ds5.start()

 ## how do I know all processes are finished before moving on?
    page = +ds1+'\n' \
           +ds2+'\n' \
           +ds3+'\n' \
           +ds4+'\n' \
           +ds5+'\n' 

    writeFile(r'E:/new_data_dump/',filename+'.txt',page)

Answer 1

I usually keep my "processes" in a list.

import multiprocessing

plist = []
for i in range(0, 5):
    # pass the function itself plus its arguments; do not call it here
    p = multiprocessing.Process(target=getDS2, args=(id,))
    plist.append(p)

for p in plist:
    p.start()

# ... do stuff ...

for p in plist:
    p.join()  # <---- this will wait for each process to finish before continuing
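
To make the start/join pattern above concrete, here is a minimal runnable sketch. The `work` function, its delays, and the job names are hypothetical stand-ins for the getDS* helpers in the question, which I cannot see. The `if __name__ == "__main__":` guard matters on Windows, where child processes re-import the main module.

```python
import multiprocessing
import time


def work(name, delay):
    # Hypothetical stand-in for one of the getDS* queries.
    time.sleep(delay)
    print(f"{name} finished")


def make_processes(jobs):
    # Build (but do not start) one Process per (name, delay) pair.
    return [multiprocessing.Process(target=work, args=job) for job in jobs]


if __name__ == "__main__":
    plist = make_processes([("ds1", 0.2), ("ds2", 0.1), ("ds3", 0.3)])
    for p in plist:
        p.start()  # returns immediately, so all children run concurrently
    for p in plist:
        p.join()   # blocks until that child has exited
    # Only reached after every child is done; 0 means a clean exit.
    print([p.exitcode for p in plist])
```

Calling start() in a loop is about as close to "in unison" as you can get: each call returns right away, so the children overlap. Anything placed after the join() loop runs only once all of them have finished.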

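I also think you have a problem with how you are creating the processes. "target" should be a function, not the result of calling a function as you have it (unless your function returns a function).

It should look like this: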

p = Process(target=f, args=('bob',))

where target is the function and args is a tuple of the arguments to pass to it, like so:

def f(name):
    print(name)
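
One thing to watch for: a Process does not hand the target's return value back to the parent, so the `page = ...` step in the question still needs some way to collect results. As a sketch of one option, this uses `concurrent.futures.ProcessPoolExecutor` from the standard library, which is a different tool from the bare Process objects above. `get_ds`, `build_page`, `fetch_page`, and the IDs are all made up for illustration; the real getDS* helpers may return something other than strings.

```python
from concurrent.futures import ProcessPoolExecutor


def get_ds(n, id_):
    # Hypothetical stand-in for getDS1..getDS5; returns text for the page.
    return f"dataset {n} for id {id_}"


def build_page(parts):
    # Join the per-dataset results into one block, one result per line.
    return "\n".join(parts) + "\n"


def fetch_page(id_, executor):
    # submit() takes the function and its arguments separately,
    # much like target= and args= on Process.
    futures = [executor.submit(get_ds, n, id_) for n in range(1, 6)]
    # result() blocks until that job is done, so the page is only built
    # once all five queries have finished.
    return build_page([f.result() for f in futures])


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=5) as executor:
        for id_ in [101, 102]:  # made-up IDs
            print(fetch_page(id_, executor))
```

The pool is created once and reused for every ID, which avoids paying the process start-up cost on each pass through the loop.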
