Python ValueError: Pool not running in Async Multiprocessing

I have a simple piece of code:

path = [filepath1, filepath2, filepath3]

def umap_embedding(filepath):
    file = np.genfromtxt(filepath,delimiter=' ')
    if len(file) > 20000:
        file = file[np.random.choice(file.shape[0], 20000, replace=False), :]
    neighbors = len(file)//200

    if neighbors >= 2:
        neighbors = neighbors
    else:
        neighbors = 2

    embedder = umap.UMAP(n_neighbors=neighbors,
                         min_dist=0.1,
                         metric='correlation', n_components=2)
    embedder.fit(file)
    embedded = embedder.transform(file)
    name = 'file'
    np.savetxt(name,embedded,delimiter=",")

if __name__ == '__main__':
    p = Pool(processes = 20)
    start = time.time()
    for filepath in path:
        p.apply_async(umap_embedding, [filepath])
        p.close()
        p.join()

    print("Complete")
    end = time.time()
    print('total time (s)= ' + str(end-start))

When I run it, the console reports this error:

Traceback (most recent call last):
  File "/home/cngc3/CBC/parallel.py", line 77, in <module>
    p.apply_async(umap_embedding, [filepath])
  File "/home/cngc3/anaconda3/envs/CBC/lib/python3.6/multiprocessing/pool.py", line 355, in apply_async
    raise ValueError("Pool not running")
ValueError: Pool not running

I tried to find a solution to this problem on Stack Overflow and Google, but found no related questions. Thanks for your help.

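p.close() and p.join() must be placed after the for loop, not inside it. Otherwise the pool is closed during the first iteration of the loop and no longer accepts new jobs in the second iteration, which is what raises ValueError: Pool not running.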
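
A minimal sketch of the corrected structure. The `work` function and `run_all` helper are hypothetical stand-ins (assumptions for illustration, not part of the original code); `work` takes the place of `umap_embedding` so the example runs without UMAP or input files. Every job is submitted first, and only then is the pool closed and joined:

```python
from multiprocessing import Pool
import time


def work(item):
    # Hypothetical stand-in for umap_embedding(filepath).
    return item * item


def run_all(items, processes=2):
    p = Pool(processes=processes)
    # Submit every job first and keep the AsyncResult handles.
    results = [p.apply_async(work, [item]) for item in items]
    # Only after the loop: stop accepting new jobs, then wait for the workers.
    p.close()
    p.join()
    # get() returns each result and re-raises any exception from a worker.
    return [r.get() for r in results]


if __name__ == '__main__':
    start = time.time()
    print(run_all([1, 2, 3]))
    print('total time (s)= ' + str(time.time() - start))
```

Keeping the handles returned by apply_async and calling get() on them is optional, but it is probably worth doing here: an exception raised inside a worker is otherwise not shown, so a failing call would pass silently.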