当每个进程需要多个线程时优化 python3 多处理

Question

我有一个包含 12 个项目的列表（每个项目都是一个样本），我需要使用一个函数来分析这些样本中的每一个。该函数是外部管道的包装器，需要四个线程才能运行。一些伪代码：

def my_wrapper(sample, other_arg):
    cmd = 'external_pipeline --threads 4 --sample {0} --other {1}'.format(sample, other_arg)
    subprocess.Popen(shlex.split(cmd),stderr=subprocess.PIPE,stdout=subprocess.PIPE).communicate()

以前，我运行每个样本的函数都使用一个循环连续运行，但效率相对较低，因为我的 CPU 有 20 个核心。示例代码：

sample_list = ['sample_'+str(x) for x in range(1,13)]
for sample in sample_list:
    my_wrapper(sample)

我试过使用多处理来提高效率。我过去使用 multiprocessing 中的 starmap 函数成功地做到了这一点。示例：

with mp.Pool(mp.cpu_count()) as pool:
    results = pool.starmap(my_wrapper, [(sample, other_arg) for sample in sample_list])

当我调用的函数每个进程只需要 1 thread/core 时，这种方法以前很有效。但是，在我目前的情况下，它似乎并没有像我天真地 expect/hope 那样工作。有12个sample，每个sample需要4个thread来分析，但是我一共只有20个thread。因此，我一次 expect/hope 5 个样本运行（5 个样本 * 每个样本 4 个线程 = 总共 20 个线程）。相反，所有样本似乎都是同时分析的，使用了所有 20 个线程，尽管需要 48 个线程才能有效。

我如何有效地运行这些样本，以便只有 5 个运行并行（每个 processes/jobs 使用 4 个线程）？我是否需要指定块大小，或者我用这个想法咆哮错误的树？

对模糊的标题和 post 内容表示歉意，我不确定如何更好地措辞！

Answer 1

限制多处理池中的核心数量将在运行封装您的包装器时将一些核心免费保存到运行。这将为您进行分块：

with mp.Pool(mp.cpu_count() / num_cores_used_in_wrapper) as pool:
    results = pool.starmap(my_wrapper, [(sample, other_arg) for sample in sample_list])

当每个进程需要多个线程时优化 python3 多处理

Optimising python3 multiprocessing when each process requires multiple threads

multiprocessing

python-3.x

starmap