Why is my Python parallel process running too many times?

Question

I have written a function that performs a stochastic simulation of a system of chemical reactions.

def gillespie_tau_leaping(start_state, LHS, stoch_rate, state_change_array): # inputs are a series of arrays
    t = SimulationTimer()
    t.start()
    #update molecule numbers for each species in model
    #update current time of the system 
    t.stop()
    print(f"Accumulated time: {t.get_accumulated_time():0.10f} seconds")
    return popul_num_all, tao_all, t.get_accumulated_time() # popul_num_all is an array of changing molecule numbers over time, tao_all is the evolution of time throughout the simulation

popul_num_all, tao_all, accumulated_elapsed_time = gillespie_tau_leaping(start_state, LHS, stoch_rate, state_change_array)  # Call the function to make variables accessible for plotting.

I have now written the following code to run the gillespie_tau_leaping function multiple times using Python's multiprocessing Pool class.

if __name__ == '__main__':
    with Pool() as p:
        pool_results = p.starmap(gillespie_tau_leaping, [(start_state, LHS, stoch_rate, state_change_array) for i in range(4)])
        p.close()
        p.join()   
        total_time = 0.0
        for tuple_results in pool_results:
            total_time += tuple_results[2]
    print(f"Total time:\n{total_time}") 


def gillespie_plot(tao_all, popul_num):
    fig, (ax1, ax2) = plt.subplots(1, 2)
    ax1.plot(tao_all, popul_num_all[:, 0], label='S', color= 'Green')
    ax1.legend()
    for i, label in enumerate(['T', 'U']):
        ax2.plot(tao_all, popul_num_all[:, i+1], label=label)
    ax2.legend()
    plt.tight_layout()
    plt.show()
    return fig

gillespie_plot(tao_all, popul_num_all)

gillespie_plot plots the resulting molecule numbers, popul_num_all, as they change over time, tao_all.

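However, when I run this code it simulates gillespie_tau_leaping 9 times. The first run happens because I call the function to make some variables accessible. The next 8 simulations I don't understand: the first 4 simulate the system and plot the graph but don't return the total_time of the parallel simulations. The second 4 don't plot the graph but do return the total_time of the parallel simulations.

After the initial function call I was only expecting/wanting 4 simulations, with the graphs plotted and total_time returned.

What am I doing wrong?

Cheers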

Answer 1

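If you run code in a subprocess, the module (file) is executed again in that subprocess. This happens when multiprocessing starts its workers with the spawn start method, which is the default on Windows and macOS. if __name__ == '__main__': is used as a guard to prevent some code from running there.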

In your case gillespie_plot(tao_all, popul_num_all) is not guarded, so it runs in every subprocess even though it should run only once.
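The effect of the guard can be reproduced without starting any worker processes. The sketch below is only an illustration: it writes a tiny hypothetical script (demo_script.py, not part of the question) to a temporary directory and executes it twice with runpy.run_path, once under the name "__main__" and once under "__mp_main__". The latter is the name CPython's spawn machinery gives the re-imported main module, which is an implementation detail assumed here rather than something stated in the question.

```python
import contextlib
import io
import os
import runpy
import tempfile

# Hypothetical stand-in for the asker's file: one unguarded statement
# and one statement behind the __main__ guard.
SCRIPT = '''
print("module-level code runs")
if __name__ == "__main__":
    print("guarded code runs")
'''


def run_as(run_name):
    """Execute SCRIPT under the given module name and return the printed lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_script.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            runpy.run_path(path, run_name=run_name)
    return buf.getvalue().splitlines()


# What the main process sees when the file is run directly.
main_output = run_as("__main__")
# What a spawned worker sees when it re-imports the file (assumed name).
child_output = run_as("__mp_main__")

print(main_output)
print(child_output)
```

Only the unguarded print appears in the second run, which is the same reason the unguarded gillespie_tau_leaping and gillespie_plot calls execute once more in every worker.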

Add the same if condition as in the code block above to prevent this:

[…]
if __name__ == '__main__':
    gillespie_plot(tao_all, popul_num_all)
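Whether module-level code is re-executed in the workers depends on how multiprocessing starts them, so it can help to check which start methods the platform offers. The following is a small sketch using only the standard library; the variable names are illustrative.

```python
import multiprocessing as mp

# Start methods supported on this platform; "spawn" exists everywhere.
available_methods = mp.get_all_start_methods()

# The method fixed so far for this process, or None if nothing has been
# chosen yet (allow_none=True avoids fixing the default as a side effect).
fixed_method = mp.get_start_method(allow_none=True)

print("available start methods:", available_methods)
print("start method fixed so far:", fixed_method)
```

With spawn, each worker imports the main file again, so every unguarded statement runs once per worker. Keeping such statements behind the guard makes the script behave the same regardless of the start method in use.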

Answer 2

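Your program breaks down as follows:

1. On startup, the main process defines gillespie_tau_leaping() and then calls it in the line popul_num_all, tao_all, accumulated_elapsed_time = gillespie_tau_leaping(start_state, LHS, stoch_rate, state_change_array).
2. The main process evaluates __name__ == '__main__' as true, so it starts the multiprocessing pool and repeats the following steps 4 times:
   1. The subprocess loads the module, then defines and calls gillespie_tau_leaping() (the same as step 1).
   2. The subprocess evaluates __name__ == '__main__' as false and does not create a new pool.
   3. The subprocess then defines and calls gillespie_plot() using the values from step 2.1.
   4. The subprocess receives a request from the starmap call to run gillespie_tau_leaping(), processes it, and returns the result.
3. The main process receives the results of the starmap call (2.4) and prints the total time.
4. The main process defines and calls gillespie_plot() using the values from step 1.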
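The steps above can be tallied to see where the 9 simulations come from. The helper below is a hypothetical illustration, not part of the original program. It assumes that Pool() started 4 workers (the default is the machine's CPU count, so this matches what the asker observed) and that each worker re-executes the module once.

```python
def count_simulations(workers, tasks, unguarded_calls=1):
    """Count how many times gillespie_tau_leaping runs in total.

    Every unguarded module-level call runs once in the main process and
    once in each worker that re-executes the module; the tasks submitted
    through starmap are added on top.
    """
    import_time_runs = unguarded_calls * (1 + workers)
    return import_time_runs + tasks


# The question's layout: one unguarded call, 4 workers, 4 starmap tasks.
original_total = count_simulations(workers=4, tasks=4)

# With the module-level call removed or guarded, only the tasks remain.
guarded_total = count_simulations(workers=4, tasks=4, unguarded_calls=0)

print(original_total)
print(guarded_total)
```

Under these assumptions the first total is 1 + 4 + 4 = 9, matching the question, and the second is the 4 runs the asker wanted.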

To run your simulation only 4 times, you should do the following:

from multiprocessing import Pool

import matplotlib.pyplot as plt

def gillespie_tau_leaping(start_state, LHS, stoch_rate, state_change_array):
    ...

def gillespie_plot(tao_all, popul_num_all):  # parameter renamed to match the name used in the body
    fig, (ax1, ax2) = plt.subplots(1, 2)
    ax1.plot(tao_all, popul_num_all[:, 0], label='S', color= 'Green')
    ax1.legend()
    for i, label in enumerate(['T', 'U']):
        ax2.plot(tao_all, popul_num_all[:, i+1], label=label)
    ax2.legend()
    plt.tight_layout()
    # plt.show() # this can block
    return fig

if __name__ == '__main__':
    with Pool() as p:
        pool_results = p.starmap(gillespie_tau_leaping, [(start_state, LHS, stoch_rate, state_change_array) for i in range(4)])
        # leaving the with block terminates the pool; starmap has already returned every result by then
    
    total_time = 0.0
    for tuple_results in pool_results:
        total_time += tuple_results[2]
    print(f"Total time:\n{total_time}")

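    # each result is (popul_num_all, tao_all, accumulated_time), in the order the function returns them
    for popul_num_all, tao_all, _total_time in pool_results:
        gillespie_plot(tao_all, popul_num_all)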
    plt.show()

python

parallel-processing

pool

function

multiprocessing
