Two separate mpirun on a single slurm job

Question

I want to use the two sockets of one node for two separate mpirun invocations within a single job submission, like this:

socket    ---- 0 ----                      ---- 1 ----   
core     0 1 ... 14 15                   0 1 ... 14 15
task   mpirun#1 with 16 process      mpirun#2 with 16 process

I also want to run without multithreading.

So this is what I put in my Slurm header:

#SBATCH --nodes=1 
#SBATCH --sockets-per-node=2 
#SBATCH --cores-per-socket=16 
#SBATCH --threads-per-core=1

Please help me understand what I should put in the []:

mpirun [something1]  python code_1.py &
mpirun [something2]  python code_2.py

Answer 1

You can use -npersocket (or --npersocket <#persocket>) or --map-by in the mpirun command.

man mpirun gives the following:

 -npersocket, --npersocket <#persocket>
              On each node, launch this many processes times the number of
              processor sockets on the node.  The -npersocket option also turns on
              the -bind-to-socket option.  (deprecated in favor of --map-by
              ppr:n:socket)

--map-by <foo>
    Map to the specified object, defaults to socket. Supported options include slot, hwthread, core, L1cache, L2cache, L3cache, socket, numa, board, node, sequential, distance, and ppr. Any object can include modifiers by adding a : and any combination of PE=n (bind n processing elements to each proc), SPAN (load balance the processes across the allocation), OVERSUBSCRIBE (allow more processes on a node than processing elements), and NOOVERSUBSCRIBE. This includes PPR, where the pattern would be terminated by another colon to separate it from the modifiers.

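You could try something like this (-npersocket takes the number of processes per socket as its argument):

mpirun -n 16 -npersocket 16 python code_1.py &
mpirun -n 16 -npersocket 16 python code_2.py
wait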

Or you can try the --map-by ppr:16:socket option, which maps 16 processes to each socket. To prevent oversubscription, also pass --nooversubscribe:

mpirun -n 16 --map-by ppr:16:socket --nooversubscribe python code_1.py &
mpirun -n 16 --map-by ppr:16:socket --nooversubscribe python code_2.py
wait

You can also use -report-bindings to display the resulting bindings.
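As an independent check from inside each launched process, the kernel's view of the allowed CPUs can be printed. The following is only a sketch for Linux nodes: the file name show_affinity.sh and the helper allowed_cpus are made up for illustration, and OMPI_COMM_WORLD_RANK is only set when the launcher is Open MPI.

```shell
# show_affinity.sh (hypothetical file name)
# Print the CPUs this process may run on, as reported by the Linux kernel
# in the Cpus_allowed_list field of /proc/self/status.
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' /proc/self/status
}

# OMPI_COMM_WORLD_RANK is exported by Open MPI for each process it starts;
# under other launchers (or when run by hand) fall back to "?".
echo "rank ${OMPI_COMM_WORLD_RANK:-?}: cpus $(allowed_cpus)"
```

Running it in place of the Python program, for example mpirun -n 16 --map-by ppr:16:socket sh show_affinity.sh, should print one line per rank; if the two mpirun commands report overlapping CPU lists, they are sharing a socket.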

If you would rather use srun, there are many other options. If you are using Intel MPI, you can use the I_MPI_PIN_PROCESSOR_LIST environment variable.

Answer 2

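It turns out that mpirun, on its own, cannot be restricted to the cores of a single socket when more sockets are available.

The solution is to combine numactl with mpirun to impose the desired placement policy: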

mpirun -np 16   numactl --cpunodebind=0  python code_1.py &
mpirun -np 16   numactl --cpunodebind=1  python code_2.py &
wait

This does the job as expected.
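One caveat with the script above: a bare wait returns 0 no matter how the background jobs ended, so the batch job can be reported as successful even if one mpirun failed. A minimal sketch of one way around that follows; run_pair is a hypothetical helper written for this post, not part of Slurm or any MPI implementation.

```shell
# Run two command lines concurrently and succeed only if both succeed.
# Each argument is one complete command line, passed as a single string.
run_pair() {
    local pid1 pid2 rc1=0 rc2=0
    sh -c "$1" & pid1=$!
    sh -c "$2" & pid2=$!
    # "wait PID" returns the exit status of that job,
    # whereas "wait" without arguments always returns 0.
    wait "$pid1" || rc1=$?
    wait "$pid2" || rc2=$?
    [ "$rc1" -eq 0 ] && [ "$rc2" -eq 0 ]
}

# Possible use in the batch script, with the two scripts from the question:
# run_pair "mpirun -np 16 numactl --cpunodebind=0 python code_1.py" \
#          "mpirun -np 16 numactl --cpunodebind=1 python code_2.py" || exit 1
```

With the final || exit 1, the batch script exits with a non-zero status whenever either run fails, which Slurm then records as a failed job.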
