HPC cluster: select the number of CPUs and threads in SLURM sbatch
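The terminology used in the sbatch man page can be a bit confusing, so I want to make sure I am setting the options correctly. Suppose I have a task to run on a single node with N threads. Am I correct to assume that I would use --nodes=1 and --ntasks=N?
I am used to thinking in terms of, for example, pthreads creating N threads within a single process. Is that what they call "cores" or "cpus per task"? CPUs and threads are not the same thing in my mind.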
It depends on the kind of parallelism you are using, distributed or shared memory:
--ntasks=#: Number of "tasks" (use with distributed parallelism).
--ntasks-per-node=#: Number of "tasks" per node (use with distributed parallelism).
--cpus-per-task=#: Number of CPUs allocated to each task (use with shared memory parallelism).
So for a single process that creates N threads, the request is --ntasks=1 --cpus-per-task=N rather than --ntasks=N.
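As a minimal sketch of the shared memory case, a batch script for one process with N threads might look like the following. The value 8 and the program name my_threaded_app are placeholders, not from the original post; SLURM_CPUS_PER_TASK is the environment variable Slurm sets when --cpus-per-task is given, and OMP_NUM_THREADS only matters if the program uses OpenMP.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8   # N = 8 is an arbitrary example value

# Slurm exports SLURM_CPUS_PER_TASK when --cpus-per-task is set.
# The fallback of 1 lets the script also run outside a Slurm allocation.
threads="${SLURM_CPUS_PER_TASK:-1}"

# Only meaningful for OpenMP programs; a pthreads program would need
# the thread count passed some other way (for example, as an argument).
export OMP_NUM_THREADS="$threads"

echo "running 1 task with ${threads} thread(s)"
# ./my_threaded_app "$threads"   # hypothetical program, left commented out
```

Reading the thread count from the environment rather than hard-coding it keeps the script in sync with whatever --cpus-per-task value is actually requested.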
From this question: if every node has 24 cores, is there any difference between these commands?
sbatch --ntasks 24 [...]
sbatch --ntasks 1 --cpus-per-task 24 [...]
Answer (by Matthew Mjelde):
Yes, there is a difference between those two submissions. You are correct that usually ntasks is for MPI and cpus-per-task is for multithreading, but let's look at your commands.
For your first example, sbatch --ntasks 24 [...] will allocate a job with 24 tasks. Each of these tasks gets only 1 CPU, but the tasks may be split across multiple nodes. So you get a total of 24 CPUs, possibly spread across multiple nodes.
For your second example, sbatch --ntasks 1 --cpus-per-task 24 [...] will allocate a job with 1 task and 24 CPUs for that task. Thus you will get a total of 24 CPUs on a single node.
In other words, a task cannot be split across multiple nodes. Therefore, using --cpus-per-task will ensure the CPUs are allocated on the same node, while using --ntasks can and may allocate them on multiple nodes.
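The arithmetic behind the two submissions can be sketched with a small helper. The function describe_request is hypothetical (it is not a Slurm command); it only restates the reasoning above: the total CPU count is ntasks times cpus-per-task, and because a task is never split across nodes, the job can span at most as many nodes as it has tasks.

```shell
#!/bin/bash
# describe_request NTASKS CPUS_PER_TASK
# Hypothetical helper: prints the total CPU count and the largest number
# of nodes the request could span (one node per task at most).
describe_request() {
  ntasks=$1
  cpus_per_task=$2
  total=$(( ntasks * cpus_per_task ))
  echo "ntasks=${ntasks} cpus-per-task=${cpus_per_task} total_cpus=${total} max_nodes=${ntasks}"
}

describe_request 24 1   # ntasks=24 cpus-per-task=1 total_cpus=24 max_nodes=24
describe_request 1 24   # ntasks=1 cpus-per-task=24 total_cpus=24 max_nodes=1
```

Both requests ask for 24 CPUs; only the second guarantees they sit on one node.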
Another good Q&A from CÉCI's support website: suppose you need 16 cores. Here are some use cases:
- you use MPI and do not care about where those cores are distributed:
--ntasks=16
- you want to launch 16 independent processes (no communication):
--ntasks=16
- you want those cores to spread across distinct nodes:
--ntasks=16 and --ntasks-per-node=1
or --ntasks=16 and --nodes=16
- you want those cores to spread across distinct nodes and no interference from other jobs:
--ntasks=16 --nodes=16 --exclusive
- you want 16 processes to spread across 8 nodes to have two processes per node:
--ntasks=16 --ntasks-per-node=2
- you want 16 processes to stay on the same node:
--ntasks=16 --ntasks-per-node=16
- you want one process that can use 16 cores for multithreading:
--ntasks=1 --cpus-per-task=16
- you want 4 processes that can use 4 cores each for multithreading:
--ntasks=4 --cpus-per-task=4
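The last use case (4 processes with 4 cores each) could be written as a batch script along these lines. This is a sketch under assumptions: my_hybrid_app is a placeholder program name, and the launch line is left commented out because how processes are started (srun, mpirun, and so on) depends on the site and the application. SLURM_NTASKS and SLURM_CPUS_PER_TASK are the variables Slurm sets for the corresponding options.

```shell
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=4

# Both variables are set by Slurm inside the job; the fallbacks of 1
# let the script run outside an allocation for a quick check.
ntasks="${SLURM_NTASKS:-1}"
cpus="${SLURM_CPUS_PER_TASK:-1}"

# Give each process as many threads as it has CPUs (OpenMP programs only).
export OMP_NUM_THREADS="$cpus"

echo "launching ${ntasks} process(es) with ${cpus} thread(s) each"
# srun --cpus-per-task="$cpus" ./my_hybrid_app   # hypothetical program
```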