Difference in Slurm Job Array and Job Step performance

I am running a large set of parallel jobs in Slurm (around 1000), each of which must be allocated a single CPU. Reading the Slurm documentation, I found this:

Best Practices, Large Job Counts

Consider putting related work into a single Slurm job with multiple job steps both for performance reasons and ease of management. Each Slurm job can contain a multitude of job steps and the overhead in Slurm for managing job steps is much lower than that of individual jobs.

Job arrays are an efficient mechanism of managing a collection of batch jobs with identical resource requirements. Most Slurm commands can manage job arrays either as individual elements (tasks) or as a single entity (e.g. delete an entire job array in a single command).

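This seems to imply that a single job with many job steps (for example, one batch script with many srun calls, each requesting the same resources) performs better than a job array. My problem is that I do not want to block resources for other users. If I run one job with 1000 srun calls, then once it starts running it will keep a large number of processors blocked. If I run a job array with 1000 jobs instead, those jobs only use processors as they become available in the queue, which I think is more flexible.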

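My question is: is the overhead of running a job array over job steps significant enough for me to worry about this? If the overhead is large, is there any alternative? How do people usually deal with this sort of situation? I have seen people using GNU parallel with Slurm in some circumstances; does it provide any advantage? Is this a possible use case?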

Is the overhead of running a job array over job steps significant enough for me to worry about this?

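That depends entirely on the duration of a step. Depending on the cluster, scheduling and starting a job can take several tens of seconds (setting up the environment, creating temporary directories, doing some cleanup, and possibly running sanity or health checks). So if a step takes less than a few minutes, you definitely want to 'pack' them into one job. Otherwise you spend as much time organising the computation as you spend computing.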

By contrast, if a step runs close to the maximum wall time allowed by the cluster, a job array is the better choice.
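To put rough numbers on this trade-off, here is a small Bash sketch of the arithmetic. The 30-second startup cost is an assumed figure for illustration only (the real value depends on the cluster), and `overhead_percent` is a hypothetical helper, not a Slurm command.

```bash
#!/bin/bash
# Hypothetical helper: percentage of a job's wall time that goes to
# per-job startup rather than computing. Integer arithmetic, rounds down.
overhead_percent() {
    local step_seconds=$1 startup_seconds=$2
    echo $(( 100 * startup_seconds / (step_seconds + startup_seconds) ))
}

# Assumed startup cost of 30 s per job (illustrative, cluster-dependent).
overhead_percent 60 30       # one-minute step
overhead_percent 600 30      # ten-minute step
overhead_percent 86400 30    # 24-hour step
```

Under these assumed numbers, a one-minute step loses about a third of its wall time to startup, while for a day-long step the overhead is negligible.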

Note that you can also go in between and submit an array of size 10 in which each job runs 100 steps.
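A minimal sketch of that in-between layout, assuming 1000 work items numbered 0 to 999 and a hypothetical program `./process_item` that takes the item index as its only argument. Each array element runs its 100 steps one after another on its single CPU.

```bash
#!/bin/bash
#SBATCH --array=0-9          # 10 array elements, indices 0..9
#SBATCH --ntasks=1           # one task ...
#SBATCH --cpus-per-task=1    # ... on one CPU per array element

STEPS_PER_JOB=100

# First global step index handled by the given array element.
first_step() {
    echo $(( $1 * $STEPS_PER_JOB ))
}

# Last global step index handled by the given array element.
last_step() {
    echo $(( ($1 + 1) * $STEPS_PER_JOB - 1 ))
}

# Launch the steps only when running inside a Slurm array job.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    step=$(first_step "$SLURM_ARRAY_TASK_ID")
    last=$(last_step "$SLURM_ARRAY_TASK_ID")
    while [ "$step" -le "$last" ]; do
        # ./process_item is a hypothetical program taking the step index.
        srun --ntasks=1 ./process_item "$step"
        step=$(( step + 1 ))
    done
fi
```

Submitted with `sbatch`, array element 0 would handle items 0 to 99 and element 9 items 900 to 999, so the scheduler only has to start 10 jobs instead of 1000.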

Is there any alternative if the overhead is large?

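You can use a meta-scheduler and a technique sometimes called glide-in, in which you submit a job that does nothing but listen for tasks fed to it by a workflow organiser. See for example [=15=]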
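To illustrate the idea only, here is a toy worker in Bash. It assumes a hypothetical file-based queue in which every `*.task` file in a shared directory holds one shell command. This stands in for the workflow organiser and is not how any particular meta-scheduler works. Unlike a real glide-in worker, it exits once the queue is empty instead of waiting for more tasks.

```bash
#!/bin/bash
# Toy worker for a hypothetical file-based task queue: each *.task file in
# the queue directory holds one shell command. Tasks should write their
# results to files, because the worker prints its task count on stdout.
run_worker() {
    local queue_dir=$1 count=0 task claimed
    for task in "$queue_dir"/*.task; do
        [ -e "$task" ] || continue          # the glob matched nothing
        claimed="$task.claimed.$$"
        # A rename within one filesystem is atomic, so when several workers
        # race for the same task file only one mv succeeds.
        if mv "$task" "$claimed" 2>/dev/null; then
            sh "$claimed"
            rm -f "$claimed"
            count=$(( count + 1 ))
        fi
    done
    echo "$count"                           # number of tasks this worker ran
}
```

A Slurm job would then run something like `run_worker /path/to/queue` as its only payload, so the scheduler sees one long job while the small tasks are handed out outside of Slurm.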

How do people usually deal with this sort of situations?

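They ask the system administrators for guidance on what they prefer to manage. Sometimes having many small jobs increases the overall utilisation of the cluster, which is good; sometimes having many small jobs degrades the performance of the scheduler.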

I've seen people using GNU parallel with slurm in some circumstances, does it provide any advantage?

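GNU Parallel has very powerful tools for generating job steps, for example computing all pairwise combinations of the values of two parameters, or doing advanced globbing on files.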

It also lets you replace the several lines of Bash that handle launching all the steps with a single line.
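As a small illustration of both points, the sketch below generates every combination of two parameter lists, first with plain Bash loops and then with GNU Parallel. The parameter values and function names are made up for the example.

```bash
#!/bin/bash
# Plain Bash: nested loops printing every combination of two parameter lists.
all_pairs_loops() {
    local a b
    for a in 0.1 0.2 0.5; do
        for b in 10 20; do
            echo "$a $b"
        done
    done
}

# GNU Parallel: the same six combinations from one line. Each ':::' starts
# an input source; {1} and {2} refer to the first and second source, and
# -k keeps the output in input order. Requires GNU Parallel to be installed.
all_pairs_parallel() {
    parallel -k echo {1} {2} ::: 0.1 0.2 0.5 ::: 10 20
}

all_pairs_loops
```

Inside a batch script, `echo` would typically be replaced by the real step, for instance `srun --ntasks=1 --exclusive ./process_item {1} {2}` with `-j "$SLURM_NTASKS"` to cap the number of concurrent steps, where `./process_item` is again a placeholder.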

Is this a possible use case?

Yes, you can use it, but it will not help you decide on your main question.
