Parallelization and resource allocation with SLURM bash script

Question

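I have access to an HPC environment where task management and resource allocation are controlled by the SLURM batch job system. However, I have not found the right configuration for using the allocated resources efficiently from within R. I tried allocating 20 CPUs to a single task in SLURM and using the plan(multicore) function of the future package in R. After test runs with different numbers of allocated CPUs, the efficiency statistics suggested that with these settings only one of the allocated CPUs was used during the test runs.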

The SLURM bash script is shown below:

#!/bin/bash
#SBATCH --job-name=pointsToRaster
#SBATCH --account=project_num
#SBATCH --time=00:10:00
#SBATCH --output=output_%j.txt
#SBATCH --error=error_%j.txt
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=20
#SBATCH --mem-per-cpu=15G
#SBATCH --partition=small

#A 10 MINUTE LONG TEST RUN

#load module
module load r-env-singularity

# Bind threads to individual cores
export OMP_PROC_BIND=true

#Run script
srun singularity_wrapper exec Rscript --no-save pointClouds.R

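This bash script allocates the resources and executes the script pointClouds.R. In the R script, availableCores() confirms that the reserved CPUs are found, and supportsMulticore() confirms that the environment and settings support multicore processing. The script contains the following: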

#set working directory
setwd(dir = "/scratch/project_2001456/lasFiles")

#load packages
library(sf)
library(sp)
library(raster)
library(rgdal)
library(lidR)
library(future)

####### SET COMPUTATIONAL CONFIGURATIONS ##########

# Parallelization settings:
print(availableCores())    # confirm that the reserved CPUs are found
print(supportsMulticore()) # confirm that the environment and settings support multicore processing
plan(multicore)            # strategy for resolving futures (forked R processes)

# From here on, I have split one .las file into multiple chunks by their spatial extent. The aim is to process all the chunks in parallel, but the process seems to use only one of the allocated CPUs.

Any help on how to allocate resources appropriately with a SLURM bash script and use them efficiently for parallel processing in R?

Answer 1

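The issue was found by the HPC service provider. For an unknown reason, the OMP_PLACES=cores variable, which should bind threads/processes to specific cores, appeared to bind all processes to a single core, and only when running multicore R jobs. The issue was solved by rebuilding the r-environment singularity-container.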
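A symptom like this (all processes pinned to one core) can in principle be spotted by comparing the CPU affinity mask of a process with the number of CPUs SLURM allocated. The following is only a minimal sketch of such a check, not what the service provider actually ran. It assumes a Linux node where /proc/self/status is readable; the helper name count_cpus is made up for illustration.

```bash
#!/bin/bash
# Sketch: compare the CPUs this process is allowed to run on with the
# number of CPUs requested from SLURM. Linux only (reads /proc).

# Hypothetical helper: count the CPUs in a kernel cpu-list string
# such as "0-3,8,10-11".
count_cpus() {
    local list="$1" part lo hi total=0
    local IFS=','
    for part in $list; do
        lo="${part%-*}"   # text before the dash, or the whole item
        hi="${part#*-}"   # text after the dash, or the whole item
        total=$(( total + hi - lo + 1 ))
    done
    echo "$total"
}

if [ -r /proc/self/status ]; then
    # Affinity mask inherited from this shell, as reported by the kernel.
    allowed="$(awk '/^Cpus_allowed_list:/ {print $2}' /proc/self/status)"
    # SLURM sets SLURM_CPUS_PER_TASK when --cpus-per-task is given;
    # fall back to 1 when run outside a job.
    requested="${SLURM_CPUS_PER_TASK:-1}"
    usable="$(count_cpus "$allowed")"
    echo "requested=${requested} usable=${usable} allowed=${allowed}"
    if [ "$usable" -lt "$requested" ]; then
        echo "WARNING: fewer usable cores than requested, check CPU binding" >&2
    fi
fi
```

Note that the mask reported is that of the process running the check. Since the binding problem described above appeared only in multicore R jobs, a check like this would probably need to be started from inside the R session (for example through system()) to show what the forked workers inherit.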

parallel-processing

r

slurm