What does the status "CG" mean in SLURM?
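On a SLURM cluster, one can use squeue to get information about jobs on the system.
I know that "R" means running and "PD" means pending, but what is "CG"?
From experience, I understand it to mean "canceling" or "failing", but does "CG" apply both when a job succeeds and when it fails? What does the G stand for?
"CG" stands for "completing". It occurs when a job cannot be terminated, probably because of an I/O operation.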
More details are available in the Slurm documentation.
I found this in the 'squeue' section of the Slurm troubleshooting guide:
state
Job state, extended form: PENDING, RUNNING, STOPPED, SUSPENDED,
CANCELLED, COMPLETING, COMPLETED, CONFIGURING, FAILED, TIMEOUT,
PREEMPTED, NODE_FAIL, REVOKED and SPECIAL_EXIT. See the JOB STATE
CODES section below for more information. (Valid for jobs only)
statecompact
Job state, compact form: PD (pending), R (running), CA (cancelled),
CF(configuring), CG (completing), CD (completed), F (failed), TO
(timeout), NF (node failure), RV (revoked) and SE (special exit
state). See the JOB STATE CODES section below for more information.
(Valid for jobs only)
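If you want to decode the compact codes in a script, the mapping quoted above can be written down as a small lookup table. This is only a sketch: the function names `expand_state` and `summarize` and the sample job list are hypothetical, and the table contains only the codes that appear in the excerpt (STOPPED, SUSPENDED and PREEMPTED are left out because the excerpt gives no compact form for them).

```python
# Compact-to-extended job state names, transcribed from the squeue excerpt above.
COMPACT_TO_EXTENDED = {
    "PD": "PENDING",
    "R": "RUNNING",
    "CA": "CANCELLED",
    "CF": "CONFIGURING",
    "CG": "COMPLETING",
    "CD": "COMPLETED",
    "F": "FAILED",
    "TO": "TIMEOUT",
    "NF": "NODE_FAIL",
    "RV": "REVOKED",
    "SE": "SPECIAL_EXIT",
}


def expand_state(code):
    """Return the extended state name for a compact code, or 'UNKNOWN'."""
    return COMPACT_TO_EXTENDED.get(code.strip().upper(), "UNKNOWN")


def summarize(squeue_text):
    """Count jobs per extended state from lines of '<jobid> <compact state>'."""
    counts = {}
    for line in squeue_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        state = expand_state(fields[1])
        counts[state] = counts.get(state, 0) + 1
    return counts


# Hypothetical sample, shaped like the output of: squeue -h -o "%i %t"
sample = """\
1001 R
1002 PD
1003 CG
1004 CG
"""

print(expand_state("CG"))
print(summarize(sample))
```

Here `%i` is the job ID and `%t` is the compact state; `%T` would make squeue print the extended form directly, so the table is mainly useful when you only have the two-letter codes in a log.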