HTCondor change NUM_CPUS based on Idle?

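I want to change the number of CPUs depending on whether someone is working at the machine. I don't want to preempt jobs as described in the manual, just do something like this: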

// condor_config file
if (KeyboardIdle < 10)
    NUM_CPUS = 2
else
    NUM_CPUS = 8
endif

The above fails with: (KeyboardIdle < 10) is not a valid if condition because complex conditionals are not supported

Is there a way to achieve this, or is NUM_CPUS a fixed variable?


Following Greg's answer, the very bottom of my condor_config now looks like this:

NUM_CPUS = 16
START = (SlotID < 8) || (KeyboardIdle > 10)

In theory this should allow only 8 jobs to start, but when I run condor_status myMachine I get:

C:\>condor_status myMachine
Name                       OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@myMachine.cluster  WINDOWS    X86_64 Claimed   Busy      1.210 8186  0+00:00:02
slot2@myMachine.cluster  WINDOWS    X86_64 Claimed   Busy      0.500 8186  0+00:00:03
slot3@myMachine.cluster  WINDOWS    X86_64 Claimed   Busy      2.220 8186  0+00:00:01
slot4@myMachine.cluster  WINDOWS    X86_64 Claimed   Busy      1.500 8186  0+00:00:02
slot5@myMachine.cluster  WINDOWS    X86_64 Claimed   Busy      0.600 8186  0+00:00:02
slot6@myMachine.cluster  WINDOWS    X86_64 Claimed   Busy      0.380 8186  0+00:00:02
slot7@myMachine.cluster  WINDOWS    X86_64 Claimed   Busy      1.940 8186  0+00:00:03
slot8@myMachine.cluster  WINDOWS    X86_64 Claimed   Busy      0.880 8186  0+00:00:02
slot9@myMachine.cluster  WINDOWS    X86_64 Claimed   Busy      1.560 8186  0+00:00:02
slot10@myMachine.cluster WINDOWS    X86_64 Claimed   Busy      0.310 8186  0+00:00:02
slot11@myMachine.cluster WINDOWS    X86_64 Claimed   Busy      2.180 8186  0+00:00:02
slot12@myMachine.cluster WINDOWS    X86_64 Claimed   Busy      1.580 8186  0+00:00:02
slot13@myMachine.cluster WINDOWS    X86_64 Claimed   Busy      0.950 8186  0+00:00:02
slot14@myMachine.cluster WINDOWS    X86_64 Claimed   Busy      1.890 8186  0+00:00:02
slot15@myMachine.cluster WINDOWS    X86_64 Claimed   Busy      0.490 8186  0+00:00:02
slot16@myMachine.cluster WINDOWS    X86_64 Claimed   Busy      1.600 8186  0+00:00:01

               Total Owner Claimed Unclaimed Matched Preempting Backfill  Drain

X86_64/WINDOWS    16     0      16         0       0          0        0      0

         Total    16     0      16         0       0          0        0      0

Any ideas?

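NUM_CPUS is fixed in HTCondor; it cannot vary with the state of the machine. A policy like this is usually implemented through the START expression instead: for however many slots should stay empty, START evaluates to false, so no job can start on them.

Assuming this machine has static slots (the default), the START expression could look something like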

START = (SlotID < 3) || (KeyboardIdle > 10)

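That is, START is always true for slots 1 and 2, and for the remaining slots it is true only when the keyboard has been idle (here, for more than 10 seconds).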
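Applied to the numbers in the question (2 CPUs while someone is at the keyboard, 8 otherwise), a minimal sketch might look like the following. It assumes static slots numbered from 1 and that KeyboardIdle is reported in seconds; the threshold of 10 is simply carried over from the question.

```
# condor_config sketch (assumes static slots numbered from 1)
NUM_CPUS = 8

# Slots 1 and 2 always accept jobs; slots 3 to 8 accept jobs only
# once the keyboard has been idle for more than 10 seconds.
START = (SlotID <= 2) || (KeyboardIdle > 10)
```

The machine still advertises 8 slots at all times; the policy only decides which of them are willing to start a job at a given moment.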

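To be annoyingly pedantic, this only controls whether jobs start on the machine, based on keyboard usage. With the configuration above, a completely idle machine will let itself be filled with jobs, and when the keyboard user returns, those jobs will keep running indefinitely. If you want to preempt those jobs as well, you can add a PREEMPT expression like
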
PREEMPT = (SlotID >= 3) && (KeyboardIdle < 10)
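For the 16-slot machine in the follow-up, the two expressions might be paired as in the sketch below. It assumes static slots and uses the same slot boundary in both expressions, so every slot that START gates on the keyboard is also covered by PREEMPT. Note that `SlotID < 8` matches only slots 1 through 7, so the sketch uses `SlotID <= 8` to keep 8 slots always available.

```
# condor_config sketch (assumes 16 static slots numbered from 1)
NUM_CPUS = 16

# Slots 1 to 8 always accept jobs; slots 9 to 16 only when the
# keyboard has been idle for more than 10 seconds.
START   = (SlotID <= 8) || (KeyboardIdle > 10)

# Jobs on slots 9 to 16 become candidates for preemption as soon as
# the keyboard has been used within the last 10 seconds.
PREEMPT = (SlotID > 8) && (KeyboardIdle < 10)
```

If all 16 slots still show as Claimed, it may be worth checking what value of KeyboardIdle the machine is actually advertising, since START allows every slot whenever that value is above the threshold.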