Restricting swap usage for a process managed via systemd

I am trying to restrict the swap usage of a process using MemorySwapMax, as described in the doc, on Ubuntu 18.04.

Environment

ubuntu@vrni-platform:/usr/lib/systemd/system$ uname -a
Linux vrni-platform 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

ubuntu@vrni-platform:/usr/lib/systemd/system$ systemctl --version
systemd 237
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid

My systemd unit file looks like this:

[Unit]
Description=My service
After=network.target
StartLimitIntervalSec=0
[Service]
Type=simple
Restart=always
RestartSec=1
User=support
MemoryMax=2000M
KillMode=process
MemoryAccounting=true
OOMScoreAdjust=1000
MemorySwapMax=0
ExecStart=/usr/bin/java -cp /home/support -XX:NativeMemoryTracking=summary -Xmx10000m MemoryConsumer 100 200 1

I am trying to disable swap for this process by setting MemorySwapMax to 0. However, there appears to be an issue in systemd with this value that was fixed in systemd 239.

So I also tried setting MemorySwapMax=1M, but this does not seem to limit the swap usage of this systemd service either.

The documentation for MemorySwapMax states:

This setting is supported only if the unified control group hierarchy is used and disables MemoryLimit=.

So, following the check mentioned in this answer, I can see that cgroup v2 is enabled in my setup.

ubuntu@vrni-platform:/tmp/debraj$ sudo mount -t cgroup2 none /tmp/debraj
ubuntu@vrni-platform:/tmp/debraj$ ls -l /tmp/debraj/
total 0
-r--r--r--  1 root root 0 Jul  2 17:13 cgroup.controllers
-rw-r--r--  1 root root 0 Jul  2 17:13 cgroup.max.depth
-rw-r--r--  1 root root 0 Jul  2 17:13 cgroup.max.descendants
-rw-r--r--  1 root root 0 Jun 30 14:42 cgroup.procs
-r--r--r--  1 root root 0 Jul  2 17:13 cgroup.stat
-rw-r--r--  1 root root 0 Jul  2 17:13 cgroup.subtree_control
-rw-r--r--  1 root root 0 Jul  2 17:13 cgroup.threads
drwxr-xr-x  2 root root 0 Jun 30 14:42 init.scope
drwxr-xr-x 87 root root 0 Jul  2 15:05 system.slice
drwxr-xr-x  7 root root 0 Jun 30 15:22 user.slice
ubuntu@vrni-platform:/tmp/debraj$ sudo umount /tmp/debraj

MemoryConsumer.java is shown below:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class MemoryConsumer {
    public static void main(String[] args) throws InterruptedException, IOException {
        int size = Integer.parseInt(args[0]);
        int count = Integer.parseInt(args[1]);
        int sleepMs = Integer.parseInt(args[2]);
        List<ByteBuffer> list1 = new ArrayList<>();
        List<byte[]> list2 = new ArrayList<>();
        long start = System.currentTimeMillis();
        for (int i=0; i<count; i++) {
            list1.add(ByteBuffer.allocateDirect(size*1024*1024));
            //list2.add(new byte [size*1024*1024]);
            long end = System.currentTimeMillis();
            // After iteration i, (i + 1) buffers of `size` MB have been allocated.
            System.out.println("Allocated memory " + ((i + 1) * size) + " MB\n" + (end-start) + " ms");
            Thread.sleep(sleepMs);
        }
    }
}

Can someone point out what might be going wrong here?

This has been answered on the systemd mailing list.

Link1

Reposting the relevant parts here.

Looks like your Ubuntu version is using the "hybrid" cgroup mode by default. Cgroup v2 is indeed enabled in your kernel, but not necessarily in use – in the hybrid mode, systemd still mounts all resource controllers (cpu, memory, etc.) in v1 mode and only sets up its own process tracking in the v2 tree. See findmnt.

You could boot with the systemd.unified_cgroup_hierarchy=1 kernel option to switch everything to cgroups v2, but if you're using container software (docker, podman) make sure those are cgroups v2-compatible.
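
As a rough way to tell the two layouts apart, the filesystem type mounted at /sys/fs/cgroup can be inspected. This is only a sketch: the helper name classify_cgroup_fs is made up for illustration, and it assumes the usual layout where a fully unified setup mounts cgroup2 directly at /sys/fs/cgroup while legacy and hybrid setups mount a tmpfs there.

```shell
#!/bin/sh
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup to a layout name.
# cgroup2fs -> unified hierarchy (controllers managed through cgroup v2)
# tmpfs     -> legacy or hybrid (controllers such as memory still in v1)
classify_cgroup_fs() {
    case "$1" in
        cgroup2fs) echo "unified" ;;
        tmpfs)     echo "legacy-or-hybrid" ;;
        *)         echo "unknown" ;;
    esac
}

# stat -f reports on the filesystem; %T prints its type in human-readable form.
fs_type=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo "none")
echo "cgroup layout: $(classify_cgroup_fs "$fs_type")"
```

If this reports legacy-or-hybrid, the memory controller is most likely still on v1, which would explain why MemorySwapMax has no effect. Running findmnt -t cgroup,cgroup2 shows the same information in more detail.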

Link2

Hello Debraj.

On Thu, Jul 08, 2021 at 05:10:44PM +0530, Debraj Manna wrote:

Linux vrni-platform 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
[...]
GRUB_CMDLINE_LINUX="audit=1 rootdelay=180 nousb net.ifnames=0 biosdevname=0 fsck.mode=force fsck.repair=yes ipv6.disable=1 systemd.unified_cgroup_hierarchy=1"

Even after making these changes, MemorySwapMax is not taking effect.

You also need to add swapaccount=1; swap accounting is enabled by default only since kernel v5.8.
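
After editing GRUB_CMDLINE_LINUX, regenerating the GRUB configuration and rebooting, it may help to confirm that both parameters actually reached the running kernel. The sketch below does that by reading /proc/cmdline; the helper name has_kernel_param is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the command line in $1 contains the exact
# whitespace-delimited parameter $2 (so swapaccount=10 does not match swapaccount=1).
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# /proc/cmdline holds the parameters the running kernel was booted with.
cmdline=$(cat /proc/cmdline 2>/dev/null || echo "")
for param in systemd.unified_cgroup_hierarchy=1 swapaccount=1; do
    if has_kernel_param "$cmdline" "$param"; then
        echo "present: $param"
    else
        echo "missing: $param"
    fi
done
```

With both parameters present, the service's cgroup directory under /sys/fs/cgroup/system.slice/ would be expected to contain a memory.swap.max file holding the configured limit; if that file is absent, swap accounting is probably still disabled.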