How to test a user namespace using the clone system call with the CLONE_NEWUSER flag

I am testing a sample from Containerization with LXC that demonstrates user namespaces.

It should print the output of the parent process and the output of the child process running in the new user namespace.

# ./user_namespace
UID outside the namespace is 0
GID outside the namespace is 0
UID inside the namespace is 65534
GID inside the namespace is 65534

However, it only shows the parent's output.

UID outside the namespace is 1000
GID outside the namespace is 1000

Please help me understand why the child process does not print anything.

Code

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sched.h>
#include <signal.h>
#include <sys/wait.h>   /* waitpid */

static int childFunc(void *arg)
{
    printf("UID inside the namespace is %ld\n", (long)geteuid());
    printf("GID inside the namespace is %ld\n", (long)getegid());
    return 0;   /* becomes the child's exit status */
}

static char child_stack[1024*1024];

int main(int argc, char *argv[])
{
    pid_t child_pid;

    /* child_pid = clone(childFunc, child_stack + (1024*1024), CLONE_NEWUSER, 0);*/

    child_pid = clone(&childFunc, child_stack + (1024*1024), CLONE_NEWUSER, 0);

    printf("UID outside the namespace is %ld\n", (long)geteuid());
    printf("GID outside the namespace is %ld\n", (long)getegid());
    waitpid(child_pid, NULL, 0);
    exit(EXIT_SUCCESS);
}

Environment

$ uname -r
3.10.0-693.21.1.el7.x86_64

$ cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
CPE_NAME="cpe:/o:centos:centos:7"


Update

As thejonny's answer points out, the fix is to enable user namespaces. For RHEL/CentOS 7, see "Is it safe to enable user namespaces in CentOS 7.4 and how to do it?":

By default, the new 7.4 kernel restricts the number of user namespaces to 0. To work around this, increase the user namespace limit:
echo 15000 > /proc/sys/user/max_user_namespaces

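Unprivileged user namespaces are probably disabled. Since you don't check the return value of clone, you don't notice the failure. Running the program through strace on my system prints: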

.... startup stuff ...
clone(child_stack=0x55b41f2a4070, flags=CLONE_NEWUSER) = -1 EPERM (Operation not permitted)
geteuid()                               = 1000
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 6), ...}) = 0
brk(NULL)                               = 0x55b4200b8000
brk(0x55b4200d9000)                     = 0x55b4200d9000
write(1, "UID outside the namespace is 100"..., 34UID outside the namespace is 1000
) = 34
getegid()                               = 1000
write(1, "GID outside the namespace is 100"..., 34GID outside the namespace is 1000
) = 34
wait4(-1, NULL, 0, NULL)                = -1 ECHILD (No child processes)
exit_group(0)   = ?

So clone fails with EPERM, no child process is created, and waitpid then fails with ECHILD.

See here for how to enable user namespaces: https://superuser.com/questions/1094597/enable-user-namespaces-in-debian-kernel