Propagate a mount from child namespace to the parent namespace?

How can a mount created in a child mount namespace be propagated to the parent namespace?

Details

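I'm trying to build a tool that uses overlayfs to allow writes to a read-only directory. The tricky part is that I want any user to be able to use it without root privileges. I was hoping this could be done with mount namespaces: provided an administrator has mounted a shared directory, any user should be able to create an overlay under that tree that is visible from the parent namespace (so any of the user's login shells can see the overlay mount).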

Here is what I tried, which did not work:

# admin creates a shared tree for users to mount under
sudo mkdir /overlays
# bind mount over itself with MS_REC | MS_SHARED
sudo mount --bind --rshared /overlays /overlays

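Suppose a user wants to create an overlay on top of /some/readonly/dir. They would create /overlays/user/{upper,work,mnt}. I expected them to be able to mount an overlay under the /overlays directory, and have it propagate, using the following code.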

// user_overlay.c
#define _GNU_SOURCE
#include <sched.h>

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

int child(void *args)
{
    (void)args;

    // mount(2) returns an int: 0 on success, -1 on error
    int ret = mount("overlay", "/overlays/user/mnt", "overlay", 0,
                    "lowerdir=/some/readonly/dir,"
                    "upperdir=/overlays/user/upper,"
                    "workdir=/overlays/user/work");
    if (ret == -1) {
        perror("Failed to mount overlay");
        exit(1);
    }

    // Expose the mount to the parent namespace
    ret = mount("none", "/overlays/user/mnt", NULL, MS_SHARED, NULL);
    if (ret == -1) {
        perror("Failed to mark mount as shared");
        exit(1);
    }

    // Exec bash so I can ensure that the mnt was created
    // though in practice I would just daemonize this proc
    // such that the mount is visible in the parent
    // until this proc is killed
    char *newargv[] = { "/bin/bash", NULL };

    execv("/bin/bash", newargv);
    perror("exec");
    exit(EXIT_FAILURE);
}

int main()
{
    // The stack grows downward, so clone() gets a pointer to its top
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) {
        perror("malloc");
        exit(1);
    }

    pid_t p = clone(child, stack + STACK_SIZE, CLONE_NEWNS | CLONE_NEWUSER | SIGCHLD, NULL);
    if (p == -1) {
        perror("clone");
        exit(1);
    }

    // Wait until the bash proc in the child finishes
    waitpid(p, NULL, 0);
    return 0;
}

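Running gcc user_overlay.c -o user_overlay && ./user_overlay does mount the overlay inside the child process, but /overlays/user/mnt is not propagated to the parent. Changes to /overlays/user/upper, however, are visible to both the parent and the child.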

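What you are trying to achieve appears to be impossible, at least with the approach above. You want to grant unprivileged users the ability to mount by creating a new user namespace with CLONE_NEWUSER. However, quoting mount_namespaces(7) (emphasis mine):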

Restrictions on mount namespaces
Note the following points with respect to mount namespaces:
* A mount namespace has an owner user namespace. A mount namespace whose owner user namespace is different from the owner user namespace of its parent mount namespace is considered a less privileged mount namespace.

* When creating a less privileged mount namespace, shared mounts are reduced to slave mounts. (Shared and slave mounts are discussed below.) This ensures that mappings performed in less privileged mount namespaces will not propagate to more privileged mount namespaces.

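This means the mounts in your new namespace actually have the slave propagation type, not shared as you expected, so mount events there do not propagate to the parent mount namespace. Marking the new overlay mount MS_SHARED inside the child does not change this: it only places that mount in a new peer group that has no member in the parent namespace.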
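One way to see this for yourself is to look at the optional fields in /proc/self/mountinfo, where a shared mount carries a shared:N tag and a slave mount carries a master:N tag (see proc(5)). The following is only a minimal sketch of such a check; the helper names propagation_of and print_propagation are made up for illustration and are not part of any library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: classify one mountinfo record by its optional
 * fields, which sit between field 6 (mount options) and the "-" separator. */
const char *propagation_of(const char *line)
{
    int shared = 0, slave = 0, unbindable = 0;
    int field = 0;
    const char *p = line;

    while (*p) {
        while (*p == ' ')
            p++;
        size_t len = strcspn(p, " \n");
        if (len == 0)
            break;
        field++;
        if (field > 6) {
            if (len == 1 && *p == '-')
                break;                      /* end of optional fields */
            if (strncmp(p, "shared:", 7) == 0)
                shared = 1;
            else if (strncmp(p, "master:", 7) == 0)
                slave = 1;
            else if (len == 10 && strncmp(p, "unbindable", 10) == 0)
                unbindable = 1;
        }
        p += len;
    }

    if (shared && slave)
        return "shared+slave";
    if (shared)
        return "shared";
    if (slave)
        return "slave";
    if (unbindable)
        return "unbindable";
    return "private";
}

/* Hypothetical helper: print the propagation type of every record in
 * the stream whose mount point (field 5) equals mount_point.
 * Returns the number of matching records. */
int print_propagation(FILE *in, const char *mount_point)
{
    char line[8192];
    char mp[4096];
    int matches = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (sscanf(line, "%*d %*d %*s %*s %4095s", mp) == 1 &&
            strcmp(mp, mount_point) == 0) {
            printf("%s: %s\n", mp, propagation_of(line));
            matches++;
        }
    }
    return matches;
}
```

Calling print_propagation() on a stream opened from /proc/self/mountinfo with "/overlays" as the mount point, once in the parent and once at the start of child(), should let you compare the two. Going by the man page text quoted above, I would expect the parent to report shared and the child to report slave. If findmnt is installed, findmnt -o TARGET,PROPAGATION /overlays gives similar information without any code.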