Cannot open uid_map for writing from an app with cap_setuid capability set
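While working through the example in user_namespaces(7), I ran into some strange behavior.
What the application does
The application user-ns-ex calls clone(2) with CLONE_NEWUSER, creating a new process in a new user namespace. The parent process writes the mapping (0 1000 1) to the /proc/<pid>/uid_map file and tells the child (via a pipe) that it may proceed. The child then execs bash.
I have copied the source code here.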
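For reference, the parent-side step described above can be sketched roughly as follows. This is a minimal sketch and not the actual user-ns-ex source; the helper names uid_map_path and write_map are my own, chosen for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "/proc/<pid>/uid_map" into buf.
 * Returns 0 on success, -1 if buf is too small. */
int uid_map_path(pid_t pid, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/proc/%ld/uid_map", (long)pid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write one mapping line such as "0 1000 1" to map_file with a single
 * write(2). Returns 0 on success, -1 on failure with errno set. */
int write_map(const char *map_file, const char *mapping)
{
    int fd = open(map_file, O_RDWR);
    if (fd == -1)
        return -1;      /* the "Permission denied" case fails here */

    size_t len = strlen(mapping);
    ssize_t written = write(fd, mapping, len);

    int saved_errno = errno;    /* close() must not hide the write error */
    close(fd);
    errno = saved_errno;

    return (written == (ssize_t)len) ? 0 : -1;
}
```

The parent would call uid_map_path() with the PID returned by clone(), then write_map() with the mapping string, and only then signal the child through the pipe.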
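The problem
The application opens /proc/<pid>/uid_map for writing without trouble if I give the binary either no capabilities or all of them.
But when I set only cap_setuid, cap_setgid and, optionally, cap_sys_admin, the call to open(2) fails:
Setting the capabilities: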
arksnote linux-namespaces # setcap 'cap_setuid,cap_setgid,cap_sys_admin=epi' ./user-ns-ex
arksnote linux-namespaces # getcap ./user-ns-ex
./user-ns-ex = cap_setgid,cap_setuid,cap_sys_admin+eip
Trying to run:
kamyshev@arksnote ~/workspace/personal/linux-kernel/linux-namespaces $ ./user-ns-ex -v -U -M '0 1000 1' bash
./user-ns-ex: PID of child created by clone() is 19666
ERROR: open /proc/19666/uid_map: Permission denied
About to exec bash
Now a successful case:
No capabilities:
arksnote linux-namespaces # setcap '=' ./user-ns-ex
arksnote linux-namespaces # getcap ./user-ns-ex
./user-ns-ex =
Runs fine:
kamyshev@arksnote ~/workspace/personal/linux-kernel/linux-namespaces $ ./user-ns-ex -v -U -M '0 1000 1' bash
./user-ns-ex: PID of child created by clone() is 19557
About to exec bash
arksnote linux-namespaces # exit
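I have been searching the man pages for the reason and experimenting with different capabilities, with no luck so far. What puzzles me most is that the application runs with fewer capabilities but fails with more.
Can anyone help me clarify the problem?
Research
Found the reason. During my research I discovered that the uid_map file cannot be opened because its ownership changes to root.
Unprivileged process, no capabilities: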
parent(m): capabilities: '='
parent(m): file /proc/4644/uid_map owner uid: 1000
parent(m): file /proc/4644/uid_map owner gid: 1000
Unprivileged process, capability set (cap_setuid=pe):
parent(m): capabilities: '= cap_setuid+ep'
parent(m): file /proc/4644/uid_map owner uid: 0
parent(m): file /proc/4644/uid_map owner gid: 0
ERROR: open /proc/4668/uid_map: Permission denied
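Owner values like the ones logged above can be read with stat(2). The code that produced those logs is not shown here, so the following is only a sketch of how such a check might look; proc_file_owner and owned_by_caller are hypothetical helper names.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Look up the owner of /proc/<pid>/<name> (for example "uid_map").
 * Returns 0 on success, -1 on error. */
int proc_file_owner(pid_t pid, const char *name, uid_t *uid, gid_t *gid)
{
    char path[256];
    int n = snprintf(path, sizeof path, "/proc/%ld/%s", (long)pid, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    struct stat st;
    if (stat(path, &st) == -1)
        return -1;

    *uid = st.st_uid;
    *gid = st.st_gid;
    return 0;
}

/* Returns 1 if the file is owned by the caller's effective UID,
 * 0 if it is owned by someone else (e.g. root), -1 on error. */
int owned_by_caller(pid_t pid, const char *name)
{
    uid_t uid;
    gid_t gid;
    if (proc_file_owner(pid, name, &uid, &gid) == -1)
        return -1;
    return uid == geteuid();
}
```

In the failing case above, owned_by_caller(child_pid, "uid_map") would be expected to return 0 for an unprivileged user, since the file is owned by uid 0.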
Further research led me to this topic: what causes proc pid resources to become owned by root?
Rules about the "dumpable" flag
Here is what happens:
1) When a process is not dumpable, its /proc/<pid> inodes are given root ownership:
// fs/proc/base.c
struct inode *proc_pid_make_inode(struct super_block *sb, struct task_struct *task)
...
    if (task_dumpable(task)) {
        rcu_read_lock();
        cred = __task_cred(task);
        inode->i_uid = cred->euid;
        inode->i_gid = cred->egid;
        rcu_read_unlock();
    }
2) A process is dumpable only when its "dumpable" attribute has the value 1 (SUID_DUMP_USER). See ptrace(2).
3) prctl(2) clarifies the situation further:
Normally, this flag is set to 1. However, it is reset to the
current value contained in the file /proc/sys/fs/suid_dumpable
(which by default has the value 0), in the following
circumstances:
* The process's effective user or group ID is changed.
* The process's filesystem user or group ID is changed (see
credentials(7)).
* The process executes (execve(2)) a set-user-ID or set-
group-ID program, resulting in a change of either the
effective user ID or the effective group ID.
* The process executes (execve(2)) a program that has file
capabilities (see capabilities(7)), but only if the
permitted capabilities gained exceed those already
permitted for the process.
So my problem is caused by the last of the rules above:
int commit_creds(struct cred *new)
<...>
    /* dumpability changes */
    if (!uid_eq(old->euid, new->euid) ||
        !gid_eq(old->egid, new->egid) ||
        !uid_eq(old->fsuid, new->fsuid) ||
        !gid_eq(old->fsgid, new->fsgid) ||
        !cred_cap_issubset(old, new)) {
        if (task->mm)
            set_dumpable(task->mm, suid_dumpable);
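To make the condition easier to follow, here is a small user-space model of that check. It is a simplified sketch, not kernel code: struct cred_model is a made-up stand-in for struct cred, the capability set is reduced to a single bitmask, and the user-namespace comparison that the real cred_cap_issubset() also performs is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the kernel's struct cred. */
struct cred_model {
    unsigned euid, egid, fsuid, fsgid;
    uint64_t permitted;     /* permitted capabilities as a bitmask */
};

/* Bit positions follow linux/capability.h: CAP_SETGID = 6, CAP_SETUID = 7. */
#define MODEL_CAP_SETGID (1ULL << 6)
#define MODEL_CAP_SETUID (1ULL << 7)

/* True when every capability in sub is also present in set. */
static bool caps_subset(uint64_t set, uint64_t sub)
{
    return (sub & ~set) == 0;
}

/* Mirrors the "dumpability changes" condition: returns true when the
 * transition from old to next would reset the dumpable flag to the
 * value of /proc/sys/fs/suid_dumpable. */
bool resets_dumpable(const struct cred_model *old,
                     const struct cred_model *next)
{
    return old->euid != next->euid ||
           old->egid != next->egid ||
           old->fsuid != next->fsuid ||
           old->fsgid != next->fsgid ||
           !caps_subset(old->permitted, next->permitted);
}
```

With this model, going from an empty permitted set to one containing MODEL_CAP_SETUID makes resets_dumpable() return true even though no UID or GID changed, which matches what happens when user-ns-ex is executed with file capabilities.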
Fix
There are several ways to solve this:
- Change /proc/sys/fs/suid_dumpable globally:
echo 1 > /proc/sys/fs/suid_dumpable
- Set the "dumpable" flag for the process only:
prctl(PR_SET_DUMPABLE, 1, 0, 0, 0)
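The per-process variant can be wrapped in a small helper that also verifies the result with PR_GET_DUMPABLE. This is only a sketch; restore_dumpable is a name I made up for illustration.

```c
#include <sys/prctl.h>

/* Mark the calling process as dumpable again (SUID_DUMP_USER = 1) and
 * confirm that the flag took effect.
 * Returns 0 on success, -1 on failure. */
int restore_dumpable(void)
{
    if (prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) == -1)
        return -1;

    /* PR_GET_DUMPABLE returns the current flag value as the result. */
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0) == 1 ? 0 : -1;
}
```

In user-ns-ex the call would presumably go near the start of main(), before clone(), on the assumption that the child inherits the flag from the parent. That placement is my assumption and should be checked against the actual program.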