Why is granting the SYS_ADMIN privilege for a Docker container "bad"?
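I'm running into issues with my security team because the engineering team wants to FUSE-mount filesystems inside Docker, but to do this the "--cap-add SYS_ADMIN" flag must be set. Security does not allow this flag.
I have found many articles on the Internet saying that the "--cap-add SYS_ADMIN" flag for the Docker runtime is bad because "SYS_ADMIN by itself grants quite a big part of the capabilities and it could potentially present more attack surface."
However, I can't find anything that states specifically what these capabilities are and what "attack surfaces" they present.
What exactly does the SYS_ADMIN flag grant?
What actual security risks does setting this flag introduce?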
It is basically root access to the host. From the capabilities(7) man page:
CAP_SYS_ADMIN
Note: this capability is overloaded; see Notes to kernel
developers, below.
* Perform a range of system administration operations
including: quotactl(2), mount(2), umount(2), pivot_root(2),
setdomainname(2);
* perform privileged syslog(2) operations (since Linux 2.6.37,
CAP_SYSLOG should be used to permit such operations);
* perform VM86_REQUEST_IRQ vm86(2) command;
* perform IPC_SET and IPC_RMID operations on arbitrary System
V IPC objects;
* override RLIMIT_NPROC resource limit;
* perform operations on trusted and security Extended
Attributes (see xattr(7));
* use lookup_dcookie(2);
* use ioprio_set(2) to assign IOPRIO_CLASS_RT and (before
Linux 2.6.25) IOPRIO_CLASS_IDLE I/O scheduling classes;
* forge PID when passing socket credentials via UNIX domain
sockets;
* exceed /proc/sys/fs/file-max, the system-wide limit on the
number of open files, in system calls that open files (e.g.,
accept(2), execve(2), open(2), pipe(2));
* employ CLONE_* flags that create new namespaces with
clone(2) and unshare(2) (but, since Linux 3.8, creating user
namespaces does not require any capability);
* call perf_event_open(2);
* access privileged perf event information;
* call setns(2) (requires CAP_SYS_ADMIN in the target
namespace);
* call fanotify_init(2);
* call bpf(2);
* perform privileged KEYCTL_CHOWN and KEYCTL_SETPERM keyctl(2)
operations;
* perform madvise(2) MADV_HWPOISON operation;
* employ the TIOCSTI ioctl(2) to insert characters into the
input queue of a terminal other than the caller's
controlling terminal;
* employ the obsolete nfsservctl(2) system call;
* employ the obsolete bdflush(2) system call;
* perform various privileged block-device ioctl(2) operations;
* perform various privileged filesystem ioctl(2) operations;
* perform privileged ioctl(2) operations on the /dev/random
device (see random(4));
* install a seccomp(2) filter without first having to set the
no_new_privs thread attribute;
* modify allow/deny rules for device control groups;
* employ the ptrace(2) PTRACE_SECCOMP_GET_FILTER operation to
dump tracee's seccomp filters;
* employ the ptrace(2) PTRACE_SETOPTIONS operation to suspend
the tracee's seccomp protections (i.e., the
PTRACE_O_SUSPEND_SECCOMP flag);
* perform administrative operations on many device drivers.
* Modify autogroup nice values by writing to
/proc/[pid]/autogroup (see sched(7)).
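To see whether a given container actually holds this capability, you can decode the CapEff bitmask that the kernel exposes in /proc/self/status. Below is a minimal sketch, assuming Linux and that CAP_SYS_ADMIN is bit 21 as defined in <linux/capability.h>. The function names has_capability and effective_caps are hypothetical helpers made up for this illustration, not part of any library.

```python
# Bit number of CAP_SYS_ADMIN, taken from <linux/capability.h>.
CAP_SYS_ADMIN = 21


def has_capability(mask_hex: str, cap_bit: int = CAP_SYS_ADMIN) -> bool:
    """Return True if bit `cap_bit` is set in a hexadecimal capability mask."""
    return bool((int(mask_hex, 16) >> cap_bit) & 1)


def effective_caps(status_path: str = "/proc/self/status") -> str:
    """Read the effective capability mask (CapEff) of the current process."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff line not found in " + status_path)


if __name__ == "__main__":
    # Run this inside the container you want to inspect (Linux only).
    print("CAP_SYS_ADMIN effective:", has_capability(effective_caps()))
```

For example, a mask of 00000000a80425fb (a value often seen for a container started without extra capabilities; yours may differ) does not have bit 21 set, while 00000000a82425fb does.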