Why does SIGHUP not work on busybox sh in an Alpine Docker container?

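Sending a SIGHUP with

kill -HUP <pid>

to a busybox sh process on my native system works as expected: the shell hangs up. However, if I use docker kill to send the signal to a container with

docker kill -s HUP <container>

nothing happens. The Alpine container keeps running: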

$ CONTAINER=$(docker run -dt alpine:latest)
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Up 1 second
$ docker kill -s HUP $CONTAINER
4fea4f2dabe0f8a717b0e1272528af1a97050bcec51babbe0ed801e75fb15f1b
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Up 7 seconds

By the way, with a Debian container (running bash) it does work as expected:

$ CONTAINER=$(docker run -dt debian:latest)
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Up 1 second
$ docker kill -s HUP $CONTAINER
9a4aff456716397527cd87492066230e5088fbbb2a1bb6fc80f04f01b3368986
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Exited (129) 1 second ago

Sending a SIGKILL does work, but I would rather find out why SIGHUP does not.


Update: I'll add another example. Here you can see that busybox sh normally does hang up on SIGHUP:

$ busybox sh -c 'while true; do sleep 10; done' &
[1] 28276
$ PID=$!
$ ps -e | grep busybox
28276 pts/5    00:00:00 busybox
$ kill -HUP $PID
$ 
[1]+  Hangup                  busybox sh -c 'while true; do sleep 10; done'
$ ps -e | grep busybox
$

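However, the same infinite sleep loop running inside a Docker container does not quit. As you can see, the container is still running after SIGHUP and only exits after SIGKILL: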
$ CONTAINER=$(docker run -dt alpine:latest busybox sh -c 'while true; do sleep 10; done')
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}" 
Up 14 seconds
$ docker kill -s HUP $CONTAINER
31574ba7c0eb0505b776c459b55ffc8137042e1ce0562a3cf9aac80bfe8f65a0
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Up 28 seconds
$ docker kill -s KILL $CONTAINER
31574ba7c0eb0505b776c459b55ffc8137042e1ce0562a3cf9aac80bfe8f65a0
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Exited (137) 2 seconds ago
$

(I don't have a Docker environment at hand to try this. Just guessing.)

In your case, docker run must be running busybox/sh or bash as PID 1.

According to the Docker docs:

Note: A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.

As for the difference between busybox/sh and bash regarding SIGHUP:

On my system (Debian 9.6, x86_64), the signal masks for busybox/sh and bash are as follows:

busybox/sh:

USER     PID %CPU %MEM    VSZ   RSS TTY    STAT START   TIME COMMAND
root   82817  0.0  0.0   6952  1904 pts/2  S+   10:23   0:00 busybox sh

PENDING (0000000000000000):
BLOCKED (0000000000000000):
IGNORED (0000000000284004):
   3 QUIT
  15 TERM
  20 TSTP
  22 TTOU
CAUGHT (0000000008000002):
   2 INT
  28 WINCH

bash:

USER    PID %CPU %MEM    VSZ   RSS TTY     STAT START   TIME COMMAND
root   4871  0.0  0.1  21752  6176 pts/16  Ss    2019   0:00 /usr/local/bin/bash

PENDING (0000000000000000):
BLOCKED (0000000000000000):
IGNORED (0000000000380004):
   3 QUIT
  20 TSTP
  21 TTIN
  22 TTOU
CAUGHT (000000004b817efb):
   1 HUP
   2 INT
   4 ILL
   5 TRAP
   6 ABRT
   7 BUS
   8 FPE
  10 USR1
  11 SEGV
  12 USR2
  13 PIPE
  14 ALRM
  15 TERM
  17 CHLD
  24 XCPU
  25 XFSZ
  26 VTALRM
  28 WINCH
  31 SYS

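As we can see, busybox/sh installs no handler for SIGHUP (it appears in neither the IGNORED nor the CAUGHT set), so as PID 1 the signal is ignored. Bash catches SIGHUP, so docker kill can deliver the signal to Bash, which then terminates because, according to its manual, "the shell exits by default upon receipt of a SIGHUP".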
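The listings above look like the output of a helper tool that is not shown here. On Linux the same information is available as hex bitmasks in the SigIgn and SigCgt fields of /proc/<pid>/status, where bit N-1 stands for signal N. A minimal sketch of decoding such a mask in plain sh follows; the function name decode_sigmask is made up for illustration, and the two masks are copied from the busybox/sh listing.

```shell
#!/bin/sh
# decode_sigmask HEXMASK (hypothetical helper, not part of any tool):
# print the numbers of the signals whose bits are set in the mask.
# Bit 0 of the mask corresponds to signal 1, bit 1 to signal 2, and so on.
decode_sigmask() {
    mask=$((0x$1))
    sig=1
    out=""
    while [ "$sig" -le 64 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="$out${out:+ }$sig"
        fi
        mask=$((mask >> 1))
        sig=$((sig + 1))
    done
    echo "$out"
}

# Masks taken from the busybox/sh listing above.
decode_sigmask 0000000000284004   # IGNORED: 3 15 20 22
decode_sigmask 0000000008000002   # CAUGHT:  2 28
```

To inspect a live process, the raw masks can be read with grep -E '^Sig(Ign|Cgt)' /proc/<pid>/status and passed to the function. Signal 1 (HUP) is missing from both busybox/sh masks, which means it still has its default action.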


UPDATE 2020-03-07 #1:

I did a quick test and my previous analysis is basically correct. You can verify it like this:

[STEP 104] # docker run -dt debian busybox sh -c \
             'trap exit HUP; while true; do sleep 1; done'
331380090c59018dae4dbc17dd5af9d355260057fdbd2f2ce9fc6548a39df1db
[STEP 105] # docker ps 
CONTAINER ID        IMAGE            COMMAND                  CREATED             
331380090c59        debian           "busybox sh -c 'trap…"   11 seconds ago      
[STEP 106] # docker kill -s HUP 331380090c59    
331380090c59
[STEP 107] # docker ps 
CONTAINER ID        IMAGE               COMMAND             CREATED             
[STEP 108] #

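As I showed earlier, by default busybox/sh does not catch SIGHUP, so the signal is ignored. But once busybox/sh explicitly traps SIGHUP, the signal is delivered to it.

I also tried SIGKILL and yes, it always terminates the running container. That makes sense: SIGKILL cannot be caught or ignored by any process, so when docker kill sends it from the host it is always delivered and kills the container.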
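The Exited (129) and Exited (137) statuses in the question follow the usual shell convention of reporting 128 plus the signal number for a signal-related exit. A small sketch for mapping such a status back to a signal name is below; signal_from_status is a hypothetical helper, and it assumes the shell's kill builtin accepts a signal number after -l, as bash, dash and busybox ash do.

```shell
#!/bin/sh
# signal_from_status STATUS (hypothetical helper):
# print the signal name behind a 128+N exit status, or "none" otherwise.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        kill -l $((status - 128))
    else
        echo "none"
    fi
}

signal_from_status 129   # HUP  (128 + 1)
signal_from_status 137   # KILL (128 + 9)
```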


UPDATE 2020-03-07 #2:

You can also verify it this way (much simpler):

[STEP 110] # docker run -ti alpine
/ # ps
PID   USER     TIME  COMMAND
    1 root      0:00 /bin/sh
    7 root      0:00 ps
/ # kill -HUP 1    <-- this does not kill it because linux ignored the signal
/ # 
/ # trap 'echo received SIGHUP' HUP
/ # kill -HUP 1
received SIGHUP    <-- this indicates it can receive SIGHUP now
/ # 
/ # trap exit HUP
/ # kill -HUP 1    <-- this terminates it because the action changed to `exit`
[STEP 111] #

As the other answer already points out, the documentation for docker run contains the following note:

Note: A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.

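That is why SIGHUP does not work on busybox sh inside the container. If I run busybox sh on my native system, however, it does not have PID 1, and therefore SIGHUP works.

There are various solutions:

  • Use --init to specify an init process that should be used as PID 1.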

    You can use the --init flag to indicate that an init process should be used as the PID 1 in the container. Specifying an init process ensures the usual responsibilities of an init system, such as reaping zombie processes, are performed inside the created container.

    The default init process used is the first docker-init executable found in the system path of the Docker daemon process. This docker-init binary, included in the default installation, is backed by tini.

  • Trap SIGHUP and call exit yourself.

    docker run -dt alpine busybox sh -c 'trap exit HUP ; while true ; do sleep 60 & wait $! ; done'
    
  • Use another shell, such as bash, which exits on SIGHUP by default regardless of whether it has PID 1.
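On the trap option: the sleep 60 & wait $! form matters because a POSIX shell only runs a trap after the current foreground command has finished, whereas the wait builtin returns immediately when a trapped signal arrives. The following sketch shows the pattern outside Docker, so it demonstrates the trap and wait behaviour only, not the PID 1 special case. The on_hup function, the variable names and the exit status 0 are choices made for this illustration; status 0 is used so the trap can be told apart from the default hangup, which would give 129.

```shell
#!/bin/sh
# Child shell: install a HUP trap, then idle in an interruptible wait.
# The trap also kills the pending sleep so no stray process is left behind.
sh -c '
    on_hup() { kill "$pid" 2>/dev/null; exit 0; }
    trap on_hup HUP
    while true; do
        sleep 60 &
        pid=$!
        wait "$pid"
    done
' &
child=$!

sleep 1                  # give the child time to install its trap
kill -HUP "$child"       # wait is interrupted, so the trap runs at once
wait "$child"
status=$?
echo "child exit status: $status"
```

The script should finish after about a second and report status 0. With a plain sleep 60 in the loop instead, the trap would only run once the current sleep had ended.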