Can an unprivileged docker container be paused from the inside?

Is there a simple way to completely pause an unprivileged docker container from the inside, while retaining the ability to unpause/exec it from the outside?

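Running docker pause from the inside is not possible for an unprivileged container. It would need access to the Docker daemon, which means mounting the daemon's socket into the container.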

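You would need to build a custom solution. Just the basic idea: bind-mount a folder from the host and use a file in that folder as a lock. To pause from inside the container, create the file, then actively wait/sleep for as long as it exists. Once the host deletes the file from the mounted path, your code resumes. This is a fairly naive approach, because you wait actively, but it should do the trick.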

You can also look into inotify to avoid the active waiting: https://lwn.net/Articles/604686/
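
The lock-file idea could look roughly like the sketch below. It is only an illustration: the function names and the lock path are hypothetical, it pauses only the script that calls it (not the whole container), and the inotifywait branch assumes the inotify-tools package is installed in the image.

```shell
# Block while the lock file exists. Meant to run inside the container,
# with the lock file living in a directory bind-mounted from the host.
wait_while_locked() {
    lock_file=$1
    while [ -e "$lock_file" ]; do
        if command -v inotifywait >/dev/null 2>&1; then
            # Wait for a delete event in the directory, for at most 5 seconds,
            # then re-check. The fallback sleep avoids a busy loop if
            # inotifywait times out or fails.
            inotifywait -qq -t 5 -e delete "$(dirname "$lock_file")" || sleep 1
        else
            # Naive active wait: poll once per second.
            sleep 1
        fi
    done
}

# "Pause": create the lock, then wait until someone on the host removes it.
pause_self() {
    : > "$1" && wait_while_locked "$1"
}
```

With a container started with something like -v /srv/pause:/pause (hypothetical paths), the script would call pause_self /pause/lock, and the host would resume it with rm /srv/pause/lock.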

TL;DR;

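On Linux containers, the answer is definitely yes, because these two are, in effect, equivalent:

  • From the host:
    docker pause [container-id]

  • From the container:
    kill -SIGSTOP [process(es)-id]

    or, even shorter
    kill -SIGSTOP -1

    Mind that:
    1. If your process is PID 1, you fall into an edge case, because PID 1, the init process, has a specific meaning and behaviour in Linux.
    2. Some processes may spawn child workers, as in the NGINX example below.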

These two are also equivalent:

  • From the host:
    docker unpause [container-id]

  • From the container:
    kill -SIGCONT [process(es)-id]

    or, even shorter
    kill -SIGCONT -1 
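
The effect of the two signals can be tried on any process, without Docker. A minimal sketch, assuming a Linux /proc filesystem; the function names are hypothetical, and kill -s STOP is simply the portable spelling of kill -SIGSTOP:

```shell
# Print the one-letter state (R, S, T, ...) of a process, read from /proc.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

# Stop and resume a single process by PID.
pause_pid()  { kill -s STOP "$1"; }
resume_pid() { kill -s CONT "$1"; }

# Example (any long-running process will do):
#   sleep 300 & pid=$!
#   pause_pid "$pid";  sleep 1; proc_state "$pid"   # T (stopped)
#   resume_pid "$pid"; sleep 1; proc_state "$pid"   # S (sleeping again)
```

The short sleep before reading the state is there because the signal is delivered asynchronously.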
    

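Also note that there are edge cases where this will not work. An ordinary process cannot catch or ignore SIGSTOP (it can catch SIGCONT, but is resumed anyway), so the case that matters is a process running as PID 1: the kernel drops a SIGSTOP sent to PID 1 from inside its own PID namespace.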

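In those cases, you will have to

  • either be a privileged user, because the cgroup freezer is used under a folder that is, by default, read only in Docker; this will probably lead you to a dead end anyway, because you would no longer be able to jump into the container,
  • or run your container with the --init flag, so that PID 1 is just a wrapper process initialised by Docker and you no longer need to pause it in order to pause the processes running inside the container.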

    You can use the --init flag to indicate that an init process should be used as the PID 1 in the container. Specifying an init process ensures the usual responsibilities of an init system, such as reaping zombie processes, are performed inside the created container.

    The default init process used is the first docker-init executable found in the system path of the Docker daemon process. This docker-init binary, included in the default installation, is backed by tini.
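
Whether kill -SIGSTOP -1 will reach your application therefore depends on what runs as PID 1. A small sketch of how that could be checked from inside the container, assuming a Linux /proc filesystem; the function name is hypothetical, and matching on docker-init or tini is only a heuristic based on the documentation quoted above:

```shell
# Return 0 if the given PID 1 command line looks like Docker's init wrapper.
# With no argument, read the command line of PID 1 from /proc.
pid1_is_init_wrapper() {
    pid1_cmdline="${1:-$(tr '\0' ' ' < /proc/1/cmdline)}"
    pid1_first_word=${pid1_cmdline%% *}
    # Compare the basename of the executable with the known wrapper names.
    case "${pid1_first_word##*/}" in
        docker-init|tini) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   if pid1_is_init_wrapper; then echo "PID 1 is only a wrapper"; fi
```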


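This is definitely possible for Linux containers and is explained, in a way, in the documentation, which points out that running docker pause [container-id] just means that Docker uses a mechanism equivalent to sending the SIGSTOP signal to the processes running in the container.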

The docker pause command suspends all processes in the specified containers. On Linux, this uses the freezer cgroup. Traditionally, when suspending a process the SIGSTOP signal is used, which is observable by the process being suspended. With the freezer cgroup the process is unaware, and unable to capture, that it is being suspended, and subsequently resumed. On Windows, only Hyper-V containers can be paused.

See the freezer cgroup documentation for further details.

Source: https://docs.docker.com/engine/reference/commandline/pause/

Here is an example on an NGINX Alpine container:

### For now, we are on the host machine
$ docker run -p 8080:80 -d nginx:alpine
f444eaf8464e30c18f7f83bb0d1bd07b48d0d99f9d9e588b2bd77659db520524

### Testing if NGINX answers, successful 
$ curl -I -m 1  http://localhost:8080/ 
HTTP/1.1 200 OK
Server: nginx/1.19.0
Date: Sun, 28 Jun 2020 11:49:33 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 26 May 2020 15:37:18 GMT
Connection: keep-alive
ETag: "5ecd37ae-264"
Accept-Ranges: bytes

### Jumping into the container
$ docker exec -ti f7a2be0e230b9f7937d90954ef03502993857c5081ab20ed9a943a35687fbca4 ash

### This is the container, now, let's see the processes running
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID   VSZ  RSS  TT     STAT TIME  RUSER    COMMAND
    1 6000 4536 ?      S     0:00 root     nginx: master process nginx -g daemon off;
   29 6440 1828 ?      S     0:00 nginx    nginx: worker process
   30 6440 1828 ?      S     0:00 nginx    nginx: worker process
   31 6440 1828 ?      S     0:00 nginx    nginx: worker process
   32 6440 1828 ?      S     0:00 nginx    nginx: worker process
   49 1648 1052 136,0  S     0:00 root     ash
   55 1576    4 136,0  R     0:00 root     ps -o pid,vsz,rss,tty,stat,time,ruser,args

### Now let's send the SIGSTOP signal to the workers of NGINX, as docker pause would do
/ # kill -SIGSTOP 29 30 31 32

### Running ps again just to observe the T (stopped) state of the processes
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID   VSZ  RSS  TT     STAT TIME  RUSER    COMMAND
    1 6000 4536 ?      S     0:00 root     nginx: master process nginx -g daemon off;
   29 6440 1828 ?      T     0:00 nginx    nginx: worker process
   30 6440 1828 ?      T     0:00 nginx    nginx: worker process
   31 6440 1828 ?      T     0:00 nginx    nginx: worker process
   32 6440 1828 ?      T     0:00 nginx    nginx: worker process
   57 1648 1052 136,0  S     0:00 root     ash
   63 1576    4 136,0  R     0:00 root     ps -o pid,vsz,rss,tty,stat,time,ruser,args
/ # exit

### Back on the host to confirm NGINX doesn't answer anymore
$ curl -I -m 1  http://localhost:8080/ 
curl: (28) Operation timed out after 1000 milliseconds with 0 bytes received

$ docker exec -ti f7a2be0e230b9f7937d90954ef03502993857c5081ab20ed9a943a35687fbca4 ash

### Sending the SIGCONT signal as docker unpause would do
/ # kill -SIGCONT 29 30 31 32
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID   VSZ  RSS  TT     STAT TIME  RUSER    COMMAND
    1 6000 4536 ?      S     0:00 root     nginx: master process nginx -g daemon off;
   29 6440 1828 ?      S     0:00 nginx    nginx: worker process
   30 6440 1828 ?      S     0:00 nginx    nginx: worker process
   31 6440 1828 ?      S     0:00 nginx    nginx: worker process
   32 6440 1828 ?      S     0:00 nginx    nginx: worker process
   57 1648 1052 136,0  S     0:00 root     ash
   62 1576    4 136,0  R     0:00 root     ps -o pid,vsz,rss,tty,stat,time,ruser,args 29 30 31 32
/ # exit

### Back on the host to confirm NGINX is back
$ curl -I http://localhost:8080/       
HTTP/1.1 200 OK
Server: nginx/1.19.0
Date: Sun, 28 Jun 2020 11:56:23 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 26 May 2020 15:37:18 GMT
Connection: keep-alive
ETag: "5ecd37ae-264"
Accept-Ranges: bytes

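For the cases where the meaningful process is PID 1, which is protected by the Linux kernel, you may want to try the --init flag when running your container, so that Docker creates a wrapper process that is able to pass signals on to your application.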

$ docker run -p 8080:80 -d --init nginx:alpine                                        
e61e9158b2aab95007b97aa50bc77fff6b5c15cf3b16aa20a486891724bec6e9
$ docker exec -ti e61e9158b2aab95007b97aa50bc77fff6b5c15cf3b16aa20a486891724bec6e9 ash

/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID   VSZ  RSS  TT     STAT TIME  RUSER    COMMAND
    1 1052    4 ?      S     0:00 root     /sbin/docker-init -- /docker-entrypoint.sh nginx -g daemon off;
    7 6000 4320 ?      S     0:00 root     nginx: master process nginx -g daemon off;
   31 6440 1820 ?      S     0:00 nginx    nginx: worker process
   32 6440 1820 ?      S     0:00 nginx    nginx: worker process
   33 6440 1820 ?      S     0:00 nginx    nginx: worker process
   34 6440 1820 ?      S     0:00 nginx    nginx: worker process
   35 1648    4 136,0  S     0:00 root     ash
   40 1576    4 136,0  R     0:00 root     ps -o pid,vsz,rss,tty,stat,time,ruser,args

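See how nginx: master process nginx -g daemon off;, which was PID 1 in the previous use case, has now become PID 7?
This lets us kill -SIGSTOP -1 and be sure that all meaningful processes are stopped, while we are still not locked out of the container.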
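
Putting this together, the pause would be issued from inside the container and the resume from the host, through docker exec. A minimal sketch, assuming the container was started with --init and that its image ships a kill command (as nginx:alpine does in the transcript above); the function names and the DOCKER variable are hypothetical:

```shell
# DOCKER can be overridden (for example with `echo`) to preview the command.
DOCKER="${DOCKER:-docker}"

# Run inside the container: stop every process except PID 1 and the caller.
# Same command as in the transcript above.
pause_from_inside() {
    kill -SIGSTOP -1
}

# Run on the host: resume every stopped process in the given container.
resume_from_host() {
    "$DOCKER" exec "$1" kill -SIGCONT -1
}
```

For example, resume_from_host e61e9158b2aa would send SIGCONT to everything in the --init container started above.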


While digging into this, I found this blog post, which seems like a good read on the topic: https://major.io/2009/06/15/two-great-signals-sigstop-and-sigcont/

Also relevant is this excerpt from the ps manual page about process state codes:

  Here are the different values that the s, stat and state output
  specifiers (header "STAT" or "S") will display to describe the state
  of a process:

          D    uninterruptible sleep (usually IO)
          I    Idle kernel thread
          R    running or runnable (on run queue)
          S    interruptible sleep (waiting for an event to complete)
          T    stopped by job control signal
          t    stopped by debugger during the tracing
          W    paging (not valid since the 2.6.xx kernel)
          X    dead (should never be seen)
          Z    defunct ("zombie") process, terminated but not reaped by
               its parent

  For BSD formats and when the stat keyword is used, additional
  characters may be displayed:

          <    high-priority (not nice to other users)
          N    low-priority (nice to other users)
          L    has pages locked into memory (for real-time and custom
               IO)
          s    is a session leader
          l    is multi-threaded (using CLONE_THREAD, like NPTL
               pthreads do)
          +    is in the foreground process group

Source: https://man7.org/linux/man-pages/man1/ps.1.html#PROCESS_STATE_CODES