Rootless buildkitd 在容器内抛出权限错误

Rootless buildkitd throws permission error inside container

我决定使用无根版本的 Buildkit 从 Kubernetes 的容器中构建 Docker 图像并将其推送到 GCR(Google 容器注册表)。

我偶然发现了这个错误:

/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to read dockerfile: failed to mount /home/user/.local/tmp/buildkit-mount859701112: [{Type:bind Source:/home/user/.local/share/buildkit/runc-native/snapshots/snapshots/2 Options:[rbind ro]}]: operation not permitted

我是 运行 buildkitd 作为 deployment 链接到 buildkit documentation 指定的 service 这些资源 运行 在 Google Kubernetes Engine 上托管的 Kubernetes 集群中。

我正在使用以下 YAML 进行部署和服务

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: buildkitd
  name: buildkitd
spec:
  replicas: 1
  selector:
    matchLabels:
      app: buildkitd
  template:
    metadata:
      labels:
        app: buildkitd
      annotations:
        container.apparmor.security.beta.kubernetes.io/buildkitd: unconfined
        container.seccomp.security.alpha.kubernetes.io/buildkitd: unconfined
    spec:
      containers:
      - name: buildkitd
        image: moby/buildkit:master-rootless
        args:
        - --addr
        - unix:///run/user/1000/buildkit/buildkitd.sock
        - --addr
        - tcp://0.0.0.0:1234
        - --oci-worker-no-process-sandbox
        readinessProbe:
          exec:
            command:
            - buildctl
            - debug
            - workers
          initialDelaySeconds: 5
          periodSeconds: 30
        livenessProbe:
          exec:
            command:
            - buildctl
            - debug
            - workers
          initialDelaySeconds: 5
          periodSeconds: 30
        securityContext:
          runAsUser: 1000
          runAsGroup: 1000
        ports:
        - containerPort: 1234
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: buildkitd
  name: buildkitd
spec:
  ports:
  - port: 1234
    protocol: TCP
  selector:
    app: buildkitd

它与 buildkit documentation 没有 TLS 证书设置的相同。

然后我从另一个 Pod 使用以下命令联系 Buildkit 守护程序:

./bin/buildctl \
    --addr tcp://buildkitd:1234 \
    build \
    --frontend=dockerfile.v0 \
    --local context=. \
    --local dockerfile=. \
    --output type=image,name=eu.gcr.io/$PROJECT_ID/test-image,push=true

buildkitd 容器成功收到请求,但抛出上述错误。

buildctl 命令的输出如下:

#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.1s

#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 120B done
#2 DONE 0.1s
error: failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to read dockerfile: failed to mount /home/user/.local/tmp/buildkit-mount859701112: [{Type:bind Source:/home/user/.local/share/buildkit/runc-native/snapshots/snapshots/2 Options:[rbind ro]}]: operation not permitted

这是守护进程的错误。

让我印象深刻的是,我能够使用完全相同的 YAML 文件将 buildkitd 容器化到 minikube 集群中:

NAME                             READY   STATUS    RESTARTS   AGE
pod/buildkitd-5b46d94f5d-xvnbv   1/1     Running   0          36m

NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/buildkitd    ClusterIP   10.100.72.194   <none>        1234/TCP   36m
service/kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP    36m

NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/buildkitd   1/1     1            1           36m

NAME                                   DESIRED   CURRENT   READY   AGE
replicaset.apps/buildkitd-5b46d94f5d   1         1         1       36m

我在minikube里面部署服务和deployment,使用下面的命令转发服务端口就可以访问minikube外面的deployment。

kubectl port-forward service/buildkitd 2000:1234

通过该设置,我可以毫无问题地执行我的 buildctl 命令(图像构建并推送到 GCR)。

我想了解为什么它适用于 minikube 而不是 Google Kubernetes Engine。

如果有帮助的话,这里是容器启动日志

auto snapshotter: using native
NoProcessSandbox is enabled. Note that NoProcessSandbox allows build containers to kill (and potentially ptrace) an arbitrary process in the BuildKit host namespace. NoProcessSandbox should be enabled only when the BuildKit is running in a container as an unprivileged user.
found worker \"wdukby0uwmjyvf2ngj4e71s4m\", labels=map[org.mobyproject.buildkit.worker.executor:oci org.mobyproject.buildkit.worker.hostname:buildkitd-5b46d94f5d-xvnbv org.mobyproject.buildkit.worker.snapshotter:native], platforms=[linux/amd64 linux/386]"
rootless mode is not supported for containerd workers. disabling containerd worker.
found 1 workers, default=\"wdukby0uwmjyvf2ngj4e71s4m\"
currently, only the default worker can be used.
TLS is not enabled for tcp://0.0.0.0:1234. enabling mutual TLS authentication is highly recommended
running server on /run/user/1000/buildkit/buildkitd.sock
running server on [::]:1234

Rootless 需要在主机上执行各种准备步骤(这需要在 VM 主机 运行 kubernetes 节点上的 Kubernetes 外部完成)。有关完整的步骤列表,请参阅 rootless documentation。请注意,这些步骤因 Linux 发行版而异,因为不同的发行版已经执行了部分或全部这些先决条件步骤。

Ubuntu

  • No preparation is needed.

  • overlay2 storage driver is enabled by default (Ubuntu-specific kernel patch).

  • Known to work on Ubuntu 16.04, 18.04, and 20.04.

Debian GNU/Linux

  • Add kernel.unprivileged_userns_clone=1 to /etc/sysctl.conf (or /etc/sysctl.d) and run sudo sysctl --system.

  • To use the overlay2 storage driver (recommended), run sudo modprobe overlay permit_mounts_in_userns=1 (Debian-specific kernel patch, introduced in Debian 10). Add the configuration to /etc/modprobe.d for persistence.

  • Known to work on Debian 9 and 10. overlay2 is only supported since Debian 10 and needs modprobe configuration described above.

Arch Linux

  • Add kernel.unprivileged_userns_clone=1 to /etc/sysctl.conf (or /etc/sysctl.d) and run sudo sysctl --system

openSUSE

  • sudo modprobe ip_tables iptable_mangle iptable_nat iptable_filter is required. This might be required on other distros as well depending on the configuration.

  • Known to work on openSUSE 15.

Fedora 31 and later

  • Fedora 31 uses cgroup v2 by default, which is not yet supported by the containerd runtime. Run sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0" to use cgroup v1.

  • You might need sudo dnf install -y iptables.

CentOS 8

  • You might need sudo dnf install -y iptables.

CentOS 7

  • Add user.max_user_namespaces=28633 to /etc/sysctl.conf (or /etc/sysctl.d) and run sudo sysctl --system.

  • systemctl --user does not work by default. Run the daemon directly without systemd: dockerd-rootless.sh --experimental --storage-driver vfs

  • Known to work on CentOS 7.7. Older releases require additional configuration steps.

  • CentOS 7.6 and older releases require COPR package vbatts/shadow-utils-newxidmap to be installed.

  • CentOS 7.5 and older releases require running sudo grubby --update-kernel=ALL --args="user_namespace.enable=1" and a reboot following this.

您可能正在点击:https://github.com/moby/buildkit/issues/879

请使用 GKE Ubuntu 个节点,而不是 Google Container-Optimized OS 个节点。