br_netfilter error when deploying docker containers to swarm on ubuntu 20.04

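I have been struggling to deploy my containers to a Docker swarm on Ubuntu Server 20.04. I am trying to use Docker swarm on a single VPS host for zero-downtime deployments.

Running the containers with docker-compose, everything works.

Now I am trying to deploy the same docker-compose file to Docker swarm.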

# docker swarm init
Swarm initialized: current node (wlshyv0s1n5c85mao8jt9wo5j) is now a manager.

To add a worker to this swarm, run the following command:
...

# docker stack deploy --compose-file docker-compose.yml dash
Ignoring unsupported options: build

Creating network dash_default
Creating service dash_db
Creating service dash_nginx
...

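After the deploy command finishes, docker ps shows no running containers.

Checking docker ps -a, I see many containers, all with the status "Created".

Next, when I inspect one of the containers, its state shows: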

"State": {
    "Status": "created",
    "Running": false,
    "Paused": false,
    "Restarting": false,
    "OOMKilled": false,
    "Dead": false,
    "Pid": 0,
    "ExitCode": 128,
    "Error": "error creating external connectivity network: cannot restrict inter-container communication: please ensure that br_netfilter kernel module is loaded",
    "StartedAt": "0001-01-01T00:00:00Z",
    "FinishedAt": "0001-01-01T00:00:00Z"
}

Checking the loaded modules:

# lsmod | grep br_netfilter
br_netfilter            4242  -2
bridge                  4242  -2 br_netfilter,ebtable_broute

After running docker info I see two warnings at the end of the output:

# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 40
 Server Version: 20.10.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: active
  NodeID: ij25ein3xvcr8p5ky765ol8t0
  Is Manager: true
  ClusterID: mdb2r7vnngw62lg8uoj5ef55k
  Managers: 1
  Nodes: 1
  Default Address Pool: 10.0.0.0/8
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  ...
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.0
 Operating System: Ubuntu 20.04.2 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 4GiB
 ...
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Looking for a solution, I found that I should run the sysctl command, but it fails with an error:

# sysctl net.bridge.bridge-nf-call-ip6tables=1
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory

Searching further, I found the next command, but it does not work either:

# modprobe br_netfilter
modprobe: FATAL: Module br_netfilter not found in directory /lib/modules/5.4.0

I don't know what else to do to get swarm working.

On my Windows machine, swarm mode works fine.

Any suggestions on what I should do or check next?

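The problem turned out to be with the hosting provider.

The provider told us that other customers had also tried to set up Docker Swarm on their VPSes, but nobody had figured out how to get it working.

The provider does not allow kernel modifications or any other low-level changes.

We are now using a different hosting provider and everything works.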
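
For anyone evaluating a VPS before committing to it, the two failures in the question (the missing sysctl file and the missing module) can be checked up front. The following is only a rough sketch, not an official Docker check: the function name check_br_netfilter and its two optional arguments (a root prefix and a kernel version) are hypothetical and exist only so the check can be pointed at a test directory. It assumes the bridge-nf-call-iptables sysctl file is present whenever br_netfilter is loaded or built in, and that a loadable module would be found under /lib/modules/<kernel version>.

```shell
# Report whether bridge netfilter support looks usable on this host.
# $1: optional root prefix (hypothetical, for testing against a fake tree)
# $2: optional kernel version (defaults to the running kernel)
check_br_netfilter() {
    root="${1:-}"
    kver="${2:-$(uname -r)}"
    if [ -e "$root/proc/sys/net/bridge/bridge-nf-call-iptables" ]; then
        # The sysctl exists, so the module is already loaded or built in.
        echo "available"
    elif find "$root/lib/modules/$kver" -name 'br_netfilter.ko*' 2>/dev/null | grep -q .; then
        # The module file is on disk, so modprobe br_netfilter has a chance.
        echo "loadable"
    else
        # Neither the sysctl nor the module file: likely a restricted VPS kernel.
        echo "missing"
    fi
}

check_br_netfilter
```

On a host like the one in the question this would be expected to print "missing", in which case the fix probably has to come from the provider rather than from inside the VPS.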