Docker swarm worker node cannot serve the nginx service it is hosting

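As a learning exercise, I am trying to set up a Docker swarm on two test AWS EC2 instances, but I run into trouble when I try to access the service from the worker node's IP address.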

On the master, I ran docker swarm init. Then I took the token from its output and ran docker swarm join --token <token> <Master Private IP>:2377 on the worker.

Then, on the master, I ran a simple docker service create -p 80:80 --name nginx nginx, followed by docker service scale nginx=2. Checking docker service ps nginx now gives the following:

ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
idux5dftj9oj        nginx.1             nginx:latest        ip-172-31-13-2      Running             Running 12 minutes ago                       
2nwfw3fncybj        nginx.2             nginx:latest        ip-172-31-14-130    Running             Running 38 seconds ago
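
For reference, the setup steps above can be collected into a short shell sketch. The function names and the DOCKER override variable are hypothetical and exist only so the steps can be dry-run; docker swarm join-token -q worker is just another way to get the token that docker swarm init already prints, not something I ran in the steps above.

```shell
#!/bin/sh
# DOCKER is a hypothetical override so the steps can be dry-run with
# DOCKER=echo; when it is unset, the real docker CLI is used.
DOCKER=${DOCKER:-docker}

# On the master: initialise the swarm, then print only the worker join token.
init_manager() {
    $DOCKER swarm init
    $DOCKER swarm join-token -q worker
}

# On the worker: join the swarm using the token and the master's private IP.
join_worker() {
    token=$1
    manager_ip=$2
    $DOCKER swarm join --token "$token" "$manager_ip:2377"
}

# On the master: publish nginx on port 80, scale to two replicas, list tasks.
create_nginx_service() {
    $DOCKER service create -p 80:80 --name nginx nginx
    $DOCKER service scale nginx=2
    $DOCKER service ps nginx
}
```

Setting DOCKER=echo before calling a function prints the commands instead of running them, which is a cheap way to check the arguments before touching a real node.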

I have opened the inbound ports on the security group according to this guide, specifically:

The master and the worker share the same security group, so I set the source to the security group itself.

When I run curl http://localhost on the master, it gives me this, which shows that it works:

<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
<!-- Omitting this for brevity -->
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<!-- Omitting this for brevity -->
</body>

But on the worker, I only get curl: (7) Failed to connect to localhost port 80: Connection refused

docker ps on the worker gives me:

CONTAINER ID        IMAGE                           COMMAND                  CREATED             STATUS              PORTS                    NAMES
b37770b153db        nginx:latest                    "nginx -g 'daemon of…"   34 minutes ago      Up 34 minutes       80/tcp                   nginx.2.2nwfw3fncybjj7qzeierlx0xr

Running docker service inspect nginx on the master gives:

[
    {
        "ID": "887xm47oavn367w0o4bo1nmce",
        "Version": {
            "Index": 652
        },
        "CreatedAt": "2019-05-19T07:50:54.491113206Z",
        "UpdatedAt": "2019-05-19T08:02:53.454804111Z",
        "Spec": {
            "Name": "nginx",
            "Labels": {},
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "nginx:latest@sha256:23b4dcdf0d34d4a129755fc6f52e1c6e23bb34ea011b315d87e193033bcd1b68",
                    "Init": false,
                    "StopGracePeriod": 10000000000,
                    "DNSConfig": {},
                    "Isolation": "default"
                },
                "Resources": {
                    "Limits": {},
                    "Reservations": {}
                },
                "RestartPolicy": {
                    "Condition": "any",
                    "Delay": 5000000000,
                    "MaxAttempts": 0
                },
                "Placement": {
                    "Platforms": [
                        {
                            "Architecture": "amd64",
                            "OS": "linux"
                        },
                        {
                            "OS": "linux"
                        },
                        {
                            "Architecture": "arm64",
                            "OS": "linux"
                        },
                        {
                            "Architecture": "386",
                            "OS": "linux"
                        },
                        {
                            "Architecture": "ppc64le",
                            "OS": "linux"
                        },
                        {
                            "Architecture": "s390x",
                            "OS": "linux"
                        }
                    ]
                },
                "ForceUpdate": 0,
                "Runtime": "container"
            },
            "Mode": {
                "Replicated": {
                    "Replicas": 2
                }
            },
            "UpdateConfig": {
                "Parallelism": 1,
                "FailureAction": "pause",
                "Monitor": 5000000000,
                "MaxFailureRatio": 0,
                "Order": "stop-first"
            },
            "RollbackConfig": {
                "Parallelism": 1,
                "FailureAction": "pause",
                "Monitor": 5000000000,
                "MaxFailureRatio": 0,
                "Order": "stop-first"
            },
            "EndpointSpec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 80,
                        "PublishedPort": 80,
                        "PublishMode": "ingress"
                    }
                ]
            }
        },
        "PreviousSpec": {
            "Name": "nginx",
            "Labels": {},
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "nginx:latest@sha256:23b4dcdf0d34d4a129755fc6f52e1c6e23bb34ea011b315d87e193033bcd1b68",
                    "Init": false,
                    "DNSConfig": {},
                    "Isolation": "default"
                },
                "Resources": {
                    "Limits": {},
                    "Reservations": {}
                },
                "Placement": {
                    "Platforms": [
                        {
                            "Architecture": "amd64",
                            "OS": "linux"
                        },
                        {
                            "OS": "linux"
                        },
                        {
                            "Architecture": "arm64",
                            "OS": "linux"
                        },
                        {
                            "Architecture": "386",
                            "OS": "linux"
                        },
                        {
                            "Architecture": "ppc64le",
                            "OS": "linux"
                        },
                        {
                            "Architecture": "s390x",
                            "OS": "linux"
                        }
                    ]
                },
                "ForceUpdate": 0,
                "Runtime": "container"
            },
            "Mode": {
                "Replicated": {
                    "Replicas": 1
                }
            },
            "EndpointSpec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 80,
                        "PublishedPort": 80,
                        "PublishMode": "ingress"
                    }
                ]
            }
        },
        "Endpoint": {
            "Spec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 80,
                        "PublishedPort": 80,
                        "PublishMode": "ingress"
                    }
                ]
            },
            "Ports": [
                {
                    "Protocol": "tcp",
                    "TargetPort": 80,
                    "PublishedPort": 80,
                    "PublishMode": "ingress"
                }
            ],
            "VirtualIPs": [
                {
                    "NetworkID": "6scdvoeno2tviu4zgyldmq6b4",
                    "Addr": "10.255.0.82/16"
                }
            ]
        }
    }
]

Here is the master's docker info:

Containers: 3
 Running: 3
 Paused: 0
 Stopped: 0
Images: 4
Server Version: 18.09.6
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
 NodeID: q4h5ahgxf1xwuyi2aotyt20iy
 Is Manager: true
 ClusterID: r88oqh59x74bl1kqrcg5od2qd
 Managers: 1
 Nodes: 2
 Default Address Pool: 10.0.0.0/8  
 SubnetSize: 24
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 172.31.13.2
 Manager Addresses:
  172.31.13.2:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.15.0-1021-aws
Operating System: Ubuntu 18.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.945GiB
Name: ip-172-31-13-2
ID: RM34:I2IM:EJ2V:W74X:ECSD:ABCC:ZB4T:B7UO:OIWW:SUQ2:ILDB:HQLQ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

Here is the worker's docker info:

Containers: 3
 Running: 3
 Paused: 0
 Stopped: 0
Images: 4
Server Version: 18.09.5
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
 NodeID: slya32xwjmklumhm23bt7xs6m
 Is Manager: false
 Node Address: 172.31.14.130
 Manager Addresses:
  172.31.13.2:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.15.0-1021-aws
Operating System: Ubuntu 18.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.945GiB
Name: ip-172-31-14-130
ID: X7FI:3VCW:OCVI:5XSX:HJ24:2NOD:NQYU:SEYL:JVIJ:J4DI:F5UL:NKZT
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: bizmd
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

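As far as I can tell, nothing should have gone wrong after adding the worker to the swarm and creating the service. Nevertheless, the worker cannot reach the nginx service that it is already hosting.

What is causing this problem?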

I wanted to check which ports were actually open on my worker server (rather than only which ports were open in the firewall).

netstat -tulpn showed me:

(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -                   
tcp6       0      0 :::9443                 :::*                    LISTEN      -                   
tcp6       0      0 :::22                   :::*                    LISTEN      -                   
udp    19968      0 127.0.0.53:53           0.0.0.0:*                           -                   
udp        0      0 172.31.14.130:68        0.0.0.0:*                           -                   
udp        0      0 0.0.0.0:4789            0.0.0.0:*                           -

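I noticed that no process was using 7946, which is one of the ports that needs to be open. So I restarted the Docker service: sudo service docker restart

Once the restart finished, I saw a process start up and take the port. Sure enough, I was then able to run curl localhost on either node.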
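
A minimal sketch of the check I did by eye, assuming netstat -tuln style output on stdin. The function name missing_swarm_ports is hypothetical, and the port list (TCP and UDP 7946, UDP 4789) is taken from Docker's swarm documentation as what a worker should be listening on, not from anything specific to my setup.

```shell
#!/bin/sh
# Reads `netstat -tuln`-style output on stdin and prints every expected
# swarm port (as proto/port) that has no matching local-address entry.
# The expected list is an assumption based on Docker's swarm documentation.
missing_swarm_ports() {
    awk '
        BEGIN {
            want["tcp/7946"] = 1   # node-to-node communication
            want["udp/7946"] = 1   # node-to-node communication
            want["udp/4789"] = 1   # overlay network (VXLAN) traffic
        }
        $1 ~ /^(tcp|udp)/ {
            proto = substr($1, 1, 3)          # fold tcp6/udp6 into tcp/udp
            n = split($4, parts, ":")         # column 4 is the local address
            delete want[proto "/" parts[n]]   # the port is the last ":" field
        }
        END {
            for (k in want) print k           # whatever is left was not seen
        }
    ' | sort
}
```

On the worker this would be run as sudo netstat -tuln | missing_swarm_ports. An empty result means all three ports were found; any line that is printed names a port that nothing on the node is bound to, which is a hint that the Docker daemon may need a restart.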