Running a hybrid/heterogeneous Kubernetes cluster with nodes running in different networks using a VPN
My goal is to model a hybrid/heterogeneous Kubernetes cluster, where I have the following setup:
- Master node running on AWS (cloud) - ip-172-31-28-6
- Worker node running on my laptop - osboxes
- Worker node running on a Raspberry Pi - edge-1
Running the Kubernetes cluster locally on my laptop with three VMs was no problem and works fine with Weave Net. However, when I model my Kubernetes cluster as described above, there are some communication problems (I guess).
Since Kubernetes is designed to run on nodes that are all in the same network, I set up an OpenVPN server on AWS and connected my laptop and the Raspberry Pi to it. I hoped this would be enough to run Kubernetes in a heterogeneous setup where the worker nodes are in different networks. Obviously, this was a wrong assumption.
If I run the Kubernetes dashboard on a worker node and try to access it, I run into a timeout. If I run it on the master node, everything works as expected.
I set up the cluster on AWS using kubeadm init --apiserver-advertise-address= and joined the nodes using kubeadm join.
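Concretely, the commands looked roughly like this; the advertise address was elided above, and <PUBLIC_IP>, <TOKEN> and <HASH> are placeholders (the logs below show the workers reaching the API server at https://<PUBLIC_IP>:6443):
$ # on the master (AWS)
$ sudo kubeadm init --apiserver-advertise-address=<PUBLIC_IP>
$ # on each worker (laptop, Raspberry Pi)
$ sudo kubeadm join <PUBLIC_IP>:6443 --token <TOKEN> --discovery-token-ca-cert-hash sha256:<HASH>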
$ kubectl get pods --all-namespaces -o wide:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system etcd-ip-172-31-28-6 1/1 Running 0 5m 172.31.28.6 ip-172-31-28-6
kube-system kube-apiserver-ip-172-31-28-6 1/1 Running 0 5m 172.31.28.6 ip-172-31-28-6
kube-system kube-controller-manager-ip-172-31-28-6 1/1 Running 0 5m 172.31.28.6 ip-172-31-28-6
kube-system kube-dns-6f4fd4bdf-w6ctf 0/3 ContainerCreating 0 15h <none> osboxes
kube-system kube-proxy-2pl2f 1/1 Running 0 15h 172.31.28.6 ip-172-31-28-6
kube-system kube-proxy-7b89c 0/1 CrashLoopBackOff 15 15h 192.168.2.106 edge-1
kube-system kube-proxy-qg69g 1/1 Running 1 15h 10.0.2.15 osboxes
kube-system kube-scheduler-ip-172-31-28-6 1/1 Running 0 5m 172.31.28.6 ip-172-31-28-6
kube-system weave-net-pqxfp 1/2 CrashLoopBackOff 189 15h 172.31.28.6 ip-172-31-28-6
kube-system weave-net-thhzr 1/2 CrashLoopBackOff 12 36m 192.168.2.106 edge-1
kube-system weave-net-v69hj 2/2 Running 7 15h 10.0.2.15 osboxes
$ kubectl -n kube-system logs --v=7 kube-dns-6f4fd4bdf-w6ctf -c kubedns:
...
I0321 09:04:25.620580 23936 round_trippers.go:414] GET https://<PUBLIC_IP>:6443/api/v1/namespaces/kube-system/pods/kube-dns-6f4fd4bdf-w6ctf/log?container=kubedns
I0321 09:04:25.620605 23936 round_trippers.go:421] Request Headers:
I0321 09:04:25.620611 23936 round_trippers.go:424] Accept: application/json, */*
I0321 09:04:25.620616 23936 round_trippers.go:424] User-Agent: kubectl/v1.9.4 (linux/amd64) kubernetes/bee2d15
I0321 09:04:25.713821 23936 round_trippers.go:439] Response Status: 400 Bad Request in 93 milliseconds
I0321 09:04:25.714106 23936 helpers.go:201] server response object: [{
"metadata": {},
"status": "Failure",
"message": "container \"kubedns\" in pod \"kube-dns-6f4fd4bdf-w6ctf\" is waiting to start: ContainerCreating",
"reason": "BadRequest",
"code": 400
}]
F0321 09:04:25.714134 23936 helpers.go:119] Error from server (BadRequest): container "kubedns" in pod "kube-dns-6f4fd4bdf-w6ctf" is waiting to start: ContainerCreating
kubectl -n kube-system logs --v=7 kube-proxy-7b89c:
...
I0321 09:06:51.803852 24289 round_trippers.go:414] GET https://<PUBLIC_IP>:6443/api/v1/namespaces/kube-system/pods/kube-proxy-7b89c/log
I0321 09:06:51.803879 24289 round_trippers.go:421] Request Headers:
I0321 09:06:51.803891 24289 round_trippers.go:424] User-Agent: kubectl/v1.9.4 (linux/amd64) kubernetes/bee2d15
I0321 09:06:51.803900 24289 round_trippers.go:424] Accept: application/json, */*
I0321 09:08:59.110869 24289 round_trippers.go:439] Response Status: 500 Internal Server Error in 127306 milliseconds
I0321 09:08:59.111129 24289 helpers.go:201] server response object: [{
"metadata": {},
"status": "Failure",
"message": "Get https://192.168.2.106:10250/containerLogs/kube-system/kube-proxy-7b89c/kube-proxy: dial tcp 192.168.2.106:10250: getsockopt: connection timed out",
"code": 500
}]
F0321 09:08:59.111156 24289 helpers.go:119] Error from server: Get https://192.168.2.106:10250/containerLogs/kube-system/kube-proxy-7b89c/kube-proxy: dial tcp 192.168.2.106:10250: getsockopt: connection timed out
kubectl -n kube-system logs --v=7 weave-net-pqxfp -c weave:
...
I0321 09:12:08.047206 24847 round_trippers.go:414] GET https://<PUBLIC_IP>:6443/api/v1/namespaces/kube-system/pods/weave-net-pqxfp/log?container=weave
I0321 09:12:08.047233 24847 round_trippers.go:421] Request Headers:
I0321 09:12:08.047335 24847 round_trippers.go:424] Accept: application/json, */*
I0321 09:12:08.047347 24847 round_trippers.go:424] User-Agent: kubectl/v1.9.4 (linux/amd64) kubernetes/bee2d15
I0321 09:12:08.062494 24847 round_trippers.go:439] Response Status: 200 OK in 15 milliseconds
DEBU: 2018/03/21 09:11:26.847013 [kube-peers] Checking peer "fa:10:a4:97:7e:7b" against list &{[{6e:fd:f4:ef:1e:f5 osboxes}]}
Peer not in list; removing persisted data
INFO: 2018/03/21 09:11:26.880946 Command line options: map[expect-npc:true ipalloc-init:consensus=3 db-prefix:/weavedb/weave-net http-addr:127.0.0.1:6784 ipalloc-range:10.32.0.0/12 nickname:ip-172-31-28-6 host-root:/host name:fa:10:a4:97:7e:7b no-dns:true status-addr:0.0.0.0:6782 datapath:datapath docker-api: port:6783 conn-limit:30]
INFO: 2018/03/21 09:11:26.880995 weave 2.2.1
FATA: 2018/03/21 09:11:26.881117 Inconsistent bridge state detected. Please do 'weave reset' and try again
kubectl -n kube-system logs --v=7 weave-net-thhzr -c weave:
...
I0321 09:15:13.787905 25113 round_trippers.go:414] GET https://<PUBLIC_IP>:6443/api/v1/namespaces/kube-system/pods/weave-net-thhzr/log?container=weave
I0321 09:15:13.787932 25113 round_trippers.go:421] Request Headers:
I0321 09:15:13.787938 25113 round_trippers.go:424] Accept: application/json, */*
I0321 09:15:13.787946 25113 round_trippers.go:424] User-Agent: kubectl/v1.9.4 (linux/amd64) kubernetes/bee2d15
I0321 09:17:21.126863 25113 round_trippers.go:439] Response Status: 500 Internal Server Error in 127338 milliseconds
I0321 09:17:21.127140 25113 helpers.go:201] server response object: [{
"metadata": {},
"status": "Failure",
"message": "Get https://192.168.2.106:10250/containerLogs/kube-system/weave-net-thhzr/weave: dial tcp 192.168.2.106:10250: getsockopt: connection timed out",
"code": 500
}]
F0321 09:17:21.127167 25113 helpers.go:119] Error from server: Get https://192.168.2.106:10250/containerLogs/kube-system/weave-net-thhzr/weave: dial tcp 192.168.2.106:10250: getsockopt: connection timed out
$ ifconfig (on the Kubernetes master on AWS):
datapath Link encap:Ethernet HWaddr ae:90:9a:b2:7e:d9
inet6 addr: fe80::ac90:9aff:feb2:7ed9/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1376 Metric:1
RX packets:29 errors:0 dropped:0 overruns:0 frame:0
TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:1904 (1.9 KB) TX bytes:1188 (1.1 KB)
docker0 Link encap:Ethernet HWaddr 02:42:50:39:1f:c7
inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
eth0 Link encap:Ethernet HWaddr 06:a3:d0:8e:19:72
inet addr:172.31.28.6 Bcast:172.31.31.255 Mask:255.255.240.0
inet6 addr: fe80::4a3:d0ff:fe8e:1972/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9001 Metric:1
RX packets:10323322 errors:0 dropped:0 overruns:0 frame:0
TX packets:9418208 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3652314289 (3.6 GB) TX bytes:3117288442 (3.1 GB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:11388236 errors:0 dropped:0 overruns:0 frame:0
TX packets:11388236 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:2687297929 (2.6 GB) TX bytes:2687297929 (2.6 GB)
tun0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:10.8.0.1 P-t-P:10.8.0.2 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:97222 errors:0 dropped:0 overruns:0 frame:0
TX packets:164607 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:13381022 (13.3 MB) TX bytes:209129403 (209.1 MB)
vethwe-bridge Link encap:Ethernet HWaddr 12:59:54:73:0f:91
inet6 addr: fe80::1059:54ff:fe73:f91/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1376 Metric:1
RX packets:18 errors:0 dropped:0 overruns:0 frame:0
TX packets:36 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1476 (1.4 KB) TX bytes:2940 (2.9 KB)
vethwe-datapath Link encap:Ethernet HWaddr 8e:75:1c:92:93:0d
inet6 addr: fe80::8c75:1cff:fe92:930d/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1376 Metric:1
RX packets:36 errors:0 dropped:0 overruns:0 frame:0
TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2940 (2.9 KB) TX bytes:1476 (1.4 KB)
vxlan-6784 Link encap:Ethernet HWaddr a6:02:da:5e:d5:2a
inet6 addr: fe80::a402:daff:fe5e:d52a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65485 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:8 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
$ sudo systemctl status kubelet.service (on AWS):
Mar 21 09:34:59 ip-172-31-28-6 kubelet[19676]: W0321 09:34:59.202058 19676 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Mar 21 09:34:59 ip-172-31-28-6 kubelet[19676]: E0321 09:34:59.202452 19676 kubelet.go:2109] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Mar 21 09:35:01 ip-172-31-28-6 kubelet[19676]: I0321 09:35:01.535541 19676 kuberuntime_manager.go:514] Container {Name:weave Image:weaveworks/weave-kube:2.2.1 Command:[/home/weave/launch.sh] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[{Name:HOSTNAME Value: ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:spec.nodeName,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,}}] Resources:{Limits:map[] Requests:map[cpu:{i:{value:10 scale:-3} d:{Dec:<nil>} s:10m Format:DecimalSI}]} VolumeMounts:[{Name:weavedb ReadOnly:false MountPath:/weavedb SubPath: MountPropagation:<nil>} {Name:cni-bin ReadOnly:false MountPath:/host/opt SubPath: MountPropagation:<nil>} {Name:cni-bin2 ReadOnly:false MountPath:/host/home SubPath: MountPropagation:<nil>} {Name:cni-conf ReadOnly:false MountPath:/host/etc SubPath: MountPropagation:<nil>} {Name:dbus ReadOnly:false MountPath:/host/var/lib/dbus SubPath: MountPropagation:<nil>} {Name:lib-modules ReadOnly:false MountPath:/lib/modules SubPath: MountPropagation:<nil>} {Name:weave-net-token-vn8rh ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/status,Port:6784,Host:127.0.0.1,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:30,TimeoutSeconds:1,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Mar 21 09:35:01 ip-172-31-28-6 kubelet[19676]: I0321 09:35:01.536504 19676 kuberuntime_manager.go:758] checking backoff for container "weave" in pod "weave-net-pqxfp_kube-system(c6450070-2c61-11e8-a50d-06a3d08e1972)"
Mar 21 09:35:01 ip-172-31-28-6 kubelet[19676]: I0321 09:35:01.536636 19676 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=weave pod=weave-net-pqxfp_kube-system(c6450070-2c61-11e8-a50d-06a3d08e1972)
Mar 21 09:35:01 ip-172-31-28-6 kubelet[19676]: E0321 09:35:01.536664 19676 pod_workers.go:186] Error syncing pod c6450070-2c61-11e8-a50d-06a3d08e1972 ("weave-net-pqxfp_kube-system(c6450070-2c61-11e8-a50d-06a3d08e1972)"), skipping: failed to "StartContainer" for "weave" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=weave pod=weave-net-pqxfp_kube-system(c6450070-2c61-11e8-a50d-06a3d08e1972)"
$ sudo systemctl status kubelet.service (on the laptop):
Mar 21 05:47:18 osboxes kubelet[715]: E0321 05:47:18.662670 715 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 21 05:47:18 osboxes kubelet[715]: E0321 05:47:18.663412 715 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "kube-dns-6f4fd4bdf-w6ctf_kube-system(11886465-2c61-11e8-a50d-06a3d08e1972)" failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 21 05:47:18 osboxes kubelet[715]: E0321 05:47:18.663869 715 kuberuntime_manager.go:647] createPodSandbox for pod "kube-dns-6f4fd4bdf-w6ctf_kube-system(11886465-2c61-11e8-a50d-06a3d08e1972)" failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Mar 21 05:47:18 osboxes kubelet[715]: E0321 05:47:18.664295 715 pod_workers.go:186] Error syncing pod 11886465-2c61-11e8-a50d-06a3d08e1972 ("kube-dns-6f4fd4bdf-w6ctf_kube-system(11886465-2c61-11e8-a50d-06a3d08e1972)"), skipping: failed to "CreatePodSandbox" for "kube-dns-6f4fd4bdf-w6ctf_kube-system(11886465-2c61-11e8-a50d-06a3d08e1972)" with CreatePodSandboxError: "CreatePodSandbox for pod \"kube-dns-6f4fd4bdf-w6ctf_kube-system(11886465-2c61-11e8-a50d-06a3d08e1972)\" failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Mar 21 05:47:20 osboxes kubelet[715]: W0321 05:47:20.536161 715 pod_container_deletor.go:77] Container "bbf490835face43b70c24dbcb67c3f75872e7831b5e2605dc8bb71210910e273" not found in pod's containers
$ sudo systemctl status kubelet.service (on the Raspberry Pi):
Mar 21 09:29:01 edge-1 kubelet[339]: I0321 09:29:01.188199 339 kuberuntime_manager.go:514] Container {Name:kube-proxy Image:gcr.io/google_containers/kube-proxy-amd64:v1.9.5 Command:[/usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:kube-proxy ReadOnly:false MountPath:/var/lib/kube-proxy SubPath: MountPropagation:<nil>} {Name:xtables-lock ReadOnly:false MountPath:/run/xtables.lock SubPath: MountPropagation:<nil>} {Name:lib-modules ReadOnly:true MountPath:/lib/modules SubPath: MountPropagation:<nil>} {Name:kube-proxy-token-px7dt ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Mar 21 09:29:01 edge-1 kubelet[339]: I0321 09:29:01.189023 339 kuberuntime_manager.go:758] checking backoff for container "kube-proxy" in pod "kube-proxy-7b89c_kube-system(5bebafa1-2c61-11e8-a50d-06a3d08e1972)"
Mar 21 09:29:01 edge-1 kubelet[339]: I0321 09:29:01.190174 339 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-proxy pod=kube-proxy-7b89c_kube-system(5bebafa1-2c61-11e8-a50d-06a3d08e1972)
Mar 21 09:29:01 edge-1 kubelet[339]: E0321 09:29:01.190518 339 pod_workers.go:186] Error syncing pod 5bebafa1-2c61-11e8-a50d-06a3d08e1972 ("kube-proxy-7b89c_kube-system(5bebafa1-2c61-11e8-a50d-06a3d08e1972)"), skipping: failed to "StartContainer" for "kube-proxy" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-proxy pod=kube-proxy-7b89c_kube-system(5bebafa1-2c61-11e8-a50d-06a3d08e1972)"
Mar 21 09:29:02 edge-1 kubelet[339]: W0321 09:29:02.278342 339 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Mar 21 09:29:02 edge-1 kubelet[339]: E0321 09:29:02.282534 339 kubelet.go:2120] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
There is definitely something wrong with the network between your Kubernetes master and the nodes.
But, first of all, creating this kind of hybrid installation is not a good idea. You must have a stable network between the master(s) and the nodes, otherwise it will cause many problems, and that is hard to achieve over an Internet connection.
If you want to prepare a hybrid installation, you can use Federation between a Kubernetes cluster in AWS and one on your local hardware.
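For illustration only, a hedged sketch of what that could look like with the Federation v1 kubefed CLI; the federation name, the context names (aws, local) and the DNS zone are made up for this example:
$ kubefed init myfed --host-cluster-context=aws --dns-provider=aws-route53 --dns-zone-name="example.com."
$ kubefed join local --host-cluster-context=aws --cluster-context=local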
That said, looking at your problem, I see issues with the Weave network on the master and on the edge-1 node. It is not clear from the logs which kind of problem you have, so try to run the weave containers with the WEAVE_DEBUG=1 environment variable. Without working networking, other pods like kube-dns will not work properly.
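One hedged way to set that variable, assuming Weave Net was installed as the weave-net DaemonSet in kube-system (which the pod names above suggest):
$ kubectl -n kube-system set env daemonset/weave-net -c weave WEAVE_DEBUG=1
$ # follow the logs of the restarted weave containers
$ kubectl -n kube-system logs -f -l name=weave-net -c weave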
Also, how did you set up OpenVPN? You must have routing between the AWS subnet and client-to-client routing, so all the addresses you used to set up the cluster on the nodes must be routable to each other. Check once more which address you bound the Kubernetes components and Weave to, and whether that address is routable.
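For illustration, a hedged OpenVPN server.conf fragment with the relevant directives; the AWS subnet 172.31.16.0/20 is derived from eth0's address and netmask above, so verify it against your VPC before using it:
# let VPN clients reach each other directly
client-to-client
# push a route for the AWS VPC subnet to all clients
push "route 172.31.16.0 255.255.240.0"
You also need IP forwarding enabled on the server (net.ipv4.ip_forward=1) if the clients must reach the rest of the VPC through it.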
- This message explains one of the crashes:
FATA: 2018/03/21 09:11:26.881117 Inconsistent bridge state detected. Please do 'weave reset' and try again
Since running weave commands on a Kubernetes node is a bit complicated, just reboot the node and the bridge will be recreated from scratch.
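If a reboot is not an option, a hedged alternative is to run weave reset with the standalone weave script directly on the affected node (this assumes the node can download the script, and you should stop the crash-looping weave pod first so it does not recreate the bridge while you reset it):
$ sudo curl -L git.io/weave -o /usr/local/bin/weave
$ sudo chmod a+x /usr/local/bin/weave
$ sudo weave reset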
- This message means the node could not be contacted to fetch the logs:
F0321 09:08:59.111156 24289 helpers.go:119] Error from server: Get https://192.168.2.106:10250/containerLogs/kube-system/kube-proxy-7b89c/kube-proxy: dial tcp 192.168.2.106:10250: getsockopt: connection timed out
Think about whether those hosts can reach each other over their regular network.
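As a quick hedged check from the master: edge-1 registered its LAN address 192.168.2.106 (see the pod list above), which AWS cannot reach unless that network is routed over the VPN; the 10.8.0.x address below is an assumption based on the tun0 server address 10.8.0.1:
$ # the kubelet port the API server timed out on
$ nc -zv -w 5 192.168.2.106 10250
$ # the same check against edge-1's VPN address (assumed 10.8.0.x)
$ nc -zv -w 5 10.8.0.6 10250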