Calico: Kubernetes pods can't ping each other using Cluster IP
I installed Kubernetes with kubeadm v1.14.0 and added two worker nodes with the join command.
kubeadm config
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.14.0
controlPlaneEndpoint: "172.22.203.12:6443"
networking:
  # provider.
  podSubnet: "111.111.0.0/16"
Node list
NAME STATUS ROLES AGE VERSION
linan Ready <none> 13h v1.14.0
node2 Ready <none> 13h v1.14.0
yiwu Ready master 13h v1.14.0
I checked that all pods are up and running
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-node-h49t9 2/2 Running 1 13h
calico-node-mplwx 2/2 Running 0 13h
calico-node-twvsd 2/2 Running 0 13h
calico-typha-666749994b-d68qg 1/1 Running 0 13h
coredns-8567978547-dhbn4 1/1 Running 0 14h
coredns-8567978547-zv5w5 1/1 Running 0 14h
etcd-yiwu 1/1 Running 0 13h
kube-apiserver-yiwu 1/1 Running 0 13h
kube-controller-manager-yiwu 1/1 Running 0 13h
kube-proxy-7pjcx 1/1 Running 0 13h
kube-proxy-96d2j 1/1 Running 0 13h
kube-proxy-j5cnw 1/1 Running 0 14h
kube-scheduler-yiwu 1/1 Running 0 13h
These are the two pods I use to test connectivity.
kubectl get pods -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-ds-2br6j 1/1 Running 0 13h 111.111.1.2 linan <none> <none>
nginx-ds-t7sfv 1/1 Running 0 13h 111.111.2.2 node2 <none> <none>
But I can't ping the pod IPs from any node (including the master), nor can I access the pods or the services they provide.
[root@YiWu ~]# ping 111.111.1.2
PING 111.111.1.2 (111.111.1.2) 56(84) bytes of data.
^C
--- 111.111.1.2 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms
[root@YiWu ~]# ping 111.111.2.2
PING 111.111.2.2 (111.111.2.2) 56(84) bytes of data.
^C
--- 111.111.2.2 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms
Each node can only reach the pods running on its own host.
I checked the calico-node logs. The following messages appear on some nodes but not on others.
yiwu
bird: BGP: Unexpected connect from unknown address 172.19.0.1 (port 56754)
bird: BGP: Unexpected connect from unknown address 172.19.0.1 (port 40364)
node2
bird: BGP: Unexpected connect from unknown address 172.22.203.11 (port 57996)
bird: BGP: Unexpected connect from unknown address 172.22.203.11 (port 59485)
linan
(none)
I installed calicoctl on the yiwu node to check the node status
DATASTORE_TYPE=kubernetes KUBECONFIG=~/.kube/config calicoctl get node -owide
NAME ASN IPV4 IPV6
linan (unknown) 172.18.0.1/16
node2 (unknown) 172.20.0.1/16
yiwu (unknown) 172.19.0.1/16
DATASTORE_TYPE=kubernetes KUBECONFIG=~/.kube/config calicoctl node status
Calico process is running.
IPv4 BGP status
+--------------+-------------------+-------+----------+--------------------------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+--------------+-------------------+-------+----------+--------------------------------+
| 172.18.0.1 | node-to-node mesh | start | 12:23:15 | Connect |
| 172.20.0.1 | node-to-node mesh | start | 12:23:18 | OpenSent Socket: Connection |
| | | | | closed |
+--------------+-------------------+-------+----------+--------------------------------+
IPv6 BGP status
No IPv6 peers found.
Edit
sysctl -p /etc/sysctl.d/kubernetes.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
vm.overcommit_memory = 1
vm.panic_on_oom = 0
fs.inotify.max_user_watches = 89100
IP forwarding is already enabled on all nodes.
I restarted Calico and checked its logs
kubectl delete -f /etc/kubernetes/addons/calico.yaml
kubectl apply -f /etc/kubernetes/addons/calico.yaml
kubectl get pods -n kube-system
kubectl logs calico-node-dp69k -c calico-node -n kube-system
Here calico-node-dp69k is the name of the calico-node pod.
Checking the Calico log, I found that an unexpected network interface had been picked as the node's interface, as shown below:
2019-08-15 04:39:10.859 [INFO][8] startup.go 564: Using autodetected IPv4 address on interface br-b733428777f6: 172.19.0.1/16
Obviously, br-b733428777f6 is not the interface I expected.
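This kind of mismatch can be spotted by comparing the addresses Calico registered for each node with the subnet the nodes actually sit on. Below is a minimal sketch, assuming the node LAN is 172.22.203.0/24 (inferred from the controlPlaneEndpoint 172.22.203.12 and not stated explicitly in the question). The function name `misdetected` is hypothetical, and the node addresses are copied from the `calicoctl get node -owide` output above.

```python
import ipaddress

# Assumption: the nodes' real LAN, inferred from controlPlaneEndpoint 172.22.203.12.
NODE_SUBNET = ipaddress.ip_network("172.22.203.0/24")

# Addresses as reported by `calicoctl get node -owide`.
registered = {
    "linan": "172.18.0.1/16",
    "node2": "172.20.0.1/16",
    "yiwu": "172.19.0.1/16",
}


def misdetected(nodes, subnet):
    """Return the names of nodes whose registered address lies outside `subnet`."""
    bad = []
    for name, cidr in nodes.items():
        addr = ipaddress.ip_interface(cidr).ip
        if addr not in subnet:
            bad.append(name)
    return sorted(bad)


print(misdetected(registered, NODE_SUBNET))
```

All three nodes are flagged here. Each one registered a bridge address (172.18.x, 172.19.x, 172.20.x) instead of its LAN address, which is consistent with the BGP peers never reaching each other.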
I checked the Calico configuration docs for IP_AUTODETECTION_METHOD.
By default, Calico uses the first-found method to select the network interface:
The first-found option enumerates all interface IP addresses and returns the first valid IP address (based on IP version and type of address) on the first valid interface.
In my case, can-reach is a better fit, so I edited calico.yaml and added IP_AUTODETECTION_METHOD like this:
spec:
  hostNetwork: true
  serviceAccountName: calico-node
  terminationGracePeriodSeconds: 0
  containers:
    - name: calico-node
      image: quay.io/calico/node:v3.1.3
      env:
        - name: IP_AUTODETECTION_METHOD
          value: can-reach=172.22.203.1
The 172.22.203.1 in can-reach=172.22.203.1 is the gateway IP. Then re-apply the manifest:
kubectl delete -f /etc/kubernetes/addons/calico.yaml
kubectl apply -f /etc/kubernetes/addons/calico.yaml
Check the logs:
2019-08-15 04:50:27.942 [INFO][10] reachaddr.go 46: Auto-detected address by connecting to remote Destination="172.22.203.1" IP=172.22.203.10
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 55: Checking interface CIDRs Name="cali7b8c9bd2e1f"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 55: Checking interface CIDRs Name="veth24c7125"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 55: Checking interface CIDRs Name="br-0b07d34c53b5"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 57: Checking CIDR CIDR="172.18.0.1/16"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 55: Checking interface CIDRs Name="tunl0"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 57: Checking CIDR CIDR="111.111.1.1/32"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 55: Checking interface CIDRs Name="docker0"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 57: Checking CIDR CIDR="172.17.0.1/16"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 55: Checking interface CIDRs Name="enp0s20u1u5"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 55: Checking interface CIDRs Name="eno4"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 55: Checking interface CIDRs Name="eno3"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 55: Checking interface CIDRs Name="eno2"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 55: Checking interface CIDRs Name="eno1"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 57: Checking CIDR CIDR="172.22.203.10/24"
2019-08-15 04:50:27.943 [INFO][10] reachaddr.go 59: Found matching interface CIDR CIDR="172.22.203.10/24"
2019-08-15 04:50:27.943 [INFO][10] startup.go 590: Using autodetected IPv4 address 172.22.203.10/24, detected by connecting to 172.22.203.1
Wow, it picked the right network interface.
I then checked whether the pod IPs were reachable, and they were!
Done.
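The can-reach method in the log above lets the kernel's routing table decide which local address would be used to reach a given destination. The sketch below illustrates that idea in Python. It is not Calico's actual code, and the function name `local_ip_for` is hypothetical.

```python
import socket


def local_ip_for(destination, port=9):
    """Return the local IPv4 address the kernel would use to reach `destination`."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # connect() on a UDP socket sends no packets; it only asks the kernel
        # to pick a route, and with it a source address.
        s.connect((destination, port))
        return s.getsockname()[0]


# Loopback is used here so the example runs anywhere.
print(local_ip_for("127.0.0.1"))
```

On one of the nodes, calling `local_ip_for("172.22.203.1")` (the gateway from the answer) should return the address on the LAN interface rather than a Docker bridge address, assuming the default route goes through that gateway.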
For anyone who lands here from Google in the future: in my case, I was using the operator:
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    # Note: The ipPools section cannot be modified post-install.
    ipPools:
      - blockSize: 26
        cidr: 10.244.0.0/16 # your pod cidr
        encapsulation: VXLANCrossSubnet
        natOutgoing: Enabled
        nodeSelector: all()
    nodeAddressAutodetectionV4:
      interface: ens* # Change this one to fix the autodetected issue. My interface is ensxxx
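If that doesn't work, it may be because you previously installed flannel, cilium, etc. In that case you need to delete their leftover network interfaces first.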
ip link
For each flannel interface, do the following:
ifconfig <name of interface from ip link> down
ip link delete <name of interface from ip link>