k3s - networking between pods not working

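Despite having ClusterIP services set up for them, I am struggling to get cross-communication between pods working. All pods are on the same master node and in the same namespace. In summary: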

$ kubectl get pods -o wide
NAME                         READY   STATUS    RESTARTS   AGE    IP           NODE          NOMINATED NODE   READINESS GATES
nginx-744f4df6df-rxhph       1/1     Running   0          136m   10.42.0.31   raspberrypi   <none>           <none>
nginx-2-867f4f8859-csn48     1/1     Running   0          134m   10.42.0.32   raspberrypi   <none>           <none>

$ kubectl get svc -o wide
NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE    SELECTOR
nginx-service    ClusterIP   10.43.155.201   <none>        80/TCP                       136m   app=nginx
nginx-service2   ClusterIP   10.43.182.138   <none>        85/TCP                       134m   app=nginx-2

I cannot curl http://nginx-service2:85 from inside the nginx container, or vice versa, although the same request works from my Docker Desktop installation:

# docker desktop
root@nginx-7dc45fbd74-7prml:/# curl http://nginx-service2:85
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

# k3s
root@nginx-744f4df6df-rxhph:/# curl http://nginx-service2.pwk3spi-vraptor:85
curl: (6) Could not resolve host: nginx-service2.pwk3spi-vraptor

After googling the problem (correct me if I'm wrong), this looks like a CoreDNS issue, because the logs show timeout errors:

$ kubectl get pods -n kube-system
NAME                                     READY   STATUS      RESTARTS   AGE
helm-install-traefik-qr2bd               0/1     Completed   0          153d
metrics-server-7566d596c8-nnzg2          1/1     Running     59         148d
svclb-traefik-kjbbr                      2/2     Running     60         153d
traefik-758cd5fc85-wzjrn                 1/1     Running     20         62d
local-path-provisioner-6d59f47c7-4hvf2   1/1     Running     72         148d
coredns-7944c66d8d-gkdp4                 1/1     Running     0          3m47s

$ kubectl logs coredns-7944c66d8d-gkdp4 -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = 1c648f07b77ab1530deca4234afe0d03
CoreDNS-1.6.9
linux/arm, go1.14.1, 1766568
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:50482->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:34160->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:53485->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:46642->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:55329->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:44471->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:49182->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:54082->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:48151->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:48599->192.168.8.109:53: i/o timeout
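To see at a glance which upstream resolver the timeouts are hitting, the log can be summarized with a small shell helper. This is only a sketch: the function name coredns_timeout_upstreams is made up for illustration, and it assumes the log lines have the same "read udp SRC->DST: i/o timeout" shape as the ones above.

```shell
# Summarize which upstream resolvers CoreDNS timed out against.
# Reads CoreDNS log lines on stdin and prints "<count> <upstream ip:port>".
# Assumes lines shaped like: "... read udp SRC->DST: i/o timeout"
coredns_timeout_upstreams() {
  sed -n 's|.*->\([^ ]*\): i/o timeout.*|\1|p' \
    | sort \
    | uniq -c \
    | awk '{print $1, $2}'
}

# Usage against the pod from this question:
#   kubectl logs coredns-7944c66d8d-gkdp4 -n kube-system | coredns_timeout_upstreams
```

In the log above every timeout points at 192.168.8.109:53, which is the node's own IP, so CoreDNS is failing to reach the resolver it forwards to rather than failing on cluster-internal lookups alone.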

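Suggestions I found include changing the forward line in the CoreDNS Corefile:

... other CoreFile stuff
forward . host server IP
... other CoreFile stuff

or adding nameserver entries like these (resolv.conf format):

search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.42.0.38
nameserver 192.168.8.1
nameserver fe80::266:19ff:fea7:85e7%wlan0

However, none of these solutions worked for me.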

Details for reference:

$ kubectl get nodes -o wide
NAME          STATUS   ROLES    AGE    VERSION        INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION   CONTAINER-RUNTIME
raspberrypi   Ready    master   153d   v1.18.9+k3s1   192.168.8.109   <none>        Raspbian GNU/Linux 10 (buster)   5.10.9-v7l+      containerd://1.3.3-k3s2

$ kubectl get svc -n kube-system -o wide
NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE    SELECTOR
kube-dns             ClusterIP      10.43.0.10      <none>          53/UDP,53/TCP,9153/TCP       153d   k8s-app=kube-dns
metrics-server       ClusterIP      10.43.205.8     <none>          443/TCP                      153d   k8s-app=metrics-server
traefik-prometheus   ClusterIP      10.43.222.138   <none>          9100/TCP                     153d   app=traefik,release=traefik
traefik              LoadBalancer   10.43.249.133   192.168.8.109   80:31222/TCP,443:32509/TCP   153d   app=traefik,release=traefik

$ kubectl get ep kube-dns -n kube-system
NAME       ENDPOINTS                                     AGE
kube-dns   10.42.0.38:53,10.42.0.38:9153,10.42.0.38:53   153d

I don't know where I'm going wrong, whether I'm focusing on the wrong thing, or how to proceed. Any help would be appreciated.

Is there a reason you are trying to curl this address:

curl http://nginx-service2.pwk3spi-vraptor:85

Shouldn't that just be:

curl http://nginx-service2:85
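For context, a Service in Kubernetes is normally reachable under several DNS names, from the bare service name (same namespace only) up to the fully qualified one. The helper below just prints those candidates; service_dns_names is a hypothetical name, and the namespace "default" and cluster domain "cluster.local" in the usage line are assumptions, not values confirmed by the question.

```shell
# Print the DNS names under which a Service is normally reachable,
# from shortest to fully qualified.
#   $1 = service name
#   $2 = namespace
#   $3 = cluster domain (optional, defaults to cluster.local)
service_dns_names() {
  svc=$1
  ns=$2
  domain=${3:-cluster.local}
  printf '%s\n' \
    "$svc" \
    "$svc.$ns" \
    "$svc.$ns.svc" \
    "$svc.$ns.svc.$domain"
}

# Usage (namespace "default" is an assumption):
#   service_dns_names nginx-service2 default
```

If even the fully qualified name fails to resolve from inside a pod, the problem is more likely in cluster DNS or node networking than in the name being used.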

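When all else fails... go back to the manual. I was trying to find the 'issue' in all the wrong places, when all I needed to do was follow Rancher's k3s installation documentation (sigh).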

Rancher's documentation is very good (you just need to actually follow it), and it has specific instructions for installing k3s in a Raspbian Buster environment.

Check the version:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Raspbian
Description:    Raspbian GNU/Linux 10 (buster)
Release:        10
Codename:       buster

You need to switch to legacy iptables, which the documentation says to do by running:

sudo iptables -F
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
sudo reboot

Note: when changing the iptables settings, do it directly on the Pi, not over SSH, or you will be kicked out.

After doing this, all my services were happy and could curl each other from inside the containers via their defined ClusterIP service names.
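To confirm which backend is active before or after the switch, the output of iptables --version can be checked: iptables 1.8 and later append "(legacy)" or "(nf_tables)" to the version string. A small sketch, where iptables_backend is a made-up helper name and older iptables builds that print no suffix are reported as "unknown":

```shell
# Classify the iptables backend from the output of `iptables --version`.
# Prints "legacy", "nf_tables", or "unknown" (no recognizable suffix).
iptables_backend() {
  case "$1" in
    *"(legacy)"*)    echo legacy ;;
    *"(nf_tables)"*) echo nf_tables ;;
    *)               echo unknown ;;
  esac
}

# Usage on the node (assumes iptables is on PATH):
#   iptables_backend "$(sudo iptables --version)"
```

After the update-alternatives commands and the reboot, this should report legacy.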

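For those who don't want to waste 3 hours like I did running k3s on CentOS: you need to disable the firewall for these services to be able to call each other.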

https://rancher.com/docs/k3s/latest/en/advanced/#additional-preparation-for-red-hat-centos-enterprise-linux

It is recommended to turn off firewalld:

systemctl disable firewalld --now

If it is enabled, nm-cloud-setup needs to be disabled and the node rebooted:

systemctl disable nm-cloud-setup.service nm-cloud-setup.timer
reboot

After I disabled it, the services were able to call each other via the DNS names in my configuration.
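A quick way to double-check that nothing was missed is to list which of those units are still enabled or running. This is a rough sketch: units_still_on is a hypothetical helper, and the SYSTEMCTL variable is only an illustrative hook so the command can be swapped out; by default it calls the real systemctl.

```shell
# Print every unit from the argument list that is still enabled or active.
# SYSTEMCTL may be set to another command (illustrative hook, e.g. for testing).
units_still_on() {
  for unit in "$@"; do
    if ${SYSTEMCTL:-systemctl} is-enabled --quiet "$unit" 2>/dev/null ||
       ${SYSTEMCTL:-systemctl} is-active --quiet "$unit" 2>/dev/null; then
      echo "$unit"
    fi
  done
}

# Usage on the CentOS node:
#   units_still_on firewalld nm-cloud-setup.service nm-cloud-setup.timer
```

No output means none of the listed units is enabled or active any more.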

I am still looking for a better way than disabling the firewall altogether, but that depends on the developers of the k3s project.