Kubernetes pod kube-dns keeps restarting
I set up a k8s cluster with a single node and found that the kube-dns pod keeps restarting:
$ kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
kube-dns-1806975333-xjbgr 2/3 CrashLoopBackOff 74 6h
or
kube-dns-1806975333-xjbgr 3/3 Running 106 9h
...
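When READY is 3/3 everything works, but the pod still restarts about 10 times per hour.
I searched around and found several answers to this problem, but none of them worked for me. The relevant files and settings on my host are shown below and look fine to me: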
$ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 10.100.0.10
nameserver 192.168.200.1
$ kubectl -n kube-system get service -o wide
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kube-dns 10.100.0.10 <none> 53/UDP,53/TCP 10h k8s-app=kube-dns
The logs show 'Maximum number of concurrent DNS queries reached':
$ kk logs kube-dns-1806975333-xjbgr -c dnsmasq
I0812 10:44:54.206829 2393 main.go:76] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I0812 10:44:54.206959 2393 nanny.go:86] Starting dnsmasq [-k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053]
I0812 10:44:54.301015 2393 nanny.go:111]
W0812 10:44:54.301050 2393 nanny.go:112] Got EOF from stdout
I0812 10:44:54.301027 2393 nanny.go:108] dnsmasq[2412]: started, version 2.76 cachesize 1000
I0812 10:44:54.301071 2393 nanny.go:108] dnsmasq[2412]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
I0812 10:44:54.301088 2393 nanny.go:108] dnsmasq[2412]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0812 10:44:54.301093 2393 nanny.go:108] dnsmasq[2412]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0812 10:44:54.301096 2393 nanny.go:108] dnsmasq[2412]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0812 10:44:54.301100 2393 nanny.go:108] dnsmasq[2412]: reading /etc/resolv.conf
I0812 10:44:54.301103 2393 nanny.go:108] dnsmasq[2412]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0812 10:44:54.301120 2393 nanny.go:108] dnsmasq[2412]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0812 10:44:54.301123 2393 nanny.go:108] dnsmasq[2412]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0812 10:44:54.301127 2393 nanny.go:108] dnsmasq[2412]: using nameserver 10.100.0.10#53
I0812 10:44:54.301134 2393 nanny.go:108] dnsmasq[2412]: using nameserver 192.168.200.1#53
I0812 10:44:54.301138 2393 nanny.go:108] dnsmasq[2412]: read /etc/hosts - 7 addresses
I0812 10:44:55.207448 2393 nanny.go:108] dnsmasq[2412]: Maximum number of concurrent DNS queries reached (max: 150)
I0812 10:45:05.227722 2393 nanny.go:108] dnsmasq[2412]: Maximum number of concurrent DNS queries reached (max: 150)
I0812 10:45:15.243378 2393 nanny.go:108] dnsmasq[2412]: Maximum number of concurrent DNS queries reached (max: 150)
I0812 10:45:25.259829 2393 nanny.go:108] dnsmasq[2412]: Maximum number of concurrent DNS queries reached (max: 150)
I0812 10:45:35.272106 2393 nanny.go:108] dnsmasq[2412]: Maximum number of concurrent DNS queries reached (max: 150)
I0812 10:45:45.293486 2393 nanny.go:108] dnsmasq[2412]: Maximum number of concurrent DNS queries reached (max: 150)
I0812 10:45:55.316141 2393 nanny.go:108] dnsmasq[2412]: Maximum number of concurrent DNS queries reached (max: 150)
I0812 10:46:05.336765 2393 nanny.go:108] dnsmasq[2412]: Maximum number of concurrent DNS queries reached (max: 150)
My environment:
$ uname -a
Linux cloudland-master-1 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3", GitCommit:"2c2fe6e8278a5db2d15a013987b53968c743f2a1", GitTreeState:"clean", BuildDate:"2017-08-03T07:00:21Z",GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3", GitCommit:"2c2fe6e8278a5db2d15a013987b53968c743f2a1", GitTreeState:"clean", BuildDate:"2017-08-03T06:43:48Z",GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Please help me out.
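It turned out that the DNS server IP originally configured on the node was not actually providing DNS service. After I changed it to a working one, the symptom disappeared. It seems that dnsmasq tries to resolve external domain names through that IP, fails, and is then killed. There was no log entry pointing to this; I found it only by accident. If you know the reason behind it, please comment.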
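One more thing that may be worth checking: in the output above, 10.100.0.10 appears both as the kube-dns CLUSTER-IP and as a nameserver in the node's /etc/resolv.conf, and the dnsmasq log lists it as an upstream ("using nameserver 10.100.0.10#53"). That may make dnsmasq forward queries back to the cluster DNS service itself, though I have not confirmed this is what happened here. Below is a minimal sketch for spotting that situation, assuming a POSIX shell with awk and grep; the function names are made up for illustration.

```shell
#!/bin/sh

# Print the address of every "nameserver" entry in a resolv.conf-style file.
# $1: path to the file (for example /etc/resolv.conf)
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Print the nameserver entry that is exactly equal to the given IP, if any.
# Returns non-zero when the IP is not listed.
# $1: path to the file, $2: ClusterIP of the kube-dns service
find_self_reference() {
    list_nameservers "$1" | grep -Fx -- "$2"
}
```

With the values from this question, `find_self_reference /etc/resolv.conf 10.100.0.10` would print 10.100.0.10. Each address printed by `list_nameservers` can then be probed by hand, for example with `dig @192.168.200.1 example.com +time=2 +tries=1`, to see whether it answers DNS queries at all.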