coredns is running but not ready after conjure-up k8s cdk
I have deployed Kubernetes v1.18.2 (CDK) using conjure-up (on Bionic).
Update: I completely destroyed the environment above and redeployed it manually with the CDK bundle from https://jaas.ai/canonical-kubernetes, same K8s version, same OS version (Ubuntu 18.04): no difference.
coredns is resolving via /etc/resolv.conf; see the configmap below:
Name: coredns
Namespace: kube-system
Labels: cdk-addons=true
Annotations:
Data
====
Corefile:
----
.:53 {
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
Events: <none>
There is a known issue, described at https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#known-issues, about /etc/resolv.conf being used instead of /run/systemd/resolve/resolv.conf.
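The known issue matters because on systemd-resolved hosts /etc/resolv.conf typically points at the local stub resolver 127.0.0.53, which makes CoreDNS's `forward . /etc/resolv.conf` loop back to itself. A minimal sketch to inspect which upstream nameservers that file would actually hand to CoreDNS (pass whatever resolvConf path the kubelet is using):

```python
def nameservers(path="/etc/resolv.conf"):
    """Return the nameserver entries from a resolv.conf-style file."""
    servers = []
    with open(path) as fh:
        for line in fh:
            parts = line.split()
            # resolv.conf lines look like: "nameserver 127.0.0.53"
            if len(parts) >= 2 and parts[0] == "nameserver":
                servers.append(parts[1])
    return servers

# e.g. nameservers("/run/systemd/resolve/resolv.conf") should list the real
# upstream servers, while nameservers("/etc/resolv.conf") may show 127.0.0.53.
```

If the list contains only 127.0.0.53, CoreDNS is forwarding to the systemd-resolved stub, which is exactly the situation the linked known-issues section describes.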
I edited the coredns configmap to point it at /run/systemd/resolve/resolv.conf, but the setting was reverted.
I also tried setting kubelet-extra-config to {resolvConf: /run/systemd/resolve/resolv.conf} and rebooted the server; no change:
kubelet-extra-config:
  default: '{}'
  description: |
    Extra configuration to be passed to kubelet. Any values specified in this
    config will be merged into a KubeletConfiguration file that is passed to
    the kubelet service via the --config flag. This can be used to override
    values provided by the charm.
    Requires Kubernetes 1.10+.
    The value for this config must be a YAML mapping that can be safely
    merged with a KubeletConfiguration file. For example:
      {evictionHard: {memory.available: 200Mi}}
    For more information about KubeletConfiguration, see upstream docs:
    https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/
  source: user
  type: string
  value: '{resolvConf: /run/systemd/resolve/resolv.conf}'
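The charm description above says the mapping is merged into a KubeletConfiguration file. As an illustration only (a minimal sketch of recursive-merge semantics, not the charm's actual merge code, and the base config values are placeholders), merging `{resolvConf: ...}` into a kubelet config looks like:

```python
def deep_merge(base, extra):
    """Recursively merge `extra` into a copy of `base`; `extra` wins on conflicts."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# A hypothetical fragment of the charm-rendered KubeletConfiguration:
kubelet_config = {
    "kind": "KubeletConfiguration",
    "resolvConf": "/etc/resolv.conf",
    "evictionHard": {"memory.available": "100Mi"},
}

# The value given to kubelet-extra-config:
extra = {"resolvConf": "/run/systemd/resolve/resolv.conf"}

merged = deep_merge(kubelet_config, extra)
# merged["resolvConf"] is now "/run/systemd/resolve/resolv.conf",
# while unrelated keys such as evictionHard are preserved.
```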
However, when checking the configuration as described at https://kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/, I can see the change in the kubelet configuration:
...
"resolvConf": "/run/systemd/resolve/resolv.conf",
...
This is the error I get in the coredns pod:
E0429 09:16:42.172959 1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.2/tools/cache/reflector.go:105: Failed to list *v1.Endpoints: Get https://10.152.183.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.152.183.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"
Looking at the kubernetes service:
default kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 4h42m <none>
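The `dial tcp 10.152.183.1:443: i/o timeout` above means the coredns pod cannot even open a TCP connection to the API server's ClusterIP, so this is a connectivity problem rather than a DNS one. A small sketch for probing raw TCP reachability (it must be run from the pod's network namespace to be meaningful; the IP is the ClusterIP from the output above):

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# From inside the affected pod's netns: can_connect("10.152.183.1", 443)
# False here, combined with the reflector error above, points at the overlay
# network rather than at CoreDNS itself.
```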
This is the coredns deployment:
Name: coredns
Namespace: kube-system
CreationTimestamp: Wed, 29 Apr 2020 09:15:07 +0000
Labels: cdk-addons=true
cdk-restart-on-ca-change=true
k8s-app=kube-dns
kubernetes.io/name=CoreDNS
Annotations: deployment.kubernetes.io/revision: 1
Selector: k8s-app=kube-dns
Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 1 max unavailable, 25% max surge
Pod Template:
Labels: k8s-app=kube-dns
Service Account: coredns
Containers:
coredns:
Image: rocks.canonical.com:443/cdk/coredns/coredns-amd64:1.6.7
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
Priority Class Name: system-cluster-critical
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing False ProgressDeadlineExceeded
OldReplicaSets: <none>
NewReplicaSet: coredns-6b59b8bd9f (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 11m deployment-controller Scaled up replica set coredns-6b59b8bd9f to 1
Can anyone help?
More info: the K8s service is configured correctly:
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP: 10.152.183.1
Port: https 443/TCP
TargetPort: 6443/TCP
Endpoints: xx.xx.xx.xx:6443,xx.xx.xx.yy:6443
Session Affinity: None
Events: <none>
I can curl both IP addresses using --insecure.
Describing the endpoints:
kubectl describe ep kubernetes
Name: kubernetes
Namespace: default
Labels: <none>
Annotations: <none>
Subsets:
Addresses: xx.xx.xx.xx,xx.xx.xx.yy
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
https 6443 TCP
Events: <none>
Some more findings: it looks like most of the vnet interfaces created by juju during the CDK deployment are not running. I suspect this is because of apparmor (based on https://jaas.ai/canonical-kubernetes/bundle/21):
Note: If you desire to deploy this bundle locally on your laptop, see the
segment about Conjure-Up under Alternate Deployment Methods. Default deployment
via juju will not properly adjust the apparmor profile to support running
kubernetes in LXD. At this time, it is a necessary intermediate deployment
mechanism.
7: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:f0:0c:29 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fef0:c29/64 scope link
valid_lft forever preferred_lft forever
70: vnet12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:00:a3:94 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fe00:a394/64 scope link
valid_lft forever preferred_lft forever
72: vnet13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:15:17:f4 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fe15:17f4/64 scope link
valid_lft forever preferred_lft forever
74: vnet14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:ec:5c:72 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:feec:5c72/64 scope link
valid_lft forever preferred_lft forever
76: vnet15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:60:79:18 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fe60:7918/64 scope link
valid_lft forever preferred_lft forever
79: vnet16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:67:ff:14 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fe67:ff14/64 scope link
valid_lft forever preferred_lft forever
81: vnet17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:96:71:01 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fe96:7101/64 scope link
valid_lft forever preferred_lft forever
83: vnet18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:a8:1d:b7 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fea8:1db7/64 scope link
valid_lft forever preferred_lft forever
85: vnet19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:2a:89:c1 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fe2a:89c1/64 scope link
valid_lft forever preferred_lft forever
87: vnet20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:4e:ce:fb brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fe4e:cefb/64 scope link
valid_lft forever preferred_lft forever
89: vnet21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:93:55:ac brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:fe93:55ac/64 scope link
valid_lft forever preferred_lft forever
90: vnet22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br1 state UNKNOWN group default qlen 1000
link/ether fe:54:00:b7:ae:b2 brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:feb7:aeb2/64 scope link
valid_lft forever preferred_lft forever
Another update:
I tried a xenial deployment and noticed that /etc/resolv.conf was configured correctly there without any of the issues above, but the problem persisted.
It turned out that flannel was conflicting with my local network. Specifying the following in juju's bundle.yaml before deploying:
applications:
  flannel:
    options:
      cidr: 10.2.0.0/16
solved the problem once and for all! :)
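This kind of conflict can be spotted ahead of time by checking whether the pod network CIDR overlaps any local network. A minimal sketch using Python's standard ipaddress module (the 10.1.0.0/16 default and the LAN range below are assumptions; substitute your own networks):

```python
import ipaddress


def overlaps(cidr_a, cidr_b):
    """Return True if two IPv4/IPv6 networks share any addresses."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))

# Assuming flannel defaulted to 10.1.0.0/16 and the local LAN sat inside it:
print(overlaps("10.1.0.0/16", "10.1.20.0/24"))  # True  -> conflict
# The CIDR chosen in bundle.yaml clears the collision:
print(overlaps("10.2.0.0/16", "10.1.20.0/24"))  # False -> safe
```

Running this against every local route before picking the flannel cidr option would have flagged the clash before deployment.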