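Kubernetes cluster does not run after reboot
After a reboot, any kubectl command I run fails with an error:
The connection to the server x.x.x.x:6443 was refused - did you specify the right host or port?
When I check my containers with docker ps, kube-apiserver and kube-scheduler keep starting and stopping.
Why does this happen?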
root@taeil-linux:/etc/systemd/system/kubelet.service.d# cd
root@taeil-linux:~# kubectl get nodes
The connection to the server 10.0.0.152:6443 was refused - did you specify the right host or port?
root@taeil-linux:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
root@taeil-linux:~# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.15.3 232b5c793146 2 weeks ago 82.4MB
k8s.gcr.io/kube-apiserver v1.15.3 5eb2d3fc7a44 2 weeks ago 207MB
k8s.gcr.io/kube-scheduler v1.15.3 703f9c69a5d5 2 weeks ago 81.1MB
k8s.gcr.io/kube-controller-manager v1.15.3 e77c31de5547 2 weeks ago 159MB
node carbon c83f74dcf58e 3 weeks ago 895MB
kubernetesui/dashboard v2.0.0-beta1 4640949a39e6 2 months ago 64.6MB
weaveworks/weave-kube 2.5.2 f04a043bb67a 3 months ago 148MB
weaveworks/weave-npc 2.5.2 5ce48e0d813c 3 months ago 49.6MB
kubernetesui/metrics-scraper v1.0.0 44390ebe2b73 4 months ago 36.8MB
k8s.gcr.io/coredns 1.3.1 eb516548c180 7 months ago 40.3MB
k8s.gcr.io/etcd 3.3.10 2c4adeb21b4f 9 months ago 258MB
quay.io/coreos/flannel v0.10.0-amd64 f0fad859c909 19 months ago 44.6MB
k8s.gcr.io/pause 3.1 da86e6ba6ca1 20 months ago 742kB
root@taeil-linux:~# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Fri 2019-09-06 14:29:25 KST; 4min 19s ago
Docs: https://kubernetes.io/docs/home/
Main PID: 14470 (kubelet)
Tasks: 19 (limit: 4512)
CGroup: /system.slice/kubelet.service
└─14470 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1 --resolv-con
9월 06 14:33:44 taeil-linux kubelet[14470]: E0906 14:33:44.800330 14470 pod_workers.go:190] Error syncing pod 9a745ac0a776afabd0d387fd0fcb2f54 ("kube-apiserver-taeil-linux_kube-system(9a745ac0a776afabd0d387fd0fcb2f54)"), skipping: failed to "CreatePodSandbox" for "kube-apiserver-ta
9월 06 14:33:44 taeil-linux kubelet[14470]: E0906 14:33:44.897945 14470 kubelet.go:2248] node "taeil-linux" not found
9월 06 14:33:44 taeil-linux kubelet[14470]: E0906 14:33:44.916566 14470 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.0.0.152:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dtaeil-linux&limit=500&resourceVersion=0: dia
9월 06 14:33:44 taeil-linux kubelet[14470]: E0906 14:33:44.998190 14470 kubelet.go:2248] node "taeil-linux" not found
9월 06 14:33:45 taeil-linux kubelet[14470]: E0906 14:33:45.098439 14470 kubelet.go:2248] node "taeil-linux" not found
9월 06 14:33:45 taeil-linux kubelet[14470]: E0906 14:33:45.198732 14470 kubelet.go:2248] node "taeil-linux" not found
9월 06 14:33:45 taeil-linux kubelet[14470]: E0906 14:33:45.299052 14470 kubelet.go:2248] node "taeil-linux" not found
9월 06 14:33:45 taeil-linux kubelet[14470]: E0906 14:33:45.399343 14470 kubelet.go:2248] node "taeil-linux" not found
9월 06 14:33:45 taeil-linux kubelet[14470]: E0906 14:33:45.499561 14470 kubelet.go:2248] node "taeil-linux" not found
9월 06 14:33:45 taeil-linux kubelet[14470]: E0906 14:33:45.599723 14470 kubelet.go:2248] node "taeil-linux" not found
root@taeil-linux:~# systemctl status kube-apiserver
Unit kube-apiserver.service could not be found.
If I try docker logs:
Flag --insecure-port has been deprecated, This flag will be removed in a future version.
I0906 10:54:19.636649 1 server.go:560] external host was not specified, using 10.0.0.152
I0906 10:54:19.636954 1 server.go:147] Version: v1.15.3
I0906 10:54:21.753962 1 plugins.go:158] Loaded 10 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,MutatingAdmissionWebhook.
I0906 10:54:21.753988 1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.
E0906 10:54:21.754660 1 prometheus.go:55] failed to register depth metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.754701 1 prometheus.go:68] failed to register adds metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.754787 1 prometheus.go:82] failed to register latency metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.754842 1 prometheus.go:96] failed to register workDuration metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.754883 1 prometheus.go:112] failed to register unfinished metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.754918 1 prometheus.go:126] failed to register unfinished metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.754952 1 prometheus.go:152] failed to register depth metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.754986 1 prometheus.go:164] failed to register adds metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.755047 1 prometheus.go:176] failed to register latency metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.755104 1 prometheus.go:188] failed to register work_duration metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.755152 1 prometheus.go:203] failed to register unfinished_work_seconds metric admission_quota_controller: duplicate metrics collector registration attempted
E0906 10:54:21.755188 1 prometheus.go:216] failed to register longest_running_processor_microseconds metric admission_quota_controller: duplicate metrics collector registration attempted
I0906 10:54:21.755215 1 plugins.go:158] Loaded 10 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,MutatingAdmissionWebhook.
I0906 10:54:21.755226 1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.
I0906 10:54:21.757263 1 client.go:354] parsed scheme: ""
I0906 10:54:21.757280 1 client.go:354] scheme "" not registered, fallback to default scheme
I0906 10:54:21.757335 1 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{127.0.0.1:2379 0 <nil>}]
I0906 10:54:21.757402 1 asm_amd64.s:1337] balancerWrapper: got update addr from Notify: [{127.0.0.1:2379 <nil>}]
W0906 10:54:21.757666 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
I0906 10:54:22.753069 1 client.go:354] parsed scheme: ""
I0906 10:54:22.753118 1 client.go:354] scheme "" not registered, fallback to default scheme
I0906 10:54:22.753204 1 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{127.0.0.1:2379 0 <nil>}]
I0906 10:54:22.753354 1 asm_amd64.s:1337] balancerWrapper: got update addr from Notify: [{127.0.0.1:2379 <nil>}]
W0906 10:54:22.753855 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W0906 10:54:22.757983 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W0906 10:54:23.754019 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W0906 10:54:24.430000 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W0906 10:54:25.279869 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W0906 10:54:26.931974 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W0906 10:54:28.198719 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W0906 10:54:30.825660 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W0906 10:54:32.850511 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W0906 10:54:36.294749 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W0906 10:54:38.737408 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
F0906 10:54:41.757603 1 storage_decorator.go:57] Unable to create storage backend: config (&{ /registry {[https://127.0.0.1:2379] /etc/kubernetes/pki/apiserver-etcd-client.key /etc/kubernetes/pki/apiserver-etcd-client.crt /etc/kubernetes/pki/etcd/ca.crt} true 0xc00063dd40 apiextensions.k8s.io/v1beta1 <nil> 5m0s 1m0s}), err (dial tcp 127.0.0.1:2379: connect: connection refused)
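Reading the log above, the final F-prefixed line is the fatal one: the API server exits because nothing answers on 127.0.0.1:2379, the etcd client port. A minimal sketch for pulling that signal out of a long log is shown here; summarize_apiserver_log is a hypothetical helper name, not a Docker or Kubernetes command.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kubernetes or Docker): summarize a
# kube-apiserver log read from stdin.
summarize_apiserver_log() {
  local log refused fatal
  log="$(cat)"
  # Count lines reporting a refused dial to the etcd client port (2379).
  refused="$(printf '%s\n' "$log" | grep -c 'dial tcp 127.0.0.1:2379: connect: connection refused')"
  # klog prefixes each line with a severity letter; F marks a fatal exit.
  if printf '%s\n' "$log" | grep -q '^F[0-9]'; then
    fatal=yes
  else
    fatal=no
  fi
  echo "refused_dials=${refused} fatal=${fatal}"
}
```

It could be fed with something like: docker logs <container-id> 2>&1 | summarize_apiserver_log (the 2>&1 matters because docker logs replays the container's stderr on stderr). A non-zero count together with fatal=yes suggests looking at the etcd container first rather than at the API server itself.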
The answer is in @cewood's comment:
Okay, that helps to understand what your installation is likely to look
like. Regarding the other master components, these are likely running
via the kubelet, and hence there won't be any systemd units for them,
only for the kubelet itself.
With a kubeadm installation you won't see systemd services for those components.
As root:
systemctl start docker
systemctl start kubelet
Switch to a non-root user:
su - <non-root-user>
kubectl get pods
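Since the control-plane components run under the kubelet rather than as systemd units, one way to see what the kubelet is expected to start is to list its static pod manifests. The sketch below assumes the kubeadm default directory /etc/kubernetes/manifests; list_static_pods is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of each static pod manifest.
# The directory argument defaults to kubeadm's usual manifest location.
list_static_pods() {
  local dir="${1:-/etc/kubernetes/manifests}"
  local f
  for f in "$dir"/*.yaml; do
    # If the directory is empty or missing the glob stays literal; skip it.
    [ -e "$f" ] || continue
    basename "$f" .yaml
  done
}
```

On a kubeadm control-plane node, calling list_static_pods with no argument would typically print etcd, kube-apiserver, kube-controller-manager and kube-scheduler. Comparing that list with docker ps -a shows which of them have a container that exited.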
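It's been a while.
I finally worked out how to solve this problem!
If you hit this error for no apparent reason, you can fix it with:
docker rm $(docker ps -a -q)
Perhaps an error occurred while the existing Kubernetes containers were being restarted, so the newly started containers crashed.
watch docker ps
Watching the containers with watch, you can see kube-apiserver and the others shut down within one minute.
So I decided to delete all the containers listed by docker ps -a, and that fixed it!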
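The command above asks docker rm to remove every container, and Docker refuses the running ones with an error. A slightly narrower sketch, assuming only the exited containers need clearing, is shown here; remove_exited_containers is a hypothetical helper name, while --filter status=exited is a standard docker ps option.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove only containers that have already exited,
# leaving running containers untouched.
remove_exited_containers() {
  local ids
  ids="$(docker ps -a -q --filter status=exited)"
  if [ -z "$ids" ]; then
    echo "no exited containers"
    return 0
  fi
  # Word splitting of $ids is intentional: one container ID per argument.
  # shellcheck disable=SC2086
  docker rm $ids
}
```

After the stale containers are gone, the kubelet should recreate the control-plane containers from its static pod manifests, which can be observed with watch docker ps as described above.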