calico-node fails to start on worker node

I'm trying to install Kubernetes.

Configuration details:

Controller

Worker

I installed the controller and the worker using my fork of coreos-kubernetes (https://github.com/kfirufk/coreos-kubernetes).

kubectl get nodes returns:

NAME          STATUS    AGE
192.168.1.2   Ready     4h
192.168.1.3   Ready     4h

kubectl get pods --all-namespaces returns:

NAMESPACE       NAME                                       READY     STATUS      RESTARTS   AGE
ceph            ceph-mds-2743106415-rkww4                  0/1       Pending     0          4h
ceph            ceph-mon-check-3856521781-bd6k5            1/1       Running     0          4h
kube-lego       kube-lego-3323932148-g2tf4                 1/1       Running     0          4h
kube-system     calico-node-xq6j7                          2/2       Running     0          4h
kube-system     calico-node-xzpp2                          0/2       Completed   488        4h
kube-system     calico-policy-controller-610849172-b7xjr   1/1       Running     0          4h
kube-system     heapster-v1.3.0-beta.0-2754576759-v1f50    2/2       Running     0          3h
kube-system     kube-apiserver-192.168.1.2                 1/1       Running     0          4h
kube-system     kube-controller-manager-192.168.1.2        1/1       Running     1          4h
kube-system     kube-dns-3675956729-r7hhf                  3/4       Running     784        4h
kube-system     kube-dns-autoscaler-505723555-l2pph        1/1       Running     973        4h
kube-system     kube-proxy-192.168.1.2                     1/1       Running     0          4h
kube-system     kube-proxy-192.168.1.3                     1/1       Running     0          4h
kube-system     kube-scheduler-192.168.1.2                 1/1       Running     1          4h
kube-system     kubernetes-dashboard-3697905830-vdz23      1/1       Running     262        4h
kube-system     monitoring-grafana-4013973156-m2r2v        1/1       Running     0          4h
kube-system     monitoring-influxdb-651061958-2mdtf        1/1       Running     0          4h
nginx-ingress   default-http-backend-150165654-s4z04       1/1       Running     2          4h

I noticed that two pods are not fully working:

kube-dns-3675956729-r7hhf - only 3 of its 4 containers are ready
calico-node-xzpp2 - the calico node on the worker (coreos-3.tux-in.com) keeps restarting
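For reference, this is roughly how the problem pods can be picked out of the listing above. It is only a sketch: the function name unhealthy_pods and the restart threshold are hypothetical, and it assumes the column order that kubectl get pods --all-namespaces prints here (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Print pods whose READY count is incomplete or whose RESTARTS exceed a threshold.
# Reads the table printed by `kubectl get pods --all-namespaces` on stdin.
# The optional first argument is the restart threshold (hypothetical default: 10).
unhealthy_pods() {
  awk -v max="${1:-10}" '
    NR == 1 { next }            # skip the header row
    {
      split($3, r, "/")         # READY column has the form ready/total
      if (r[1] != r[2] || $5 + 0 > max)
        print $1 "/" $2, $3, $4, "restarts=" $5
    }'
}

# Usage (needs a working kubectl context):
#   kubectl get pods --all-namespaces | unhealthy_pods 10
```

Against the listing above this also flags pods such as kube-dns-autoscaler and kubernetes-dashboard, which are Ready but have restarted hundreds of times.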

kubectl describe pod calico-node-xzpp2 --namespace=kube-system returns:

Name:           calico-node-xzpp2
Namespace:      kube-system
Node:           192.168.1.3/192.168.1.3
Start Time:     Sat, 11 Mar 2017 20:02:02 +0200
Labels:         k8s-app=calico-node
Status:         Running
IP:             192.168.1.3
Controllers:    DaemonSet/calico-node
Containers:
  calico-node:
    Container ID:       rkt://d826868f-e7f5-47af-8d5e-e5779cbc4a19:calico-node
    Image:              quay.io/calico/node:v1.1.0-rc8
    Image ID:           rkt://sha512-a03825f68ef98ab015a46de463e446c70c3ed5ccc1187a09f0cbe5d5bb594953
    Port:
    Command:
      /bin/sh
      -c
    Args:
      mount -o remount,rw /proc/sys && start_runit
    State:              Terminated
      Reason:           Completed
      Exit Code:        0
      Started:          Sun, 12 Mar 2017 00:07:01 +0200
      Finished:         Sun, 12 Mar 2017 00:07:01 +0200
    Last State:         Terminated
      Reason:           Completed
      Exit Code:        0
      Started:          Sun, 12 Mar 2017 00:06:59 +0200
      Finished:         Sun, 12 Mar 2017 00:06:59 +0200
    Ready:              False
    Restart Count:      326
    Volume Mounts:
      /calico-secrets from etcd-certs (rw)
      /lib/modules from lib-modules (rw)
      /var/run/calico from var-run-calico (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zqbdp (ro)
    Environment Variables:
      ETCD_ENDPOINTS:                   <set to the key 'etcd_endpoints' of config map 'calico-config'>
      CALICO_NETWORKING_BACKEND:        <set to the key 'calico_backend' of config map 'calico-config'>
      CALICO_DISABLE_FILE_LOGGING:      true
      NO_DEFAULT_POOLS:                 true
      FELIX_LOGSEVERITYSCREEN:          info
      ETCD_CA_CERT_FILE:                <set to the key 'etcd_ca' of config map 'calico-config'>
      ETCD_KEY_FILE:                    <set to the key 'etcd_key' of config map 'calico-config'>
      ETCD_CERT_FILE:                   <set to the key 'etcd_cert' of config map 'calico-config'>
      IP:
  install-cni:
    Container ID:       rkt://d826868f-e7f5-47af-8d5e-e5779cbc4a19:install-cni
    Image:              quay.io/calico/cni:v1.6.0-4-g76b234c
    Image ID:           rkt://sha512-9a04ebb8ecc83b261e937a2ad1a5abefd09b1573f7c5fb05aafcfda59cc7806b
    Port:
    Command:
      /bin/sh
      -c
    Args:
      export CNI_NETWORK_CONFIG=$(cat /host/cni_network_config/config.conf) && /install-cni.sh
    State:              Terminated
      Reason:           Completed
      Exit Code:        0
      Started:          Sun, 12 Mar 2017 00:07:01 +0200
      Finished:         Sun, 12 Mar 2017 00:07:01 +0200
    Last State:         Terminated
      Reason:           Completed
      Exit Code:        0
      Started:          Sun, 12 Mar 2017 00:06:59 +0200
      Finished:         Sun, 12 Mar 2017 00:06:59 +0200
    Ready:              False
    Restart Count:      326
    Volume Mounts:
      /calico-secrets from etcd-certs (rw)
      /host/cni_network_config from cni-config (rw)
      /host/etc/cni/net.d from cni-net-dir (rw)
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zqbdp (ro)
    Environment Variables:
      ETCD_ENDPOINTS:   <set to the key 'etcd_endpoints' of config map 'calico-config'>
      CNI_CONF_NAME:    10-calico.conf
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  lib-modules:
    Type:       HostPath (bare host directory volume)
    Path:       /lib/modules
  var-run-calico:
    Type:       HostPath (bare host directory volume)
    Path:       /var/run/calico
  cni-bin-dir:
    Type:       HostPath (bare host directory volume)
    Path:       /opt/cni/bin
  cni-net-dir:
    Type:       HostPath (bare host directory volume)
    Path:       /etc/kubernetes/cni/net.d
  etcd-certs:
    Type:       Secret (a volume populated by a Secret)
    SecretName: calico-etcd-secrets
  cni-config:
    Type:       ConfigMap (a volume populated by a ConfigMap)
    Name:       calico-config
  default-token-zqbdp:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-zqbdp
QoS Class:      BestEffort
Tolerations:    CriticalAddonsOnly=:Exists
                dedicated=master:NoSchedule
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath                   Type            Reason  Message
  ---------     --------        -----   ----                    -------------                   --------        ------  -------
  13s   13s     1       {kubelet 192.168.1.3}   spec.containers{install-cni}    Normal  Created         Created with rkt id afedb13c
  13s   13s     1       {kubelet 192.168.1.3}   spec.containers{calico-node}    Normal  Started         Started with rkt id afedb13c

These are followed by many more Created and Started events for install-cni and calico-node.

kubectl logs calico-node-xzpp2 --namespace=kube-system -c install-cni (or -c calico-node) returns empty output.

How can I investigate this further?

Thanks

I would check journalctl -xe on the host where calico-node fails to start. Since you get nothing from kubectl logs calico-node..., it sounds like the containers aren't even able to start.
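Something along these lines, run on the worker itself, is what I have in mind. Treat it as a sketch: calico_lines is just a hypothetical helper name, and I'm assuming the kubelet runs as a systemd unit called kubelet on that host, which may not match your setup.

```shell
# Keep only journal lines that mention the two containers of the failing pod.
calico_lines() {
  grep -E 'calico-node|install-cni'
}

# Run these on the worker (192.168.1.3), not through kubectl:
#   journalctl -xe                  # most recent journal entries, with explanations
#   journalctl -u kubelet --no-pager --since "1 hour ago" | calico_lines
#   rkt list                        # pods known to rkt and their current state
```

The filter only narrows the kubelet log down to the lines about this pod. If nothing relevant shows up there, the unfiltered journalctl -xe output around the restart timestamps is the next place to look.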