Kubernetes dashboard and SSL - x509: failed to load system roots and no roots provided

Hi, I have been struggling to get Kubernetes working for several days now. I have learned a lot, but I am still fighting with the dashboard: I cannot get it working on my CoreOS machine.

The message I get is: Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://146.185.XXX.XXX:443/version: x509: failed to load system roots and no roots provided

I don't know how to test whether the certificates really are the problem. I can hardly believe it, because I can use curl successfully on my worker machine. On the other hand, I wonder: how does the dashboard know which certificates to use?

I have done my best to provide the right information; if you need more, I will add it to this ticket.

Everything seems fine except the dashboard:

core@amanda ~ $ ./bin/kubectl get pods --namespace=kube-system     
NAME                                      READY     STATUS             RESTARTS   AGE
kube-apiserver-146.185.XXX.XXX            1/1       Running            0          3h
kube-controller-manager-146.185.XXX.XXX   1/1       Running            0          3h
kube-dns-v11-nb4aa                        4/4       Running            0          1h
kube-proxy-146.185.YYY.YYY                1/1       Running            0          1h
kube-proxy-146.185.XXX.XXX                1/1       Running            0          3h
kube-scheduler-146.185.XXX.XXX            1/1       Running            0          3h
kubernetes-dashboard-2597139800-hg5ik     0/1       CrashLoopBackOff   21         1h

Logs of the kubernetes-dashboard container:

core@amanda ~ $ ./bin/kubectl logs kubernetes-dashboard-2597139800-hg5ik  --namespace=kube-system
Starting HTTP server on port 9090
Creating API server client for https://146.185.XXX.XXX:443
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://146.185.XXX.XXX:443/version: x509: failed to load system roots and no roots provided

A curl call using the certificates succeeds:

core@amanda ~ $ curl -v --cert /etc/kubernetes/ssl/worker.pem --key /etc/kubernetes/ssl/worker-key.pem --cacert /etc/ssl/certs/ca.pem https://146.185.XXX.XXX:443/version
*   Trying 146.185.XXX.XXX...
* Connected to 146.185.XXX.XXX (146.185.XXX.XXX) port 443 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca.pem
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS handshake, CERT verify (15):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
...
> GET /version HTTP/1.1
> Host: 146.185.XXX.XXX
> User-Agent: curl/7.47.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json
< Date: Sun, 24 Jul 2016 11:19:38 GMT
< Content-Length: 269
< 
{
  "major": "1",
  "minor": "3",
  "gitVersion": "v1.3.2+coreos.0",
  "gitCommit": "52a0d5141b1c1e7449189bb0be3374d610eb98e0",
  "gitTreeState": "clean",
  "buildDate": "2016-07-19T17:45:13Z",
  "goVersion": "go1.6.2",
  "compiler": "gc",
  "platform": "linux/amd64"
* Connection #0 to host 146.185.XXX.XXX left intact
}

Dashboard deployment settings:

./bin/kubectl edit deployment kubernetes-dashboard --namespace=kube-system

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "3"
  creationTimestamp: 2016-07-19T22:27:24Z
  generation: 36
  labels:
    app: kubernetes-dashboard
    version: v1.1.0
  name: kubernetes-dashboard
  namespace: kube-system
  resourceVersion: "553126"
  selfLink: /apis/extensions/v1beta1/namespaces/kube-system/deployments/kubernetes-dashboard
  uid: f7793d2f-4dff-11e6-b31e-04012dd8e901
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kubernetes-dashboard
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: kubernetes-dashboard
    spec:
      containers:
      - args:
        - --apiserver-host=https://146.185.XXX.XXX:443
        image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.1.0
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 9090
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 30
        name: kubernetes-dashboard
        ports:
        - containerPort: 9090
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  observedGeneration: 36
  replicas: 1
  unavailableReplicas: 1
  updatedReplicas: 1

Dashboard service settings:

./bin/kubectl edit service kubernetes-dashboard --namespace=kube-system

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: 2016-07-19T22:27:24Z
  labels:
    app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
  resourceVersion: "408001"
  selfLink: /api/v1/namespaces/kube-system/services/kubernetes-dashboard
  uid: f7a57f1a-4dff-11e6-b31e-04012dd8e901
spec:
  clusterIP: 10.3.0.80
  ports:
  - nodePort: 30009
    port: 80
    protocol: TCP
    targetPort: 9090
  selector:
    app: kubernetes-dashboard
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

Kube-proxy settings on the worker node:

core@amanda ~ $ cat /etc/kubernetes/manifests/kube-proxy.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: kube-proxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-proxy
    image: quay.io/coreos/hyperkube:v1.3.2_coreos.0
    command:
    - /hyperkube
    - proxy
    - "--master=https://146.185.XXX.XXX"
    - "--kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml"
    - "--proxy-mode=iptables"
    securityContext:
      privileged: true
    volumeMounts:
      - mountPath: /etc/ssl/certs
        name: "ssl-certs"
      - mountPath: /etc/kubernetes/worker-kubeconfig.yaml
        name: "kubeconfig"
        readOnly: true
      - mountPath: /etc/kubernetes/ssl
        name: "etc-kube-ssl"
        readOnly: true
  volumes:
    - name: "ssl-certs"
      hostPath:
        path: "/usr/share/ca-certificates"
    - name: "kubeconfig"
      hostPath:
        path: "/etc/kubernetes/worker-kubeconfig.yaml"
    - name: "etc-kube-ssl"
      hostPath:
        path: "/etc/kubernetes/ssl"

Worker kubeconfig (/etc/kubernetes/worker-kubeconfig.yaml):

core@amanda ~ $ cat /etc/kubernetes/worker-kubeconfig.yaml
apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    certificate-authority: /etc/kubernetes/ssl/ca.pem
users:
- name: kubelet
  user:
    client-certificate: /etc/kubernetes/ssl/worker.pem
    client-key: /etc/kubernetes/ssl/worker-key.pem
contexts:
- context:
    cluster: local
    user: kubelet
  name: kubelet-context
current-context: kubelet-context

You can assign a kubeconfig with token/SSL configuration to the dashboard.

Then, depending on your installation, you may need to mount the kubeconfig and the certificates:

apiVersion: v1
kind: ReplicationController
metadata:
  name: kubernetes-dashboard-v1.1.0-beta3
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    version: v1.1.0-beta3
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
        version: v1.1.0-beta3
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - name: kubernetes-dashboard
        image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.1.0
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        env:
        - name: KUBECONFIG
          value: /etc/kubernetes/kubeconfig
        ports:
        - containerPort: 9090
        volumeMounts:
        - name: "etcpki"
          mountPath: "/etc/pki"
          readOnly: true
        - name: "config"
          mountPath: "/etc/kubernetes"
          readOnly: true
        livenessProbe:
          httpGet:
            path: /
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30
      volumes:
      - name: "etcpki"
        hostPath:
          path: "/etc/pki"
      - name: "config"
        hostPath:
          path: "/etc/kubernetes"
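
For reference, the mounted /etc/kubernetes/kubeconfig can follow the same shape as the worker kubeconfig shown in the question; a minimal sketch, assuming the certificates are available under /etc/kubernetes/ssl (the user and context names here are placeholders):

```yaml
apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    certificate-authority: /etc/kubernetes/ssl/ca.pem
    server: https://146.185.XXX.XXX:443
users:
- name: dashboard
  user:
    client-certificate: /etc/kubernetes/ssl/worker.pem
    client-key: /etc/kubernetes/ssl/worker-key.pem
contexts:
- context:
    cluster: local
    user: dashboard
  name: dashboard-context
current-context: dashboard-context
```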

I ran into this problem and finally solved it with the service-account authentication strategy. I changed the dashboard's namespace to "default", so it could use the default service account that k8s creates. The troubleshooting documentation was very helpful.
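
In other words, the dashboard can rely on in-cluster configuration (the service-account token and CA bundle that Kubernetes mounts into every pod at /var/run/secrets/kubernetes.io/serviceaccount) instead of --apiserver-host; a minimal sketch of the relevant container spec, assuming the service account in the chosen namespace has the required permissions:

```yaml
containers:
- name: kubernetes-dashboard
  image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.1.0
  # No --apiserver-host arg: without it, the dashboard falls back to the
  # in-cluster service-account token and CA certificate, so it no longer
  # needs system roots to verify the apiserver.
  ports:
  - containerPort: 9090
```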