K3S cluster is pending in Rancher dashboard

I have installed a 3-node cluster with K3S. kubectl detects the nodes correctly, and I am able to deploy images.

$ k3s kubectl get nodes
NAME                   STATUS   ROLES                       AGE     VERSION
master                 Ready    control-plane,etcd,master   4h31m   v1.22.2+k3s1
worker-01              Ready    <none>                      3h59m   v1.22.2+k3s1
worker-02              Ready    <none>                      4h3m    v1.22.2+k3s1

I have also installed the latest Rancher version (2.6.0) via docker-compose:

version: '2'
services:
  rancher:
    image: rancher/rancher:latest
    restart: always
    ports:
    - "8080:80/tcp"
    - "4443:443/tcp"
    volumes:
    - "rancher-data:/var/lib/rancher"
    privileged: true
volumes:
  rancher-data:

The dashboard is reachable from every node, and I imported the existing cluster by running the following command:

curl --insecure -sfL https://192.168.1.100:4443/v3/import/66txfzmv4fnw6bqj99lpmdt6jlx4rpwblzhx96wvljc8gczphcn2c2_c-m-nz826pgl.yaml | kubectl apply -f -

The cluster shows as Active, but with 0 nodes and the message:

[Pending] waiting for full cluster configuration

The full YAML status is here:

apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  annotations:
    field.cattle.io/creatorId: user-5bk6w
  creationTimestamp: "2021-10-05T10:06:35Z"
  finalizers:
  - wrangler.cattle.io/provisioning-cluster-remove
  generation: 1
  managedFields:
  - apiVersion: provisioning.cattle.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"wrangler.cattle.io/provisioning-cluster-remove": {}
      f:spec: {}
      f:status:
        .: {}
        f:clientSecretName: {}
        f:clusterName: {}
        f:conditions: {}
        f:observedGeneration: {}
        f:ready: {}
    manager: rancher
    operation: Update
    time: "2021-10-05T10:08:30Z"
  name: ofb
  namespace: fleet-default
  resourceVersion: "73357"
  uid: 1d03f05e-77b7-4361-947d-2ef5b50928f5
spec: {}
status:
  clientSecretName: ofb-kubeconfig
  clusterName: c-m-nz826pgl
  conditions:
  - lastUpdateTime: "2021-10-05T10:08:30Z"
    status: "False"
    type: Reconciling
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "False"
    type: Stalled
  - lastUpdateTime: "2021-10-05T14:08:52Z"
    status: "True"
    type: Created
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: RKECluster
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: BackingNamespaceCreated
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: DefaultProjectCreated
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: SystemProjectCreated
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: InitialRolesPopulated
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: CreatorMadeOwner
  - lastUpdateTime: "2021-10-05T10:08:15Z"
    status: "True"
    type: Pending
  - lastUpdateTime: "2021-10-05T10:08:15Z"
    message: waiting for full cluster configuration
    reason: Pending
    status: "True"
    type: Provisioned
  - lastUpdateTime: "2021-10-05T14:08:52Z"
    message: Waiting for API to be available
    status: "True"
    type: Waiting
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: NoDiskPressure
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: NoMemoryPressure
  - lastUpdateTime: "2021-10-05T10:06:39Z"
    status: "False"
    type: Connected
  - lastUpdateTime: "2021-10-05T14:04:52Z"
    status: "True"
    type: Ready
  observedGeneration: 1
  ready: true
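
Two entries in the status above stand out: Connected is "False", and Waiting carries the message "Waiting for API to be available". As a rough sketch, the conditions can be printed one per line with kubectl's jsonpath output and filtered down to those two. The function names are hypothetical, and the command assumes kubectl is pointed at the cluster Rancher itself runs on, not at the imported K3S cluster.

```shell
# Print every condition of the provisioning Cluster as TYPE=STATUS, one per line.
# "ofb" and "fleet-default" are the name and namespace from the status above.
list_conditions() {
  kubectl -n fleet-default get clusters.provisioning.cattle.io ofb \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
}

# Read TYPE=STATUS lines on stdin and keep only the two values that hint
# at a connection problem.
suspicious_conditions() {
  grep -xF -e 'Connected=False' -e 'Waiting=True'
}

# Usage: list_conditions | suspicious_conditions
```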

The cluster agent does not show any particular problem:

$ kubectl -n cattle-system logs -l app=cattle-cluster-agent
time="2021-10-05T13:54:30Z" level=info msg="Connecting to wss://192.168.1.100:4443/v3/connect with token starting with 66txfzmv4fnw6bqj99lpmdt6jlx"
time="2021-10-05T13:54:30Z" level=info msg="Connecting to proxy" url="wss://192.168.1.100:4443/v3/connect"

Is there anything I need to do to get the cluster fully running? I tried downgrading Rancher to 2.5.0, but I ran into the same problem.

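This problem usually occurs because the cattle-cluster-agent cannot connect to the configured server-url. Also note that from Rancher 2.5 onwards, cattle-node-agents are only present in clusters created in Rancher with RKE. The following link points to the official Rancher documentation related to your problem; follow the instructions there to resolve it: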

Rancher Registered clusters
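
A minimal sketch for checking that Rancher is reachable from a node, assuming the Rancher server answers on a /ping path (an assumption worth verifying for your version). The URL is the one the agent logs in the question; the helper names are hypothetical.

```shell
# Derive the HTTPS ping URL from the wss:// URL that the agent logs.
ping_url() {
  printf '%s\n' "$1" | sed -E 's#^wss://([^/]+)/.*$#https://\1/ping#'
}

# Probe Rancher from the machine this runs on. --insecure mirrors the
# import command in the question (self-signed certificate).
check_rancher() {
  curl --insecure --silent --show-error --max-time 10 "$(ping_url "$1")"
}

# Usage, on each node:
#   check_rancher wss://192.168.1.100:4443/v3/connect
```

A timeout or a connection error here points at the network path or the server-url rather than at the cluster itself. A successful reply only shows that the node can reach Rancher; pods may still resolve or route differently.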

I think this is an incompatibility with Kubernetes v1.22.

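I ran into the same problem with Rancher v2.6.0 when importing a new v1.22.2 cluster (running on IBM Cloud VPC infrastructure). I tailed the logs of the Docker container running Rancher and observed: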

2021/10/20 14:40:31 [INFO] Starting cluster controllers for c-m-cs78tnxc
E1020 14:40:31.346373      33 reflector.go:139] pkg/mod/github.com/rancher/client-go@v0.21.0-rancher.1/tools/cache/reflector.go:168: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource (get ingresses.meta.k8s.io)

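Kubernetes v1.22 no longer serves the v1beta1 Ingress APIs that this Rancher release still watches (NGINX-Ingress moved to v1.x for the same reason), which appears to be the cause. There is an open issue on the Rancher GitHub to update it for compatibility with Kubernetes v1.22.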
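
The reflector error shows Rancher trying to list *v1beta1.Ingress. One way to see whether a given cluster still serves that API version is to inspect the output of kubectl api-versions. The sketch below assumes the relevant group versions are networking.k8s.io/v1beta1 and extensions/v1beta1; the function name is hypothetical.

```shell
# Reads `kubectl api-versions` output on stdin and succeeds only if a
# v1beta1 group version that used to carry Ingress is still listed.
serves_v1beta1_ingress() {
  grep -qxF -e 'networking.k8s.io/v1beta1' -e 'extensions/v1beta1'
}

# Usage, against the cluster being imported:
#   kubectl api-versions | serves_v1beta1_ingress \
#     && echo "v1beta1 Ingress still served" \
#     || echo "v1beta1 Ingress not served"
```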

In the meantime, after recreating the cluster on the same infrastructure with Kubernetes v1.21.5, I was able to import it into Rancher successfully.
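
For a K3S cluster like the one in the question, the same workaround would mean installing a pre-1.22 release. This is a sketch under assumptions: the K3S install script reads INSTALL_K3S_VERSION, the tag v1.21.5+k3s1 is taken to exist (check the K3S releases page), and both function names are hypothetical.

```shell
# Succeeds when a version such as v1.21.5+k3s1 has a minor version below 22.
# Anything that does not look like v1.<minor>.<patch> is rejected.
is_pre_1_22() {
  minor=$(printf '%s\n' "$1" | sed -E 's/^v?1\.([0-9]+)\..*$/\1/')
  [ "$minor" -lt 22 ] 2>/dev/null
}

# Install K3S at a pinned release; refuses versions from v1.22 onwards.
install_k3s_pinned() {
  is_pre_1_22 "$1" || { echo "refusing $1: not older than v1.22" >&2; return 1; }
  curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="$1" sh -
}

# Usage: install_k3s_pinned v1.21.5+k3s1
```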