Recover kops Kubernetes cluster

I have a Kubernetes cluster that was created by kops. When I run kops validate, this is the output:

    INSTANCE GROUPS
    NAME                ROLE    MACHINETYPE  MIN  MAX  SUBNETS
    master-us-east-1a   Master  m4.xlarge    1    1    us-east-1a
    nodes               Node    c4.2xlarge   1    75   us-east-1a

    NODE STATUS
    NAME                          ROLE    READY
    ip-172-20-59-93.ec2.internal  master  False

    VALIDATION ERRORS
    KIND     NAME                          MESSAGE
    Machine  i-0a44bbdd18c86e846           machine "i-0a44bbdd18c86e846" has not yet joined cluster
    Machine  i-0d3302056f3dfeef0           machine "i-0d3302056f3dfeef0" has not yet joined cluster
    Machine  i-0d6199876b91962f4           machine "i-0d6199876b91962f4" has not yet joined cluster
    Node     ip-172-20-59-93.ec2.internal  master "ip-172-20-59-93.ec2.internal" is not ready

    Validation Failed

How can I recover this cluster? The S3 files for this cluster are still available.

The etcd volumes show the status "in-use".

kops stores the state of the cluster in S3.

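  1. Find the bucket where the state is stored.
  2. Set export KOPS_STATE_STORE=s3://your-k8s-state-store
  3. Run kops update cluster (add --yes to apply the changes; without it kops only prints a preview).
  4. If that fails, continue with steps 5 and 6.
  5. Terminate all instances related to the cluster.
  6. Run kops create cluster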
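
As a rough illustration, steps 2 and 3 could be scripted as follows. This is only a sketch under assumptions: the cluster name `mycluster.example.com` is a hypothetical placeholder, the bucket is the placeholder from step 2, and the `run` helper, the `DRY_RUN` switch and the final `kops validate cluster` check are additions for illustration rather than part of the steps above. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of steps 2 and 3. Bucket and cluster name are placeholders
# (assumptions), not values taken from the question.
KOPS_STATE_STORE="${KOPS_STATE_STORE:-s3://your-k8s-state-store}"
CLUSTER_NAME="${CLUSTER_NAME:-mycluster.example.com}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them
export KOPS_STATE_STORE

# Print a command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

recover_cluster() {
    # Step 3: reconcile the cloud resources with the spec stored in S3.
    # Without --yes, kops would only preview the changes.
    run kops update cluster --name "$CLUSTER_NAME" --yes || return 1
    # Extra check (not in the original steps): have the master and nodes joined?
    run kops validate cluster --name "$CLUSTER_NAME"
}

recover_cluster
```

Running it with `DRY_RUN=0` executes the commands for real. If the update step fails, the function stops, and the manual fallback in steps 5 and 6 applies.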

Note that the internal state of your cluster is not in S3 but in etcd. The answer here has more details on this topic and on how to backup/restore etcd: How to restore kubernetes cluster using kops?