Override "kubernetes.io/hostname" on AWS EKS node using Terraform

I am trying to deploy an AWS environment using Terraform. The EC2 instances and the cluster are created fine, but the instances fail to join the cluster.

Error: error waiting for EKS Node Group (env-dev:node_1) to create: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'. last error: 1 error occurred:
* i-022a2d319d457ab83, i-0374c9efbb32b1f0f, i-05b42da747ca0c8cd, i-08439b352ff4bcc5f, i-0d286addbf2eedd2a, i-0dc6f1bd12b372427, i-0ed373f52f9e27510: NodeCreationFailure: Instances failed to join the kubernetes cluster

According to CloudWatch, this is where it fails:

"responseObject": {
    "kind": "Status",
    "apiVersion": "v1",
    "metadata": {},
    "status": "Failure",
    "message": "Node \"ip-10-206-68-167.eu-west-1.compute.internal\" is invalid: metadata.labels: Invalid value: \"ip-10-206-68-167.xxxx-xxxxxxxxxxxxxxxxxxxxx-xxxx-xxx.eu-west-1.a\": must be no more than 63 characters",
    "reason": "Invalid",
    "details": {
        "name": "ip-10-206-68-167.eu-west-1.compute.internal",
        "kind": "Node",
        "causes": [
            {
                "reason": "FieldValueInvalid",
                "message": "Invalid value: \"ip-10-206-68-167.xxxx-xxxxxxxxxxxxxxxxxxxxx-xxxx-xxx.eu-west-1.a\": must be no more than 63 characters",
                "field": "metadata.labels"
            }
        ]
    },
    "code": 422
}
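For context, Kubernetes rejects any label value longer than 63 characters, and the node's hostname (including the DHCP-derived domain suffix) ends up as the `kubernetes.io/hostname` label. A minimal Python sketch of that length check, using the values from the error above:

```python
# Kubernetes label values are limited to 63 characters (RFC 1123 label length).
MAX_LABEL_LEN = 63

def is_valid_label_length(value: str) -> bool:
    """Return True if the value fits Kubernetes' 63-character label limit."""
    return len(value) <= MAX_LABEL_LEN

# The default EC2 hostname label is fine:
print(is_valid_label_length("ip-10-206-68-167.eu-west-1.compute.internal"))  # True

# The DHCP-suffixed hostname from the error message is not:
print(is_valid_label_length(
    "ip-10-206-68-167.xxxx-xxxxxxxxxxxxxxxxxxxxx-xxxx-xxx.eu-west-1.a"
))  # False (the value from the error exceeds 63 characters)
```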

Is there any way to override the hostname using Terraform?

Edit:

Terraform snippet:

variable "db_subnet_ids" { default = ["subnet-04f6e659f2b2851f2", "subnet-0a42b2ec54b5aa143"] }

resource "aws_eks_cluster" "cluster" {
  enabled_cluster_log_types = [
    "api",
    "audit",
    "authenticator",
    "controllerManager",
    "scheduler",
  ]
  name     = "env-${var.suffix}"
  role_arn = aws_iam_role.eks_cluster_role.arn

  vpc_config {
    subnet_ids         = var.db_subnet_ids
    security_group_ids = [aws_security_group.cluster_http.id, aws_security_group.cluster_https.id]
  }

  depends_on = [
    aws_iam_role_policy_attachment.AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.AmazonEC2ContainerRegistryReadOnly,
  ]
}

resource "aws_eks_node_group" "cluster" {
  cluster_name    = aws_eks_cluster.cluster.name
  node_group_name = "node_1"
  node_role_arn   = aws_iam_role.eks_nodes_role.arn
  subnet_ids      = var.db_subnet_ids

  scaling_config {
    desired_size = 7
    max_size     = 7
    min_size     = 7
  }

  ami_type       = "AL2_x86_64"
  capacity_type  = "ON_DEMAND"
  disk_size      = 50
  instance_types = ["t3.large"]

  depends_on = [
    aws_iam_role_policy_attachment.AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.AmazonEC2ContainerRegistryReadOnly,
  ]
}

The VPC was created manually, so its configuration is as follows:

Subnet ID: subnet-04f6e659f2b2851f2
Name: xxxx-xxxxxxxxxxxxxxxxxxxxxxxx-xxxx-xxx-private-eu-west-1a
IPv4 CIDR: 10.206.68.160/28

Subnet ID: subnet-0a42b2ec54b5aa143
Name: xxxx-xxxxxxxxxxxxxxxxxxxxxxxx-xxxx-xxx-private-eu-west-1b
IPv4 CIDR: 10.206.68.176/28

And the VPC:
Name: xxxx-xxxxxxxxxxxxxxxxxxxxx-xxxx-xxx
VPC ID: vpc-09a3e88350a018cbf
IPv4 CIDR: 10.206.68.160/27
DHCP options set ID: dopt-0ea3823bed3d5ff2c

DHCP options set:
ID: dopt-0ea3823bed3d5ff2c
Name: xxxx-xxxxxxxxxxxxxxxxxxxxxxxx-xxxx-xxxx

The name was later changed to xxxx-xx-xxxx-xxxx (the 2nd word abbreviated), as explained in my answer below.

The problem has been more or less solved. I'm not sure I like the solution, but it works.

So the node hostname is created by appending:

  • the IP address
  • the name of the DHCP options set from the VPC
  • not sure where "eu-west-1" and "a" come from; maybe the VPC's region, and the "a" from the subnet's availability zone?

In the end it looks like this:

ip-10-206-68-167.xxxx-xxxxxxxxxxxxxxxxxxxxx-xxxx-xxx.eu-west-1.a

So, by shortening the name of the DHCP options set, the hostname was shortened as well, and everything works.
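For anyone who wants the DHCP options set itself under Terraform control rather than managed by hand, a hedged sketch of what that could look like (the `aws_vpc_dhcp_options` and `aws_vpc_dhcp_options_association` resources are real AWS provider resources, but the resource labels, the shortened domain value, and reusing the VPC ID above are assumptions; in my case the VPC was created manually):

```hcl
# Hypothetical sketch: a DHCP options set with a short domain-name,
# so the resulting node hostname label stays within 63 characters.
resource "aws_vpc_dhcp_options" "short_name" {
  domain_name         = "xxxx-xx-xxxx-xxxx" # shortened, as in the workaround above
  domain_name_servers = ["AmazonProvidedDNS"]

  tags = {
    Name = "xxxx-xx-xxxx-xxxx"
  }
}

resource "aws_vpc_dhcp_options_association" "this" {
  vpc_id          = "vpc-09a3e88350a018cbf" # the existing, manually created VPC
  dhcp_options_id = aws_vpc_dhcp_options.short_name.id
}
```

Note that existing instances only pick up a new domain name after their DHCP lease renews (or after a reboot), so already-failed nodes may need to be replaced.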