Chef provisioner in Terraform hangs when provisioning more than one resource

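When I provision multiple machines with Terraform and configure them with the Terraform Chef provisioner, I can only get it to work when a single resource is being provisioned in the Terraform run. When targeting only one VM, everything works perfectly. When more than one resource is being provisioned, the Chef run hangs at the "Creating configuration files..." step.

I have tried using modules, provisioning inside each resource, and most recently using null_resources to provision after the VM resources have been created. (null_resource has proven very useful, because it lets me iterate quickly on the Chef run without having to re-spin the VM resources every time, as I had to when the provisioner was inside the resource block.)

This happened on TF 0.11 and continues in v0.12: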

Terraform v0.12.8
+ provider.null v2.1.2
+ provider.vra7 v0.4.1

Provisioner inside the resource:

resource "vra7_deployment" "vra-vm" {
 ...
  resource_configuration = {
    "vSphere_Machine_1.name" = ""
    "vSphere_Machine_1.ip_address" = ""
    "vSphere_Machine_1.description" = "Terraform ICE SQL"
  }
  ...

  provisioner "chef" {
    # This is for TF to talk to the new node
    connection {
      host = self.resource_configuration["vSphere_Machine_1.ip_address"]
      type = "winrm"
      user = var.KT_USER
      password = var.KT_PASS
      insecure = true
    }

    # This is for TF to talk to the chef_server
    # Note! the version constraint doesn't work
    server_url = var.chef_server_url
    node_name  = "ICE-SQL-${self.resource_configuration["vSphere_Machine_1.name"]}"
    run_list   = var.sql_run_list
    recreate_client = true
    environment = "_default"
    ssl_verify_mode = ":verify_none"
    version = "~> 12"
    user_name  = local.username
    user_key   = file(local.user_key_path)
  }
}

Provisioner using a null_resource block:

resource "vra7_deployment" "ICE-SQL" {
  count = var.sql_count # will be 1/on or 0/off
  ...
  resource_configuration = {
    "vSphere_Machine_1.name" = ""
    "vSphere_Machine_1.ip_address" = ""
    "vSphere_Machine_1.description" = "Terraform ICE SQL"
  }
}

locals {
  sql_ip   = vra7_deployment.ICE-SQL[0].resource_configuration["vSphere_Machine_1.ip_address"]
  sql_name = vra7_deployment.ICE-SQL[0].resource_configuration["vSphere_Machine_1.name"]
}

resource "null_resource" "sql-chef" { 
  # we can use count to switch creating this on or off for testing
  count = 0

  provisioner "chef" {
    # This is for TF to talk to the new node
    connection {
      host = local.sql_ip
      type = "winrm"
      user = var.KT_USER
      password = var.KT_PASS
      insecure = true
    }

    # This is for TF to talk to the chef_server
    # Don't use the local var here, so TF knows to create the dependency
    server_url = var.chef_server_url
    node_name  = "ICE-SQL-${vra7_deployment.ICE-SQL[0].resource_configuration["vSphere_Machine_1.name"]}"
    run_list   = var.sql_run_list
    recreate_client = true
    environment = "_default"
    ssl_verify_mode = ":verify_none"
    version = "12"
    user_name  = local.username
    user_key   = file("${local.user_key_path}")
    client_options = var.chef_client_options
  }
}
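
The quick-iteration use of null_resource mentioned earlier can be made explicit with a triggers map. The following is a minimal sketch, not my actual configuration: it assumes var.sql_run_list is a list of strings and that the vra7_deployment resource exposes the id shown in the apply output. When any value in the map changes, Terraform replaces the null_resource and runs its provisioners again, leaving the VM untouched.

```hcl
# Sketch only: re-run the chef provisioner without rebuilding the VM.
resource "null_resource" "sql-chef" {
  # Any change to a value in this map replaces the null_resource,
  # which runs its provisioners again on the next apply.
  triggers = {
    vm_id    = vra7_deployment.ICE-SQL[0].id
    run_list = join(",", var.sql_run_list) # assumes a list of strings
  }

  provisioner "chef" {
    # This is for TF to talk to the new node
    connection {
      host     = vra7_deployment.ICE-SQL[0].resource_configuration["vSphere_Machine_1.ip_address"]
      type     = "winrm"
      user     = var.KT_USER
      password = var.KT_PASS
      insecure = true
    }

    # This is for TF to talk to the chef_server
    server_url      = var.chef_server_url
    node_name       = "ICE-SQL-${vra7_deployment.ICE-SQL[0].resource_configuration["vSphere_Machine_1.name"]}"
    run_list        = var.sql_run_list
    recreate_client = true
    environment     = "_default"
    ssl_verify_mode = ":verify_none"
    user_name       = local.username
    user_key        = file(local.user_key_path)
  }
}
```

The vm_id entry is there so that a rebuilt VM also forces a fresh Chef run; drop it if only run_list changes should trigger reprovisioning.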

Module:

### main.tf
module "SQL" {
  source   = "./modules/vra-chef"
  VRA_USER = var.VRA_USER
  VRA_PASS = var.VRA_PASS
  KT_USER  = var.KT_USER
  KT_PASS  = var.KT_PASS

  description = "ICE SQL"
  run_list    = var.sql_run_list
}

### modules/vra-chef/main.tf
resource "vra7_deployment" "vra-chef" {
  count = var.server_count
...
  resource_configuration = {
    "vSphere_Machine_1.name"       = var.resource_name
    "vSphere_Machine_1.ip_address"  = var.resource_ip
    "vSphere_Machine_1.description" = "${var.description}-${count.index}"
  }

  provisioner "chef" {
    # This is for TF to talk to the new node
    connection {
      host = self.resource_configuration["vSphere_Machine_1.ip_address"]
      type = "winrm"
      user = var.KT_USER
      password = var.KT_PASS
      insecure = true
    }

    # This is for TF to talk to the chef_server
    server_url = var.chef_server_url
    node_name  = self.resource_configuration["vSphere_Machine_1.name"]
    run_list   = var.run_list
    recreate_client = true
    environment = "_default"
    ssl_verify_mode = ":verify_none"
    version = "~> 12"
    user_name  = local.username
    user_key   = file(local.user_key_path)
    client_options = [ "chef_license  'accept'" ]

    # pass custom attributes to the new node
    attributes_json = var.input_json
  }
}

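Expected result:

Chef provisions all the resources it is applied to.

Actual result:

The Terraform Chef provisioner connects to all the resources it is applied to and installs Chef on each client. When it reaches the "Creating configuration files..." step, it stops sending any further updates, and the Terraform run prints a "Still creating..." status line for each resource every 10 seconds: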

vra7_deployment.ICE-REMOTE[0]: Still creating... [9m30s elapsed]
vra7_deployment.ICE-SQL[0]: Still creating... [9m30s elapsed]
vra7_deployment.ICE-MASTER[0]: Still creating... [9m30s elapsed]
vra7_deployment.ICE-MASTER[0]: Creation complete after 9m39s [id=feecf983-48d5-425e-b713-65a1a05fa3ba]
vra7_deployment.ICE-REMOTE[0]: Still creating... [9m40s elapsed]
vra7_deployment.ICE-SQL[0]: Still creating... [9m40s elapsed]
...
vra7_deployment.ICE-SQL[0]: Still creating... [12m10s elapsed]
vra7_deployment.ICE-REMOTE[0]: Still creating... [12m10s elapsed]
vra7_deployment.ICE-REMOTE[0]: Creation complete after 12m11s [id=df64f5ab-af12-4493-8e7d-d7debd93780d]
vra7_deployment.ICE-SQL[0]: Still creating... [12m20s elapsed]
...
vra7_deployment.ICE-SQL[0]: Still creating... [13m10s elapsed]
vra7_deployment.ICE-SQL[0]: Creation complete after 13m11s [id=08ec31f4-124d-470e-b2ba-1833a6f22792]
null_resource.sql-chef[0]: Creating...
null_resource.master-chef[0]: Creating...
null_resource.remote-chef[0]: Creating...
null_resource.sql-chef[0]: Provisioning with 'chef'...
null_resource.master-chef[0]: Provisioning with 'chef'...
null_resource.remote-chef[0]: Provisioning with 'chef'...
null_resource.master-chef[0] (chef): Connecting to remote host via WinRM...
null_resource.master-chef[0] (chef):   Host: 10.12.235.61
null_resource.master-chef[0] (chef):   Port: 5985
null_resource.master-chef[0] (chef):   User: engineering
null_resource.master-chef[0] (chef):   Password: true
null_resource.master-chef[0] (chef):   HTTPS: false
null_resource.master-chef[0] (chef):   Insecure: true
null_resource.master-chef[0] (chef):   NTLM: false
null_resource.master-chef[0] (chef):   CACert: false
null_resource.sql-chef[0] (chef): Connecting to remote host via WinRM...
null_resource.sql-chef[0] (chef):   Host: 10.12.235.50
null_resource.sql-chef[0] (chef):   Port: 5985
null_resource.sql-chef[0] (chef):   User: engineering
null_resource.sql-chef[0] (chef):   Password: true
null_resource.sql-chef[0] (chef):   HTTPS: false
null_resource.sql-chef[0] (chef):   Insecure: true
null_resource.sql-chef[0] (chef):   NTLM: false
null_resource.sql-chef[0] (chef):   CACert: false
null_resource.remote-chef[0] (chef): Connecting to remote host via WinRM...
null_resource.remote-chef[0] (chef):   Host: 10.12.233.51
null_resource.remote-chef[0] (chef):   Port: 5985
null_resource.remote-chef[0] (chef):   User: engineering
null_resource.remote-chef[0] (chef):   Password: true
null_resource.remote-chef[0] (chef):   HTTPS: false
null_resource.remote-chef[0] (chef):   Insecure: true
null_resource.remote-chef[0] (chef):   NTLM: false
null_resource.remote-chef[0] (chef):   CACert: false
null_resource.sql-chef[0] (chef): Connected!
null_resource.remote-chef[0] (chef): Connected!
null_resource.master-chef[0] (chef): Connected!
null_resource.remote-chef[0] (chef): Downloading Chef Client...
null_resource.sql-chef[0] (chef): Downloading Chef Client...
null_resource.remote-chef[0] (chef): Installing Chef Client...
null_resource.sql-chef[0] (chef): Installing Chef Client...
null_resource.remote-chef[0]: Still creating... [10s elapsed]
null_resource.master-chef[0]: Still creating... [10s elapsed]
null_resource.sql-chef[0]: Still creating... [10s elapsed]
null_resource.sql-chef[0] (chef): Creating configuration files...
null_resource.remote-chef[0] (chef): Creating configuration files...
null_resource.master-chef[0] (chef): Downloading Chef Client...
null_resource.master-chef[0] (chef): Installing Chef Client...
null_resource.master-chef[0] (chef): Creating configuration files...
null_resource.remote-chef[0]: Still creating... [20s elapsed]
null_resource.master-chef[0]: Still creating... [20s elapsed]
null_resource.sql-chef[0]: Still creating... [20s elapsed]
null_resource.remote-chef[0]: Still creating... [30s elapsed]
null_resource.sql-chef[0]: Still creating... [30s elapsed]
null_resource.master-chef[0]: Still creating... [30s elapsed]
null_resource.remote-chef[0]: Still creating... [40s elapsed]
null_resource.sql-chef[0]: Still creating... [40s elapsed]
null_resource.master-chef[0]: Still creating... [40s elapsed]
null_resource.remote-chef[0]: Still creating... [50s elapsed]
null_resource.sql-chef[0]: Still creating... [50s elapsed]
null_resource.master-chef[0]: Still creating... [50s elapsed]
null_resource.remote-chef[0]: Still creating... [1m0s elapsed]
null_resource.sql-chef[0]: Still creating... [1m0s elapsed]
null_resource.master-chef[0]: Still creating... [1m0s elapsed]
...loops waiting forever...

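Additional context:

I logged this at Terraform's GitHub and got no response. My comment there:

I've found that it doesn't seem to like chef-provisioning more than one machine at a time. So far I've seen 1 machine out of 4 provision perfectly, while the others hang after printing the "Creating configuration files..." status. Leaving the first one in place, on the next run the other three hang again at the same spot. Finally, I adjusted the code to reprovision only one of the machines, and it ran fine. To be clear: the exact same code that hung on the previous run executes perfectly when run by itself. I think this is the key clue for debugging this.

To reiterate: when it gets stuck, the Chef provisioning always hangs at the "Creating configuration files..." step. If it gets past that step, it always works.

Here is a gist of the Chef run using the null_resource provisioner on two resources, both of which hang: https://gist.github.com/mcascone/0b71948f50d52648389e661d00c8e31c

Here is one of a successful single-resource run: https://gist.github.com/mcascone/858855b5bd9d5d1cf655d5e10df67801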

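I keep thinking this is a problem with the same provisioner being called multiple times in the same main.tf file. I am calling the chef provisioner 3 or more times in a single apply run. Could it be that multiple instances of the provisioner collide with each other, or that multiple runs of the same provisioner aren't actually supported, and they all get instantiated in the same instance and clobber each other?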
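
One way to test that hypothesis is to force the Chef runs to happen one at a time. The sketch below is a diagnostic idea, not a confirmed fix: it chains the null_resources with depends_on so that master-chef does not start until sql-chef has finished. The resource names come from the log output above; var.master_run_list and the ICE-MASTER attribute paths are assumptions modelled on the ICE-SQL example. Running terraform apply with -parallelism=1 should have a similar serializing effect without any code changes.

```hcl
# Sketch only: serialize the chef runs by making each null_resource
# wait for the previous one. Names marked "assumed" are not in the post.
resource "null_resource" "master-chef" {
  count = 1

  # Do not start this chef run until the SQL chef run has finished.
  depends_on = [null_resource.sql-chef]

  provisioner "chef" {
    # This is for TF to talk to the new node
    connection {
      # ICE-MASTER appears in the log output; the attribute path is
      # assumed to match the ICE-SQL deployment.
      host     = vra7_deployment.ICE-MASTER[0].resource_configuration["vSphere_Machine_1.ip_address"]
      type     = "winrm"
      user     = var.KT_USER
      password = var.KT_PASS
      insecure = true
    }

    # This is for TF to talk to the chef_server
    server_url      = var.chef_server_url
    node_name       = "ICE-MASTER-${vra7_deployment.ICE-MASTER[0].resource_configuration["vSphere_Machine_1.name"]}"
    run_list        = var.master_run_list # assumed variable name
    recreate_client = true
    environment     = "_default"
    ssl_verify_mode = ":verify_none"
    user_name       = local.username
    user_key        = file(local.user_key_path)
  }
}
```

If the hang disappears when the runs are serialized, that would point to concurrent provisioner instances interfering with each other rather than to anything in the Chef configuration itself.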

It looks like, at least for now, we have to downgrade to v0.11 to get multiple provisioning runs to work properly. See this thread: