Unable to SSH onto VM created by Packer image deployed by Terraform

λ packer version
Packer v1.3.2

Packer file:

{
  "builders"                           : [{
    "type"                             : "azure-arm",

    "client_id"                        : "asdf",
    "client_secret"                    : "asdf",
    "tenant_id"                        : "asdf",
    "subscription_id"                  : "asdf",

    "managed_image_resource_group_name": "asdf",
    "managed_image_name"               : "cis-rhel7-l1",

    "os_type"                          : "Linux",
    "image_publisher"                  : "center-for-internet-security-inc",
    "image_offer"                      : "cis-rhel-7-v2-2-0-l1",
    "image_sku"                        : "cis-rhel7-l1",

    "plan_info"                        : {
        "plan_name"                    : "cis-rhel7-l1",
        "plan_product"                 : "cis-rhel-7-v2-2-0-l1",
        "plan_publisher"               : "center-for-internet-security-inc"
    },

    "communicator"                     : "ssh",

    "azure_tags"                       : {
        "docker"                       : "18.09.0"
    },

    "location"                         : "West Europe",
    "vm_size"                          : "Standard_D2_v3"
  }],
  "provisioners"                       : [
        {
            "type"                     : "shell",
            "script"                   : "./cisrhel7-script.sh"
        }
    ]
}

The script it's calling:

DOCKERURL="asdf"

sudo -E sh -c 'echo "asdf/rhel" > /etc/yum/vars/dockerurl'

sudo sh -c 'echo "7" > /etc/yum/vars/dockerosversion'

sudo yum install -y yum-utils device-mapper-persistent-data lvm2

sudo yum-config-manager --enable rhel-7-server-extras-rpm

sudo yum-config-manager --enable rhui-rhel-7-server-rhui-extras-rpms

curl -sSL "asdf/rhel/gpg" -o /tmp/storebits.gpg

sudo rpm --import /tmp/storebits.gpg

sudo -E yum-config-manager --add-repo "asdf/rhel/docker-ee.repo"

sudo yum -y install docker-ee-18.09.0

sudo yum-config-manager --enable docker-ee-stable-18.09

sudo systemctl unmask --now firewalld.service

sudo systemctl enable --now firewalld.service

systemctl status firewalld

list=(
    "22/tcp"
    "80/tcp"
    "179/tcp"
    "443/tcp"
    "2376/tcp"
    "2377/tcp"
    "4789/udp"
    "6443/tcp"
    "6444/tcp"
    "7946/tcp"
    "7946/udp"
    "10250/tcp"
    "12376/tcp"
    "12378/tcp"
    "12379/tcp"
    "12380/tcp"
    "12381/tcp"
    "12382/tcp"
    "12383/tcp"
    "12384/tcp"
    "12385/tcp"
    "12386/tcp"
    "12387/tcp"
    "12388/tcp"
)
for i in "${list[@]}"; do
    sudo firewall-cmd --zone=public --add-port=$i --permanent
done

sudo firewall-cmd --reload

sudo firewall-cmd --list-all

sudo systemctl stop docker

sudo sh -c 'echo "{\"storage-driver\": \"overlay2\"}" > /etc/docker/daemon.json'

CURRENT_USER=$(whoami)

if [ "$CURRENT_USER" != "root" ]
then
        sudo usermod -g docker "$CURRENT_USER"
fi

sudo systemctl start docker

sudo docker info

I then use Terraform to deploy it:

# skipping pre-TF resources...
resource "azurerm_virtual_machine" "main" {
  name                              = "${var.prefix}-vm"
  location                          = "${azurerm_resource_group.main.location}"
  resource_group_name               = "${azurerm_resource_group.main.name}"
  network_interface_ids             = ["${azurerm_network_interface.main.id}"]
  vm_size                           = "Standard_D2_v3"

  delete_os_disk_on_termination     = true

  storage_image_reference {
    id                            = "${data.azurerm_image.custom.id}"
  }

  storage_os_disk {
    name                            = "${var.prefix}-osdisk"
    caching                         = "ReadWrite"
    create_option                   = "FromImage"
    managed_disk_type               = "Standard_LRS"
  }

  os_profile {
    computer_name                   = "${var.prefix}"
    admin_username                  = "rhel76"
  }

  os_profile_linux_config {
    disable_password_authentication = true

    ssh_keys {
      path                          = "/home/rhel76/.ssh/authorized_keys"
      key_data                      = "${file("rhel76.pub")}"
    }
  }

  plan {
      name                          = "cis-rhel7-l1"
      publisher                     = "center-for-internet-security-inc"
      product                       = "cis-rhel-7-v2-2-0-l1"
  }
}

It builds fine and deploys fine, but when I go to connect:

λ ssh -i rhel76 rhel76@some-ip
The authenticity of host 'some-ip (some-ip)' can't be established.
ECDSA key fingerprint is SHA256:some-fingerprint.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'some-ip' (ECDSA) to the list of known hosts.
Authorized uses only. All activity may be monitored and reported.
rhel76@some-ip: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).

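I'm unsure whether this is a Packer or a Terraform issue. I've deployed the base image "cis-rhel7-l1" via Terraform, changing only the image from mine to the base image and leaving the SSH key section alone, and it works fine (I'm able to SSH in OK).

The only way I can connect to my VM is by resetting the SSH key in Azure. I reset it using rhel76 as the admin_username (from the template) and it worked fine; I checked /home/rhel76/.ssh/* and the key was there. Obviously, since I had just done the reset. So I rebuilt the whole thing again with no changes, and when I again couldn't log in, I reset the SSH key for a random username, asdf, then looked in the /home/rhel76 directory and could not find the .ssh folder or the .ssh/authorized_keys file, as if it didn't have permission to create them.

Since then I've been playing around with the script, trying to create those folders and chmod them just in case, but this never works because I get an error during the Packer build: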

azure-arm: chmod: cannot access ‘/home/rhel76/.ssh/authorized_keys’: Permission denied
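
For anyone reproducing this, the manual check I did (looking for .ssh and authorized_keys under the home directory) can be sketched as a small shell function. This is only an illustration: check_ssh_perms is a hypothetical helper name, the 700/600 modes are the conventional ones rather than something taken from the CIS image, and stat -c assumes GNU coreutils as shipped on RHEL.

```shell
# Hypothetical helper (not part of the original script): report whether
# the SSH key files under a home directory exist with conventional modes.
check_ssh_perms() {
    local home_dir="$1"
    local ssh_dir="$home_dir/.ssh"
    local keys="$ssh_dir/authorized_keys"

    if [ ! -d "$ssh_dir" ]; then
        echo "missing: $ssh_dir"
        return 1
    fi
    if [ ! -f "$keys" ]; then
        echo "missing: $keys"
        return 1
    fi

    # stat -c '%a' prints the octal permission bits (GNU stat).
    local dir_mode file_mode
    dir_mode=$(stat -c '%a' "$ssh_dir")
    file_mode=$(stat -c '%a' "$keys")

    # Assumed convention: 700 on .ssh and 600 on authorized_keys.
    if [ "$dir_mode" != "700" ] || [ "$file_mode" != "600" ]; then
        echo "bad modes: $ssh_dir=$dir_mode $keys=$file_mode"
        return 1
    fi
    echo "ok: $keys"
}
```

Running something like `sudo sh -c '. ./check.sh; check_ssh_perms /home/rhel76'` from a session under another account would show which of the three cases applies; in my case it would have reported the .ssh directory as missing.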

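Does anyone have any ideas?

So it turns out you need to run a 'de-provision' of the Azure Linux agent, which I've done by adding the recommended command to the provisioners section: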

  "provisioners"                       : [
        {
            "type"                     : "shell",
            "script"                   : "./cisrhel7-script.sh"
        },
        {
            "type"                     : "shell",
            "inline"                   : [
                "echo '************ DEPROVISION'",
                "sudo /usr/sbin/waagent -force -deprovision+user && export HISTSIZE=0 && sync"
            ]
        }
    ]
}

Taken from: https://docs.microsoft.com/en-us/azure/virtual-machines/linux/build-image-with-packer
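
As far as I understand it, -deprovision+user makes the agent clean up its provisioning state and delete the last provisioned user account, so a VM created from the image gets provisioned afresh with the admin_username and SSH key that Terraform passes in. If you would rather keep the step at the end of the shell script than in a second provisioner, a guarded version might look like the sketch below. deprovision_image is a hypothetical function name, and the default path is simply the one used in the command above.

```shell
# Hypothetical wrapper: run the Azure Linux agent deprovision step as the
# last action of the provisioning script, and fail loudly if the agent
# is not where we expect it.
deprovision_image() {
    # Path taken from the command above; override it if your image differs.
    local agent="${1:-/usr/sbin/waagent}"

    if [ ! -x "$agent" ]; then
        echo "waagent not found at $agent; refusing to continue" >&2
        return 1
    fi

    echo '************ DEPROVISION'
    # Same flags as the inline provisioner; sync flushes pending writes
    # before Packer captures the disk.
    sudo "$agent" -force -deprovision+user && sync
}
```

Calling `deprovision_image || exit 1` as the final line of cisrhel7-script.sh should make the Packer build fail rather than silently produce an image that was never deprovisioned.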