How to fix a failed VMSS deployment with error "Unknown network allocation error"

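I'm trying to deploy a 3-tier architecture to Azure using the Azure PowerShell CLI and a custom ARM template with parameters. I'm not having any problems with the validity of the PowerShell script or the template.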

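In the template there are, among other things, two virtual machine scale sets, one for the front end and one for the back end. The front end is Windows and the back end is Red Hat. The front end sits behind an application gateway, while the back end sits behind a load balancer. The strange thing is that the front-end VMSS deploys with no issues and everything works. Every time I try to deploy the back-end VMSS, though, it fails with a vague "Unknown network allocation error" message that I have no idea how to debug (it gives no detail, unlike every other error message I've had so far).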

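I based the ARM template on a template exported from a working model of this architecture in another resource group, modified the parameters, and spent a while cleaning up the problems and errors in Azure's exported template. I've tried deleting everything and starting from scratch, but that doesn't resolve the issue. I thought I might be hitting the core limit of the free subscription, so I tried making the front-end VMSS depend on the back-end VMSS so that the back-end VMSS is created first, but the same problem still occurs.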

Here is the back-end VMSS part of the template:

{
      "type": "Microsoft.Compute/virtualMachineScaleSets",
      "apiVersion": "2018-10-01",
      "name": "[parameters('virtualMachineScaleSets_JakeAppBESS_name')]",
      "location": "westus2",
      "dependsOn": [
        "[parameters('loadBalancers_JakeAppBESSlb_name')]"
      ],
      "sku": {
        "name": "Standard_B1ls",
        "tier": "Standard",
        "capacity": 1
      },
      "properties": {
        "singlePlacementGroup": true,
        "upgradePolicy": {
          "mode": "Manual"
        },
        "virtualMachineProfile": {
          "osProfile": {
            "computerNamePrefix": "jakeappbe",
            "adminUsername": "Jake",
            "adminPassword": "[parameters('JakeApp_Password')]",
            "linuxConfiguration": {
              "disablePasswordAuthentication": false,
              "provisionVMAgent": true
            },
            "secrets": []
          },
          "storageProfile": {
            "osDisk": {
              "createOption": "FromImage",
              "caching": "ReadWrite",
              "managedDisk": {
                "storageAccountType": "Premium_LRS"
              }
            },
            "imageReference": {
              "publisher": "RedHat",
              "offer": "RHEL",
              "sku": "7.4",
              "version": "latest"
            }
          },
          "networkProfile": {
            "networkInterfaceConfigurations": [
              {
                "name": "[concat(parameters('virtualMachineScaleSets_JakeAppBESS_name'), 'Nic')]",
                "properties": {
                  "primary": true,
                  "enableAcceleratedNetworking": false,
                  "dnsSettings": {
                    "dnsServers": []
                  },
                  "enableIPForwarding": false,
                  "ipConfigurations": [
                    {
                      "name": "[concat(parameters('virtualMachineScaleSets_JakeAppBESS_name'), 'IpConfig')]",
                      "properties": {
                        "subnet": {
                          "id": "[concat('/subscriptions/', parameters('subscription_id'), '/resourceGroups/', parameters('resource_Group'), '/providers/Microsoft.Network/virtualNetworks/', parameters('virtualNetworks_JakeAppVnet_name'), '/subnets/BEsubnet')]"
                        },
                        "privateIPAddressVersion": "IPv4",
                        "loadBalancerBackendAddressPools": [
                          {
                            "id": "[concat('/subscriptions/', parameters('subscription_id'), '/resourceGroups/', parameters('resource_Group'), '/providers/Microsoft.Network/loadBalancers/', parameters('loadBalancers_JakeAppBESSlb_name'), '/backendAddressPools/bepool')]"
                          }
                        ],
                        "loadBalancerInboundNatPools": [
                          {
                            "id": "[concat('/subscriptions/', parameters('subscription_id'), '/resourceGroups/', parameters('resource_Group'), '/providers/Microsoft.Network/loadBalancers/', parameters('loadBalancers_JakeAppBESSlb_name'), '/inboundNatPools/natpool')]"
                          }
                        ]
                      }
                    }
                  ]
                }
              }
            ]
          },
          "priority": "Regular"
        },
        "overprovision": true
      }
    },


For reference, here's the front-end VMSS's part of the template so you can compare and see that there aren't many differences:

    {
      "type": "Microsoft.Compute/virtualMachineScaleSets",
      "apiVersion": "2018-10-01",
      "name": "[parameters('virtualMachineScaleSets_JakeAppFESS_name')]",
      "location": "westus2",
      "dependsOn": [
        "[parameters('applicationGateways_JakeAppFE_AG_name')]"
      ],
      "sku": {
        "name": "Standard_B1ls",
        "tier": "Standard",
        "capacity": 1
      },
      "properties": {
        "singlePlacementGroup": true,
        "upgradePolicy": {
          "mode": "Manual"
        },
        "virtualMachineProfile": {
          "osProfile": {
            "computerNamePrefix": "jakeappfe",
            "adminUsername": "Jake",
            "adminPassword": "[parameters('JakeApp_Password')]",
            "windowsConfiguration": {
              "provisionVMAgent": true,
              "enableAutomaticUpdates": true
            },
            "secrets": []
          },
          "storageProfile": {
            "osDisk": {
              "createOption": "FromImage",
              "caching": "ReadWrite",
              "managedDisk": {
                "storageAccountType": "Premium_LRS"
              }
            },
            "imageReference": {
              "publisher": "MicrosoftWindowsServer",
              "offer": "WindowsServer",
              "sku": "2016-Datacenter",
              "version": "latest"
            }
          },
          "networkProfile": {
            "networkInterfaceConfigurations": [
              {
                "name": "[concat(parameters('virtualMachineScaleSets_JakeAppFESS_name'), 'Nic')]",
                "properties": {
                  "primary": true,
                  "enableAcceleratedNetworking": false,
                  "dnsSettings": {
                    "dnsServers": []
                  },
                  "enableIPForwarding": false,
                  "ipConfigurations": [
                    {
                      "name": "[concat(parameters('virtualMachineScaleSets_JakeAppFESS_name'), 'IpConfig')]",
                      "properties": {
                        "subnet": {
                          "id": "[concat('/subscriptions/', parameters('subscription_id'), '/resourceGroups/', parameters('resource_Group'), '/providers/Microsoft.Network/virtualNetworks/', parameters('virtualNetworks_JakeAppVnet_name'), '/subnets/FEsubnet')]"
                        },
                        "privateIPAddressVersion": "IPv4",
                        "applicationGatewayBackendAddressPools": [
                          {
                            "id": "[concat('/subscriptions/', parameters('subscription_id'), '/resourceGroups/', parameters('resource_Group'), '/providers/Microsoft.Network/applicationGateways/', parameters('applicationGateways_JakeAppFE_AG_name'), '/backendAddressPools/appGatewayBackendPool')]"
                          }
                        ]
                      }
                    }
                  ]
                }
              }
            ]
          },
          "priority": "Regular"
        },
        "overprovision": true
      }
    },

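I would expect them to behave similarly. Granted, the back end is RHEL and the front end is Windows, and the front end is behind an application gateway while the back end is behind a load balancer, but this setup works fine in another resource group that I deployed through the portal rather than through ARM. Yet every time I try to deploy it, I get this error: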

New-AzureRmResourceGroupDeployment : 1:30:56 AM - Resource Microsoft.Compute/virtualMachineScaleSets 'ProdBESS' failed with message '{
  "status": "Failed",
  "error": {
    "code": "ResourceDeploymentFailure",
    "message": "The resource operation completed with terminal provisioning state 'Failed'.",
    "details": [
      {
        "code": "NetworkingInternalOperationError",
        "message": "Unknown network allocation error."
      }
    ]
  }
}'

OK, I finally figured out what the problem was, so in case anyone with the same error finds this post in a later search:

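Apparently, the part of the template that handles the VMSS's load balancer (exported from the Azure portal) contained two conflicting inbound NAT pools with overlapping port ranges. Once I removed the part of the template that created the extra, conflicting NAT pool, my VMSS deployed correctly with no issues.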

I have no idea why the Azure portal exported a template containing an extra NAT pool that never existed (the original LB I exported the template from had only one).
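
If you want to check your own exported template for the same conflict, something like the rough Python sketch below might help. It is not part of my deployment, just an illustration. It assumes the load balancer resources sit at the top level of the template's `resources` array and that each inbound NAT pool uses the usual `protocol`, `frontendPortRangeStart` and `frontendPortRangeEnd` properties. It also assumes that pools with different protocols do not conflict, and it ignores which frontend IP configuration a pool uses. The function name, the load balancer name and the port values are made up for the example.

```python
from itertools import combinations


def find_overlapping_nat_pools(template):
    """Return (lb_name, pool_a, pool_b) for NAT pools whose frontend port ranges intersect."""
    conflicts = []
    for resource in template.get("resources", []):
        if resource.get("type") != "Microsoft.Network/loadBalancers":
            continue
        pools = resource.get("properties", {}).get("inboundNatPools", [])
        for a, b in combinations(pools, 2):
            pa, pb = a["properties"], b["properties"]
            # Assumption: pools with different protocols do not conflict.
            if pa.get("protocol", "").lower() != pb.get("protocol", "").lower():
                continue
            # Two inclusive ranges intersect when each one starts
            # no later than the other one ends.
            if (pa["frontendPortRangeStart"] <= pb["frontendPortRangeEnd"]
                    and pb["frontendPortRangeStart"] <= pa["frontendPortRangeEnd"]):
                conflicts.append((resource["name"], a["name"], b["name"]))
    return conflicts


# Hypothetical template fragment: names and port values are invented.
example_template = {
    "resources": [
        {
            "type": "Microsoft.Network/loadBalancers",
            "name": "exampleLb",
            "properties": {
                "inboundNatPools": [
                    {
                        "name": "natpool",
                        "properties": {
                            "protocol": "Tcp",
                            "frontendPortRangeStart": 50000,
                            "frontendPortRangeEnd": 50119,
                            "backendPort": 22,
                        },
                    },
                    {
                        "name": "natpool2",
                        "properties": {
                            "protocol": "Tcp",
                            "frontendPortRangeStart": 50100,
                            "frontendPortRangeEnd": 50219,
                            "backendPort": 22,
                        },
                    },
                ]
            },
        }
    ]
}

print(find_overlapping_nat_pools(example_template))
```

For a real template you would load the file with `json.load` and pass the resulting dict in. Any pair it reports is worth a look before deploying.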