Azure Data Factory webhook execution times out instead of relaying errors

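I'm trying to set up a simple webhook execution in Azure Data Factory (v2), calling a simple (parameter-less) webhook for an Azure Automation Runbook I set up.

In the Azure Portal, I can see that the webhook is being executed and my runbook is being run, so far so good. The runbook is (currently) returning an error within 1 minute of execution - but that's fine, I also want to test failure scenarios.

Problem: Data Factory doesn't seem to be 'seeing' the error result and spins until the timeout (10 minutes) elapses. When I kick off a debug run of the pipeline, I get the same result - a timeout but no error result.

Update: I've fixed the runbook and it now completes successfully, but Data Factory still times out and doesn't see the success response either.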

Here is a screenshot of the setup:

Here is the portal confirming that the webhook is being run by Azure Data Factory and completes in under a minute:

The WEBHOOKDATA JSON is:

{"WebhookName":"Start CAMS VM","RequestBody":"{\r\n \"callBackUri\": \"https://dpeastus.svc.datafactory.azure.com/dataplane/workflow/callback/f7c...df2?callbackUrl=AAEAFF...0927&shouldReportToMonitoring=True&activityType=WebHook\"\r\n}","RequestHeader":{"Connection":"Keep-Alive","Expect":"100-continue","Host":"eab...ddc.webhook.eus2.azure-automation.net","x-ms-request-id":"7b4...2eb"}}

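As far as I can tell, everything should be in place to accept a result (success or failure). Hopefully someone who has done this before knows what I'm missing.

Thanks!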

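I had assumed that Azure would automatically notify the ADF "callBackUri" with the result once the runbook completed or errored out (since they handle 99% of the scaffolding without requiring a line of code).

That is not the case. Anyone wishing to execute a runbook from ADF has to manually extract the callBackUri from the WebhookData input parameter and POST the result to it when done.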
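As a rough illustration of that extract-and-POST step, here is a minimal sketch. The helper names `Get-AdfCallbackUri` and `New-AdfCallbackBody` are hypothetical (they are not part of any Azure module), and the body shape simply mirrors the output/statusCode/error layout used by the full runbooks in this answer.

```powershell
# Hypothetical helper: pull the ADF callback URL out of the WebhookData object.
function Get-AdfCallbackUri {
    param([Parameter(Mandatory = $true)] [object] $WebhookData)
    # RequestBody arrives as a JSON string, so it has to be parsed first
    $body = ConvertFrom-Json -InputObject $WebhookData.RequestBody
    return $body.callBackUri
}

# Hypothetical helper: build the JSON body to POST back to ADF.
function New-AdfCallbackBody {
    param([string] $Message, [string] $ErrorMessage)
    if ($ErrorMessage) {
        $payload = @{
            output     = @{ message = $ErrorMessage }
            statusCode = 500
            error      = @{ ErrorCode = 'Error'; Message = $ErrorMessage }
        }
    } else {
        $payload = @{
            output     = @{ message = $Message }
            statusCode = 200
        }
    }
    # ConvertTo-Json takes care of escaping quotes and newlines in the message
    return ($payload | ConvertTo-Json -Depth 3)
}

# At the end of the runbook, once the real work has finished (not executed here):
# $uri  = Get-AdfCallbackUri -WebhookData $WebhookData
# $body = New-AdfCallbackBody -Message 'Runbook finished'
# Invoke-RestMethod -Uri $uri -Method Post -ContentType 'application/json' -Body $body
```

The POST is left commented out so the helpers can be tried locally with a fake WebhookData object before wiring them into a real runbook.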

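I haven't nailed this down yet, because the Microsoft tutorial sites I've found have a bad habit of posting screenshots of the code that does this rather than the code itself.

I guess I'll come back and edit this once I figure it out.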


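EDIT: I ended up implementing this by leaving my original webhook untouched and creating a "wrapper"/helper/utility runbook that executes an arbitrary webhook and relays its status to ADF once it completes.

Here's the full code I ended up with, in case it helps someone else. It's meant to be generic:

Setup / Helper Functions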

param
(
    [Parameter (Mandatory = $false)]
    [object] $WebhookData
)

Import-Module -Name AzureRM.resources
Import-Module -Name AzureRM.automation

# Helper function for getting the current running Automation Account Job
# Inspired heavily by: https://github.com/azureautomation/runbooks/blob/master/Utility/ARM/Find-WhoAmI
<#
    Queries the automation accounts in the subscription to find the automation account, runbook and resource group that the job is running in.
    AUTHOR: Azure/OMS Automation Team
#>
Function Find-WhoAmI {
    [CmdletBinding()]
    Param()
    Begin { Write-Verbose ("Entering {0}." -f $MyInvocation.MyCommand) }
    Process {
        # Authenticate
        $ServicePrincipalConnection = Get-AutomationConnection -Name "AzureRunAsConnection"
        Add-AzureRmAccount `
            -ServicePrincipal `
            -TenantId $ServicePrincipalConnection.TenantId `
            -ApplicationId $ServicePrincipalConnection.ApplicationId `
            -CertificateThumbprint $ServicePrincipalConnection.CertificateThumbprint | Write-Verbose
        Select-AzureRmSubscription -SubscriptionId $ServicePrincipalConnection.SubscriptionID | Write-Verbose 
        # Search all accessible automation accounts for the current job
        $AutomationResource = Get-AzureRmResource -ResourceType Microsoft.Automation/AutomationAccounts
        $SelfId = $PSPrivateMetadata.JobId.Guid
        foreach ($Automation in $AutomationResource) {
            $Job = Get-AzureRmAutomationJob -ResourceGroupName $Automation.ResourceGroupName -AutomationAccountName $Automation.Name -Id $SelfId -ErrorAction SilentlyContinue
            if (!([string]::IsNullOrEmpty($Job))) {
                return $Job
            }
        }
        # Only report failure after every accessible automation account has been checked
        Write-Error "Could not find the current running job with id $SelfId"
    }
    End { Write-Verbose ("Exiting {0}." -f $MyInvocation.MyCommand) }
}

Function Get-TimeStamp {    
    return "[{0:yyyy-MM-dd} {0:HH:mm:ss}]" -f (Get-Date)    
}

My Code


### EXPECTED USAGE ###
# 1. Set up a webhook invocation in Azure data factory with a link to this Runbook's webhook
# 2. In ADF - ensure the body contains { "WrappedWebhook": "<your url here>" }
#    This should be the URL for another webhook.
# LIMITATIONS:
# - Currently, relaying parameters and authentication credentials is not supported,
#    so the wrapped webhook should require no additional authentication or parameters.
# - Currently, the callback to Azure data factory does not support authentication,
#    so ensure ADF is configured to require no authentication for its callback URL (the default behaviour)

# If ADF executed this runbook via Webhook, it should have provided a WebhookData with a request body.
if (-Not $WebhookData) {
    Write-Error "Runbook was not invoked with WebhookData. Args were: $args"
    exit 0
}
if (-Not $WebhookData.RequestBody) {
    Write-Error "WebhookData did not contain a ""RequestBody"" property. Data was: $WebhookData"
    exit 0
}
$parameters = (ConvertFrom-Json -InputObject $WebhookData.RequestBody)
# And this data should contain a JSON body containing a 'callBackUri' property.
if (-Not $parameters.callBackUri) {
    Write-Error 'WebhookData was missing the expected "callBackUri" property (which Azure Data Factory should provide automatically)'
    exit 0
}
$callbackuri = $parameters.callBackUri

# Check for the "WRAPPEDWEBHOOK" parameter (which should be set up by the user in ADF)
$WrappedWebhook = $parameters.WRAPPEDWEBHOOK
if (-Not $WrappedWebhook) {
    $ErrorMessage = 'WebhookData was missing the expected "WRAPPEDWEBHOOK" property (which the user should have added to the body via ADF)'
    Write-Error $ErrorMessage
}
else
{
    # Now invoke the actual runbook desired
    Write-Output "$(Get-TimeStamp) Invoking Webhook Request at: $WrappedWebhook"
    try {    
        $OutputMessage = Invoke-WebRequest -Uri $WrappedWebhook -UseBasicParsing -Method POST
    } catch {
        $ErrorMessage = ("An error occurred while executing the wrapped webhook $WrappedWebhook - " + $_.Exception.Message)
        Write-Error -Exception $_.Exception
    }
    # Output should be something like: {"JobIds":["<JobId>"]}
    Write-Output "$(Get-TimeStamp) Response: $OutputMessage"    
    $JobList = (ConvertFrom-Json -InputObject $OutputMessage).JobIds
    $JobId = $JobList[0]
    $OutputMessage = "JobId: $JobId"         

    # Get details about the currently running job, and assume the webhook job is being run in the same resourcegroup/account
    $Self = Find-WhoAmI
    Write-Output "Current Job '$($Self.JobId)' is running in Group '$($Self.ResourceGroupName)' and Automation Account '$($Self.AutomationAccountName)'"
    Write-Output "Checking for Job '$($JobId)' in same Group and Automation Account..."

    # Monitor the job status, wait for completion.
    # Check against a list of statuses that likely indicate an in-progress job
    $InProgressStatuses = ('New', 'Queued', 'Activating', 'Starting', 'Running', 'Stopping')
    # (from https://docs.microsoft.com/en-us/powershell/module/az.automation/get-azautomationjob?view=azps-4.1.0&viewFallbackFrom=azps-3.7.0)  
    do {
        # 1 second between polling attempts so we don't get throttled
        Start-Sleep -Seconds 1
        try { 
            $Job = Get-AzureRmAutomationJob -Id $JobId -ResourceGroupName $Self.ResourceGroupName -AutomationAccountName $Self.AutomationAccountName
        } catch {
            $ErrorMessage = ("An error occurred polling the job $JobId for completion - " + $_.Exception.Message)
            Write-Error -Exception $_.Exception
        }
        Write-Output "$(Get-TimeStamp) Polled job $JobId - current status: $($Job.Status)"
    } while ($InProgressStatuses.Contains($Job.Status))

    # Get the job outputs to relay to Azure Data Factory
    $Outputs = Get-AzureRmAutomationJobOutput -Id $JobId -Stream "Any" -ResourceGroupName $Self.ResourceGroupName -AutomationAccountName $Self.AutomationAccountName
    Write-Output "$(Get-TimeStamp) Outputs from job: $($Outputs | ConvertTo-Json -Compress)"
    $OutputMessage = $Outputs.Summary
    Write-Output "Summary output message: $($OutputMessage)"
}

# Now for the entire purpose of this runbook - relay the response to the callback uri.
# Prepare the success or error response as per specifications at https://docs.microsoft.com/en-us/azure/data-factory/control-flow-webhook-activity#additional-notes
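# Build the body with ConvertTo-Json so quotes or newlines in the message cannot produce invalid JSON
if ($ErrorMessage) {
    $OutputJson = @{
        output = @{ message = "$ErrorMessage" }
        statusCode = 500
        error = @{
            ErrorCode = "Error"
            Message = "$ErrorMessage"
        }
    } | ConvertTo-Json -Depth 2
} else {
    $OutputJson = @{
        output = @{ message = "$OutputMessage" }
        statusCode = 200
    } | ConvertTo-Json -Depth 2
}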
Write-Output "Prepared ADF callback body: $OutputJson"
# Post the response to the callback URL provided
$callbackResponse = Invoke-WebRequest -Uri $callbackuri -UseBasicParsing -Method POST -ContentType "application/json" -Body $OutputJson

Write-Output "Response was relayed to $callbackuri"
Write-Output ("ADF replied with the response: " + ($callbackResponse | ConvertTo-Json -Compress))

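At a high level, the steps I've taken are:

  1. Execute the "main" webhook - get back a "Job Id"
  2. Get the "context" of the currently running job (resource group and automation account info) so that I can poll the remote job.
  3. Poll the job until it completes
  4. Put together a "success" or "error" response message in the format that Azure Data Factory expects.
  5. Invoke the ADF callback.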

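For anyone looking, I created a second approach to the solution above - executing a runbook (with parameters) from the webhook, rather than invoking a nested webhook. This has a few benefits:

  • Parameters can be passed to the runbook (rather than having to bake them into a new webhook).
  • A runbook from another Azure Automation account/resource group can be invoked.
  • There is no need to poll the job's status, since the Start-AzureRmAutomationRunbook cmdlet has a -Wait parameter.

Here is the code: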

param
(
    # Note: While "WebhookData" is the only root-level parameter (set by Azure Data Factory when it invokes this webhook)
    #       The user should ensure they provide (via the ADF request body) these additional properties required to invoke the runbook:
    #       - RunbookName
    #       - ResourceGroupName (TODO: Can fill this in by default if not provided)
    #       - AutomationAccountName (TODO: Can fill this in by default if not provided)
    #       - Parameters (A nested dict containing parameters to forward along)
    [Parameter (Mandatory = $false)]
    [object] $WebhookData
)

Import-Module -Name AzureRM.resources
Import-Module -Name AzureRM.automation

Function Get-TimeStamp {    
    return "[{0:yyyy-MM-dd} {0:HH:mm:ss}]" -f (Get-Date)    
}

# If ADF executed this runbook via Webhook, it should have provided a WebhookData with a request body.
if (-Not $WebhookData) {
    Write-Error "Runbook was not invoked with WebhookData. Args were: $args"
    exit 0
}
if (-Not $WebhookData.RequestBody) {
    Write-Error "WebhookData did not contain a ""RequestBody"" property. Data was: $WebhookData"
    exit 0
}
$parameters = (ConvertFrom-Json -InputObject $WebhookData.RequestBody)
# And this data should contain a JSON body containing a 'callBackUri' property.
if (-Not $parameters.callBackUri) {
    Write-Error 'WebhookData was missing the expected "callBackUri" property (which Azure Data Factory should provide automatically)'
    exit 0
}
$callbackuri = $parameters.callBackUri

# Check for required parameters, and output any errors.
$ErrorMessage = ''
$RunbookName = $parameters.RunbookName
$ResourceGroupName = $parameters.ResourceGroupName
$AutomationAccountName = $parameters.AutomationAccountName
# Note: `n is only expanded inside double-quoted strings, so it is appended separately
if (-Not $RunbookName) {
    $ErrorMessage += 'WebhookData was missing the expected "RunbookName" property (which the user should have added to the body via ADF)' + "`n"
}
if (-Not $ResourceGroupName) {
    $ErrorMessage += 'WebhookData was missing the expected "ResourceGroupName" property (which the user should have added to the body via ADF)' + "`n"
}
if (-Not $AutomationAccountName) {
    $ErrorMessage += 'WebhookData was missing the expected "AutomationAccountName" property (which the user should have added to the body via ADF)' + "`n"
}
if ($ErrorMessage) {
    Write-Error $ErrorMessage
} else {
    # Set the current automation connection's authenticated account to use for future Azure Resource Manager cmdlet requests.
    # TODO: Provide the user with a way to override this if the target runbook doesn't support the AzureRunAsConnection
    $ServicePrincipalConnection = Get-AutomationConnection -Name "AzureRunAsConnection"
    Add-AzureRmAccount -ServicePrincipal `
        -TenantId $ServicePrincipalConnection.TenantId `
        -ApplicationId $ServicePrincipalConnection.ApplicationId `
        -CertificateThumbprint $ServicePrincipalConnection.CertificateThumbprint | Write-Verbose
    Select-AzureRmSubscription -SubscriptionId $ServicePrincipalConnection.SubscriptionID | Write-Verbose 

    # Prepare the properties to pass on to the next runbook - all provided properties except the ones specific to the ADF passthrough invocation
    $RunbookParams = @{ }
    if($parameters.parameters) {
        $parameters.parameters.PSObject.Properties | Foreach { $RunbookParams[$_.Name] = $_.Value }
        Write-Output "The following parameters will be forwarded to the runbook: $($RunbookParams | ConvertTo-Json -Compress)"
    }

    # Now invoke the actual runbook desired, and wait for it to complete
    Write-Output "$(Get-TimeStamp) Invoking Runbook '$($RunbookName)' from Group '$($ResourceGroupName)' and Automation Account '$($AutomationAccountName)'"
    try {    
        # Runbooks have this nice flag that lets you wait on their completion (unlike webhook-invoked)
        $Result = Start-AzureRmAutomationRunbook -Wait -Name $RunbookName -AutomationAccountName $AutomationAccountName -ResourceGroupName $ResourceGroupName -Parameters $RunbookParams
    } catch {
        $ErrorMessage = ("An error occurred while invoking Start-AzureRmAutomationRunbook - " + $_.Exception.Message)
        Write-Error -Exception $_.Exception
    }
    # Digest the result to be relayed to ADF
    if($Result) {
        Write-Output "$(Get-TimeStamp) Response: $($Result | ConvertTo-Json -Compress)"
        $OutputMessage = $Result.ToString()
    } elseif(-Not $ErrorMessage) {
        $OutputMessage = "The runbook completed without errors, but the result was null."
    }
}

# Now for the entire purpose of this runbook - relay the response to the callback uri.
# Prepare the success or error response as per specifications at https://docs.microsoft.com/en-us/azure/data-factory/control-flow-webhook-activity#additional-notes
if ($ErrorMessage) {
    $OutputJson = @{
        output = @{ message = $ErrorMessage }
        statusCode = 500
        error = @{
            ErrorCode = "Error"
            Message = $ErrorMessage
        }
    } | ConvertTo-Json -depth 2
} else {
    $OutputJson = @{
        output = @{ message = $OutputMessage }
        statusCode = 200
    } | ConvertTo-Json -depth 2
}
Write-Output "Prepared ADF callback body: $OutputJson"
# Post the response to the callback URL provided
$callbackResponse = Invoke-WebRequest -Uri $callbackuri -UseBasicParsing -Method POST -ContentType "application/json" -Body $OutputJson

Write-Output "Response was relayed to $callbackuri"
Write-Output ("ADF replied with the response: " + ($callbackResponse | ConvertTo-Json -Compress))
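
For reference, here is a sketch of the request body this second runbook expects from ADF, together with the same parameter-flattening logic the runbook uses, so it can be tried locally. All of the values (`My-Runbook`, `my-resource-group`, `my-automation-account`, `VmName`) are made-up placeholders, and ADF adds the callBackUri property to the body on its own.

```powershell
# Placeholder values for illustration only - substitute your own names.
$exampleBody = @{
    RunbookName           = 'My-Runbook'
    ResourceGroupName     = 'my-resource-group'
    AutomationAccountName = 'my-automation-account'
    Parameters            = @{ VmName = 'vm-01' }
} | ConvertTo-Json -Depth 3

# This JSON text is what would be pasted into the ADF webhook activity body
Write-Output $exampleBody

# Same parsing the runbook performs on $WebhookData.RequestBody
$parameters = ConvertFrom-Json -InputObject $exampleBody
$RunbookParams = @{ }
if ($parameters.Parameters) {
    # Flatten the nested object into a hashtable suitable for -Parameters
    $parameters.Parameters.PSObject.Properties | ForEach-Object { $RunbookParams[$_.Name] = $_.Value }
}
Write-Output "Parameters to forward: $($RunbookParams | ConvertTo-Json -Compress)"
```

Property lookups on the parsed object are case-insensitive in PowerShell, so "Parameters" and "parameters" in the body are treated the same.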