Azure Durable Functions' CallActivityWithRetryAsync does not retry on failure

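In my orchestrator function, I upload a request file to an external server. After an indeterminate amount of time, a response file should be generated there. I need to poll for this file and download it.

My current approach is to wait 10 minutes after the upload and then call the built-in CallActivityWithRetryAsync with RetryOptions. If the first poll/download fails, it should wait 5 minutes and try again, for up to 10 attempts in total. A retry should only be attempted when the activity function throws an exception with the message RESPONSE_FILE_NOT_YET_AVAILABLE.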

        var nowIn10Minutes = ctx.CurrentUtcDateTime.AddMinutes(10);
        await ctx.CreateTimer(nowIn10Minutes, CancellationToken.None);

        const string RETRY_ERROR_MESSAGE = "RESPONSE_FILE_NOT_YET_AVAILABLE";
        var retryOptions = new RetryOptions(TimeSpan.FromMinutes(5), 10)
        {
            Handle = ex => ex.Message == RETRY_ERROR_MESSAGE
        };
        await ctx.CallActivityWithRetryAsync(nameof(PollForResponseAndDownload), retryOptions, input);

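According to the logs, however, this retry logic is not being honored. See below.

Once the 10-minute timer has elapsed, the orchestration fails immediately with a FunctionFailedException. The logs show the expected exception message, yet no retry is performed.

Am I fundamentally misunderstanding how this works? Here are the relevant logs:

-> After uploading the request, wait 10 minutes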

2022-01-31 00:00:06.740 <GUID>: Function 'MyOrchestrator (Orchestrator)' is waiting for input. Reason: CreateTimer:2022-01-31T00:10:06.5093237Z. IsReplay: False. State: Listening. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 112.
2022-01-31 00:00:06.741 <GUID>: Function 'MyOrchestrator (Orchestrator)' awaited. IsReplay: False. State: Awaited. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 113.

-> Resumed after 10 minutes; the activity function is scheduled

2022-01-31 00:10:32.700 <GUID>: Function 'MyOrchestrator (Orchestrator)' was resumed by a timer scheduled for '2022-01-31T00:10:06.5093237Z'. IsReplay: False. State: TimerExpired. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 114.
2022-01-31 00:10:32.701 <GUID>: Function 'PollForResponseAndDownload (Activity)' scheduled. Reason: MyOrchestrator. IsReplay: False. State: Scheduled. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 115.
2022-01-31 00:10:32.701 <GUID>: Function 'MyOrchestrator (Orchestrator)' awaited. IsReplay: False. State: Awaited. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 116.

-> The activity function starts. It fails right away with the expected ex.Message, but the retry logic still does not run.

2022-01-31 00:10:32.715 <GUID>: Function 'PollForResponseAndDownload (Activity)' started. IsReplay: False. Input: (368 bytes). State: Started. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 117. TaskEventId: 5
2022-01-31 00:10:37.078 <GUID>: Function 'PollForResponseAndDownload (Activity)' failed with an error. Reason: System.Exception: RESPONSE_FILE_NOT_YET_AVAILABLE at MyNamespace.func._getResponseFileContents(String fileHeader) in C:\Users\me\source\AppName\func.cs:line ...
2022-01-31 00:10:37.364 <GUID>: Function 'MyOrchestrator (Orchestrator)' failed with an error. Reason: Microsoft.Azure.WebJobs.Extensions.DurableTask.FunctionFailedException: The activity function 'PollForResponseAndDownload' failed: "RESPONSE_FILE_NOT_YET_AVAILABLE". See the function execution logs for additional details. ---> System.Exception: RESPONSE_FILE_NOT_YET_AVAILABLE at ...

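Here are some points from related discussions. Could you re-check your function app against them to resolve the issue?

  • When CallActivityWithRetryAsync is called, the DurableOrchestrationContext calls the ScheduleWithRetry method of the OrchestrationContext class inside the DurableTask framework.
  • From there, the Invoke method of the RetryInterceptor class is called, which loops up to the maximum number of retries. The class does not expose a property or method for reading the current retry count.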
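To make the role of Handle in that loop concrete, here is a simplified, stand-alone sketch of a retry loop with a predicate. It is not the DurableTask source: the class and method names (RetrySketch, InvokeWithRetry) are hypothetical, and Task.Delay stands in for the durable timer the framework would use.

```csharp
using System;
using System.Threading.Tasks;

public static class RetrySketch
{
    // Hypothetical stand-in for the framework's retry loop (illustration only).
    public static async Task<T> InvokeWithRetry<T>(
        Func<Task<T>> action,
        int maxAttempts,
        TimeSpan delay,
        Func<Exception, bool> handle)
    {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return await action();
            }
            catch (Exception ex)
            {
                last = ex;
                // If the predicate rejects the exception, give up immediately.
                if (!handle(ex))
                {
                    break;
                }
            }

            // Plain delay for the sketch; do not use Task.Delay inside a real orchestrator.
            if (attempt < maxAttempts)
            {
                await Task.Delay(delay);
            }
        }

        throw last ?? new InvalidOperationException("maxAttempts must be at least 1");
    }
}
```

The point of the sketch is that a Handle predicate returning false on the very first failure ends the loop after a single attempt, which matches the behavior in the logs above.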

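The activity function prepends "Activity function 'SomeActivityFunc' failed: " to the message, so the exact comparison ex.Message == RETRY_ERROR_MESSAGE never matches and Handle returns false. Either create a custom exception type to throw and check for that type, or use .Contains, or compare against the full string "Activity function 'SomeActivityFunc' failed: RESPONSE_FILE_NOT_YET_AVAILABLE".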

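        // Option 1: exact match on the prefixed message
        Handle = ex => ex.Message == "Activity function 'SomeActivityFunc' failed: " + RETRY_ERROR_MESSAGE
        // Option 2: substring match, independent of the prefix
        Handle = ex => ex.Message.Contains(RETRY_ERROR_MESSAGE)
        // Option 3: match on a custom exception type
        Handle = ex => ex is SomeCustomExceptionType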
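
As a quick check of why the original predicate fails, the following stand-alone sketch evaluates the exact-match and .Contains predicates against a message with the prefix described above. It uses a plain System.Exception as a stand-in and takes the prefix format from this answer as an assumption; HandleDemo is a hypothetical name, and no Durable Functions types are involved.

```csharp
using System;

public static class HandleDemo
{
    private const string RETRY_ERROR_MESSAGE = "RESPONSE_FILE_NOT_YET_AVAILABLE";

    public static void Main()
    {
        // Assumed shape of the message the Handle predicate receives.
        var wrapped = new Exception(
            "Activity function 'SomeActivityFunc' failed: " + RETRY_ERROR_MESSAGE);

        // Predicate from the question: exact equality with the bare message.
        Func<Exception, bool> exact = ex => ex.Message == RETRY_ERROR_MESSAGE;

        // Predicate suggested here: substring match.
        Func<Exception, bool> contains = ex => ex.Message.Contains(RETRY_ERROR_MESSAGE);

        Console.WriteLine(exact(wrapped));    // prints "False"
        Console.WriteLine(contains(wrapped)); // prints "True"
    }
}
```

If you are unsure what message your own function app produces, logging ex.Message inside the Handle delegate is a simple way to confirm it before settling on a predicate.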