When running Hangfire background tasks on IIS with HttpClient, the task is immediately cancelled. Why?

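This is a bit of a frustrating one: it's one of those situations where running on localhost there is no issue, but after deploying to IIS, thread exceptions start creeping in.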

Anyhow, I'm using Hangfire v1.7.11 with SQL Server as the backing store.

The job in question is set up as:

    await Task.Run(() =>
        _jobClient.AddOrUpdate<ILiveDataService>(
            notification.BmUnitGuidId.ToString(),
            d => d.UpdateBmUnit(notification.BmUnitGuidId, CancellationToken.None),
            "* * * * *"),
        cancellationToken);

The important part here is CancellationToken.None, which is passed in as per the Hangfire documentation.

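The ILiveDataService is set up with an HttpClient through HttpClientFactory in my startup.cs file; I've just substituted IDummyClient here. This should do the generic setup of the base URI and authentication headers. There is also a transient HTTP error policy to handle flaky connections.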

    services.AddHttpClient<IDummyClient, DummyClient>(
        c =>
        {
            c.Timeout = TimeSpan.FromMilliseconds(500);
            c.BaseAddress = new Uri(Configuration["DummyClient:Url"]);
            var authInfo = Convert.ToBase64String(Encoding.GetEncoding("ISO-8859-1").GetBytes(Configuration["Dummy:User"] + ":" + Configuration["Dummy:Password"]));
            c.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", authInfo);
        })
        .AddTransientHttpErrorPolicy(builder => builder.WaitAndRetryAsync(new[]
        {
            TimeSpan.FromSeconds(1),
            TimeSpan.FromSeconds(5),
            TimeSpan.FromSeconds(10)
        }));

In DummyClient, the method being called is:

    public async Task<KeyValuePair<DateTime, double?>> GetValues(string name, CancellationToken cancellationToken)
    {
        var dateFrom = RoundUp(this.DateTimeUtc, TimeSpan.FromMinutes(1));

        using var response = await this._httpClient.GetAsync(
                $"{paramterisedurl}",
                HttpCompletionOption.ResponseHeadersRead,
                cancellationToken);

        var stream = await response.Content.ReadAsStreamAsync();

        if (response.IsSuccessStatusCode)
        {
             var xmlDocument = new XmlDocument();
             xmlDocument.Load(stream);

             // Process horrendous XML response - it's too ugly to share :-)

             return new KeyValuePair<DateTime, double?>(default, default);        
        }

        var content = await StreamToStringAsync(stream);

        throw new ApiException
        {
            StatusCode = (int)response.StatusCode,
            Content = content
        };
    }

As far as I can tell from the exception message in Hangfire, the job is being terminated during the GetAsync() call. The trace from Hangfire is as follows:

System.Threading.Tasks.TaskCanceledException
The operation was canceled.
System.Threading.Tasks.TaskCanceledException: The operation was canceled.
   at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean allowHttp2, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func`3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates`1 shouldRetryResultPredicates, Func`5 onRetryAsync, Int32 permittedRetryCount, IEnumerable`1 sleepDurationsEnumerable, Func`4 sleepDurationProvider, Boolean continueOnCapturedContext)
   at Polly.AsyncPolicy`1.ExecuteAsync(Func`3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext)
   at Microsoft.Extensions.Http.PolicyHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.FinishSendAsyncUnbuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
   at Infrastructure.Sentinel.SentinelClient.GetBoaPhysicalNotification(String bmUnitName, CancellationToken cancellationToken) in /home/vsts/work/1/s/src/Infrastructure/Sentinel/SentinelClient.cs:line 97
   at ApplicationCore.ApplicationServices.LiveDataService.LiveDataService.UpdateBmUnit(Guid bmUnitGuidId, CancellationToken cancellationToken) in /home/vsts/work/1/s/src/ApplicationCore/ApplicationServices/LiveDataService/LiveDataService.cs:line 81
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)

What I find unusual, though, is that the job information displayed by Hangfire shows the CancellationToken as null...

// Job ID: #140
using ApplicationCore.ApplicationServices.LiveDataService;

var liveDataService = Activate<ILiveDataService>();
await liveDataService.UpdateBmUnit(
    FromJson<Guid>("\"fa832ce4-b2a5-47d1-9b04-6ffb52fa0f30\""),
    null);

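I imagine there are a number of things here that could be causing the failure, but fundamentally it looks like the CancellationToken is not being passed into the method correctly, and as soon as it is checked, things unravel in ConnectAsync.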

As I said before, this doesn't happen on localhost... only when deployed.

Fundamentally, this was an issue of the production server not being authorised to make the same calls that localhost could.

However, the exception thrown by the client was masked by the generic exception being thrown, so it was a bit tricky to diagnose.
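A likely reason the failure looked so generic: when HttpClient.Timeout elapses (500 ms in the setup above), HttpClient cancels the request itself, and the caller sees a TaskCanceledException even though CancellationToken.None was passed. I can't confirm that is exactly what happened on the production box, but the behaviour is easy to reproduce. The sketch below is a minimal illustration only; NeverRespondingHandler and TimeoutDemo are hypothetical names, and the handler simply stands in for a server that never answers.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handler standing in for a server that never answers.
public sealed class NeverRespondingHandler : HttpMessageHandler
{
    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Waits until HttpClient cancels the request (here: when its Timeout elapses).
        await Task.Delay(Timeout.InfiniteTimeSpan, cancellationToken);
        return new HttpResponseMessage();
    }
}

public static class TimeoutDemo
{
    // Returns true if the request ended in an OperationCanceledException
    // although the caller never cancelled anything.
    public static async Task<bool> TimesOutAsCancellationAsync()
    {
        using var client = new HttpClient(new NeverRespondingHandler())
        {
            Timeout = TimeSpan.FromMilliseconds(500)
        };

        try
        {
            // The caller's token can never be cancelled.
            await client.GetAsync("http://localhost/", CancellationToken.None);
            return false;
        }
        catch (OperationCanceledException)
        {
            // TaskCanceledException derives from OperationCanceledException.
            return true;
        }
    }
}
```

So a cancellation in the trace does not necessarily mean the token passed to the job was cancelled; it can equally mean the request did not complete within the configured timeout.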

In the end, the diagnosis came from logging on to the production box and trying to run a basic HTTP request using curl.
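Since the trace fails inside ConnectAsync, a similar (narrower) check can be sketched in C#: just try to open a TCP connection to the target host from the machine in question. This is only a rough sketch, not what I actually ran; ConnectivityProbe and CanConnectAsync are hypothetical names, and unlike curl it does not send an HTTP request at all.

```csharp
using System;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class ConnectivityProbe
{
    // Hypothetical helper: true if a TCP connection to host:port opens within the timeout.
    public static async Task<bool> CanConnectAsync(string host, int port, TimeSpan timeout)
    {
        using var tcp = new TcpClient();
        try
        {
            var connect = tcp.ConnectAsync(host, port);
            var finished = await Task.WhenAny(connect, Task.Delay(timeout));
            if (finished != connect)
            {
                return false; // nothing answered within the timeout
            }

            await connect; // rethrows if the connection was refused
            return true;
        }
        catch (SocketException)
        {
            return false;
        }
    }
}
```

Running it once on localhost and once on the deployed server against the same host and port should show quickly whether the two environments can reach the API at all.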

Lesson number two is: assume nothing :-)