Azure Service Bus keeps throwing MessageLockLostExceptions

I keep getting MessageLockLostExceptions while processing messages.

Now I want to simulate a slightly longer-running message processing task (but still within the LockDuration) by adding Task.Delay(10_000). But then I get a MessageLockLostException on every fourth message.

This happens even when I set MaxAutoRenewDuration = TimeSpan.FromDays(30) and PrefetchCount = 0.


This is how messages are processed; I changed it slightly to print the remaining lock duration:

    private static async Task processMessagesAsync(Message message, CancellationToken token)
    {
        Console.Write($"Received message: {message.SystemProperties.SequenceNumber}. Remaining lock duration: {message.SystemProperties.LockedUntilUtc - DateTime.UtcNow}");
        await Task.Delay(10000);
        await queueClient.CompleteAsync(message.SystemProperties.LockToken);
        Console.WriteLine(" - Complete!");
    }

Sample output:

======================================================
Press ENTER key to exit after receiving all the messages.
======================================================
Received message: 3659174697238584. Remaining lock duration: 00:00:30.8269132 - Complete!
Received message: 19421773393035331. Remaining lock duration: 00:00:20.5271654 - Complete!
Received message: 11540474045136941. Remaining lock duration: 00:00:10.3372697 - Complete!
Received message: 15762598695796784. Remaining lock duration: 00:00:00.1776760
Message handler encountered an exception     Microsoft.Azure.ServiceBus.MessageLockLostException: The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue. Reference:2c6caac3-e607-4130-a522-f75e4636e130, TrackingId:3ff82738-664d-4aca-b55f-ba3900f1c640_B17, SystemTracker:ocgtesting:queue:workflow~63, Timestamp:2018-12-12T17:01:59
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnRenewLockAsync(String lockToken) in C:\source\azure-service-bus-dotnet\src\Microsoft.Azure.ServiceBus\Core\MessageReceiver.cs:line 1260
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<>c__DisplayClass74_0.<<RenewLockAsync>b__0>d.MoveNext() in C:\source\azure-service-bus-dotnet\src\Microsoft.Azure.ServiceBus\Core\MessageReceiver.cs:line 771
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout) in C:\source\azure-service-bus-dotnet\src\Microsoft.Azure.ServiceBus\RetryPolicy.cs:line 83
at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout) in C:\source\azure-service-bus-dotnet\src\Microsoft.Azure.ServiceBus\RetryPolicy.cs:line 105
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.RenewLockAsync(String lockToken) in C:\source\azure-service-bus-dotnet\src\Microsoft.Azure.ServiceBus\Core\MessageReceiver.cs:line 773
at Microsoft.Azure.ServiceBus.Core.MessageReceiver.RenewLockAsync(Message message) in C:\source\azure-service-bus-dotnet\src\Microsoft.Azure.ServiceBus\Core\MessageReceiver.cs:line 742
at Microsoft.Azure.ServiceBus.MessageReceivePump.RenewMessageLockTask(Message message, CancellationToken renewLockCancellationToken) in C:\source\azure-service-bus-dotnet\src\Microsoft.Azure.ServiceBus\MessageReceivePump.cs:line 248.

The full code is here: https://pastebin.com/sFGBgE0s

You need to Complete the message before the lock token expires. Once the lock token has expired, any operation on that message fails with a MessageLockLostException.

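I see that you delay the processing of each message by 10 seconds. But the messages appear to have been fetched at the same point in time, which is why the remaining lock duration keeps shrinking from one message to the next.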

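For the fourth message, the remaining lock duration is 00:00:00.1776760, so the lock expires after about 177 milliseconds. On the next line you delay the thread for 10 seconds, so the lock expires and you get a MessageLockLostException. To avoid this exception, remove the delay.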
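The arithmetic behind this explanation can be sketched without touching Service Bus at all. The snippet below is only an illustration: `LockSimulation` and its methods are hypothetical names, the 30-second lock duration is an assumption read off the sample output, and the model deliberately ignores automatic lock renewal (in the question, it is the renewal itself that fails).

```csharp
using System;

public static class LockSimulation
{
    // Remaining lock time when the handler starts on the message at `index` (0-based),
    // assuming every message was locked at the same instant and messages are
    // handled one at a time, each taking `processingTime`.
    public static TimeSpan RemainingLock(int index, TimeSpan lockDuration, TimeSpan processingTime)
    {
        TimeSpan waited = TimeSpan.FromTicks(processingTime.Ticks * index);
        TimeSpan remaining = lockDuration - waited;
        return remaining > TimeSpan.Zero ? remaining : TimeSpan.Zero;
    }

    // True if the lock is still valid when processing of that message finishes.
    public static bool CompletesInTime(int index, TimeSpan lockDuration, TimeSpan processingTime)
    {
        return RemainingLock(index, lockDuration, processingTime) >= processingTime;
    }

    public static void Main()
    {
        var lockDuration = TimeSpan.FromSeconds(30);   // assumed queue LockDuration
        var processingTime = TimeSpan.FromSeconds(10); // the Task.Delay in the handler

        for (int i = 0; i < 4; i++)
        {
            TimeSpan remaining = RemainingLock(i, lockDuration, processingTime);
            bool ok = CompletesInTime(i, lockDuration, processingTime);
            Console.WriteLine($"Message {i + 1}: remaining lock {remaining}, completes in time: {ok}");
        }
    }
}
```

Under these assumptions the first three messages still have at least 10 seconds of lock left when the handler reaches them, while the fourth starts with none, which matches the pattern in the sample output.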

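One thing missing from your repro is the queue description. It is important to include details like this, because the problem you are seeing has nothing to do with the client; it is most likely related to the broker or the underlying AMQP library.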

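With a non-partitioned queue, this setup works fine. It does not work with a partitioned queue (Standard tier). The behavior can be observed with both the new and the old client. I have raised a broker-related issue for the Azure Service Bus team to investigate.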
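To include the queue description in a repro, one option is to read it with the management client from the same Microsoft.Azure.ServiceBus package. This is a minimal sketch, assuming a package version that ships `Microsoft.Azure.ServiceBus.Management` and a connection string with Manage rights; `QueueInspector`, `connectionString` and `queueName` are placeholder names, not taken from the original code.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus.Management;

public static class QueueInspector
{
    // Prints the queue settings that matter for this problem.
    // connectionString and queueName are placeholders supplied by the caller.
    public static async Task PrintQueueSettingsAsync(string connectionString, string queueName)
    {
        var management = new ManagementClient(connectionString);
        try
        {
            QueueDescription queue = await management.GetQueueAsync(queueName);

            // Whether the queue is partitioned is the detail the repro was missing.
            Console.WriteLine($"EnablePartitioning: {queue.EnablePartitioning}");
            Console.WriteLine($"LockDuration: {queue.LockDuration}");
        }
        finally
        {
            await management.CloseAsync();
        }
    }
}
```

If `EnablePartitioning` is reported as true, the queue falls into the case described above, where the lock-lost behavior was observed.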