Azure Search .NET SDK: How to use "FindFailedActionsToRetry"?

Question

When you index documents with the Azure Search .NET SDK, the call may throw an IndexBatchException.

        try
        {
            var batch = IndexBatch.Upload(documents);
            indexClient.Documents.Index(batch);
        }
        catch (IndexBatchException e)
        {
            // Sometimes when your Search service is under load, indexing will fail for some of the documents in
            // the batch. Depending on your application, you can take compensating actions like delaying and
            // retrying. For this simple demo, we just log the failed document keys and continue.
            Console.WriteLine(
                "Failed to index some of the documents: {0}",
                String.Join(", ", e.IndexingResults.Where(r => !r.Succeeded).Select(r => r.Key)));
        }

How can e.FindFailedActionsToRetry be used to create a new batch that retries indexing only the failed actions?

I wrote a function like this:

    public void UploadDocuments<T>(SearchIndexClient searchIndexClient, IndexBatch<T> batch, int count) where T : class, IMyAppSearchDocument
    {
        try
        {
            searchIndexClient.Documents.Index(batch);
        }
        catch (IndexBatchException e)
        {
            if (count == 5) //we will try to index 5 times and give up if it still doesn't work.
            {
                throw new Exception("IndexBatchException: Indexing Failed for some documents.");
            }

            Thread.Sleep(5000); // We got an error; wait 5 seconds and try again (in case it's an intermittent or network issue).

            var retryBatch = e.FindFailedActionsToRetry<T>(batch, arg => arg.ToString());
            // Pass count + 1: count++ would pass the old value, so the retry limit would never be reached.
            UploadDocuments(searchIndexClient, retryBatch, count + 1);
        }
    }

But I think this part is wrong:

    var retryBatch = e.FindFailedActionsToRetry<T>(batch, arg => arg.ToString());

Answer 1

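The second parameter of FindFailedActionsToRetry, named keySelector, is a function that should return whichever property of your model type represents the document key. In your example, the model type is not known at compile time inside UploadDocuments, so you need to change UploadDocuments to also take a keySelector parameter and pass it through to FindFailedActionsToRetry. The caller of UploadDocuments then has to supply a lambda specific to type T. For example, if T is the sample Hotel class from the sample code in this article, the lambda must be hotel => hotel.HotelId, because HotelId is the property of Hotel that is used as the document key.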
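As a rough sketch of that change, the helper below takes the key selector from the caller and forwards it to FindFailedActionsToRetry. It assumes the Microsoft.Azure.Search NuGet package (the SDK whose types appear in the question). The class name IndexingHelper and the maxAttempts parameter are made up for illustration, and the constant 5-second wait is simply carried over from the question.

```csharp
using System;
using System.Linq;
using System.Threading;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public static class IndexingHelper
{
    // Hypothetical helper: sends the batch, then resends only the retriable
    // failed actions, making at most maxAttempts requests in total.
    public static void UploadDocuments<T>(
        ISearchIndexClient searchIndexClient,
        IndexBatch<T> batch,
        Func<T, string> keySelector,
        int maxAttempts = 5) where T : class
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                searchIndexClient.Documents.Index(batch);
                return; // Every action in the batch succeeded.
            }
            catch (IndexBatchException e)
            {
                if (attempt >= maxAttempts)
                {
                    throw; // Give up and surface the last IndexBatchException.
                }

                // keySelector tells the SDK how to read the document key from T,
                // so it can match failed results back to the original actions.
                batch = e.FindFailedActionsToRetry(batch, keySelector);

                if (!batch.Actions.Any())
                {
                    return; // Nothing retriable is left.
                }

                // Constant delay kept from the question, for simplicity only.
                Thread.Sleep(TimeSpan.FromSeconds(5));
            }
        }
    }
}
```

With the Hotel class mentioned above, a call might look like UploadDocuments(indexClient, IndexBatch.Upload(hotels), hotel => hotel.HotelId).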

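Incidentally, the wait in your catch block should not be a constant amount of time. If your search service is under heavy load, a constant delay does not really give it time to recover. Instead, we recommend backing off exponentially (for example, the first delay is 2 seconds, then 4 seconds, then 8 seconds, then 16 seconds, up to some maximum).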
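One way to compute such delays is sketched below. The Backoff class and the GetDelay method are hypothetical names, and the 60-second default cap is an assumption; choose whatever maximum suits your service.

```csharp
using System;

public static class Backoff
{
    // Delay before retry number retryAttempt (1-based): 2s, 4s, 8s, 16s, ...
    // The delay never exceeds maxSeconds (an assumed default of 60).
    public static TimeSpan GetDelay(int retryAttempt, double maxSeconds = 60)
    {
        if (retryAttempt < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(retryAttempt));
        }

        double seconds = Math.Min(Math.Pow(2, retryAttempt), maxSeconds);
        return TimeSpan.FromSeconds(seconds);
    }
}
```

In a retry loop, the constant wait would then become something like Thread.Sleep(Backoff.GetDelay(attempt)).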

Answer 2

I took the recommendations in the answer above and implemented them using Polly.

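Exponential backoff up to one minute, after which it retries every minute.
Retry as long as progress is being made. Time out after 5 requests without any progress.
IndexBatchException is also thrown for unknown documents. I chose to ignore such non-transient failures, since they likely indicate requests that are no longer relevant (e.g., the document was deleted in a separate request).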

    int curActionCount = work.Actions.Count();
    int noProgressCount = 0;

    await Polly.Policy
        .Handle<IndexBatchException>() // One or more of the actions has failed.
        .WaitAndRetryForeverAsync(
            // Exponential backoff (2s, 4s, 8s, 16s, ...) and constant delay after 1 minute.
            retryAttempt => TimeSpan.FromSeconds( Math.Min( Math.Pow( 2, retryAttempt ), 60 ) ),
            (ex, _) =>
            {
                var batchEx = ex as IndexBatchException;
                work = batchEx.FindFailedActionsToRetry( work, d => d.Id );

                // Verify whether any progress was made.
                int remainingActionCount = work.Actions.Count();
                if ( remainingActionCount == curActionCount ) ++noProgressCount;
                curActionCount = remainingActionCount;
            } )
        .ExecuteAsync( async () =>
        {
            // Limit retries if no progress is made after multiple requests.
            if ( noProgressCount > 5 )
            {
                throw new TimeoutException( "Updating Azure search index timed out." );
            }

            // Only retry if the error is transient (determined by FindFailedActionsToRetry).
            // IndexBatchException is also thrown for unknown document IDs;
            // consider them outdated requests and ignore.
            if ( curActionCount > 0 )
            {
                await _search.Documents.IndexAsync( work );
            }
        } );

