.NET HttpClient.GetStreamAsync() behaves differently from .GetAsync()

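I have been trying to track down a problem that occurs when my application downloads a batch of images.

If I download the batch with HttpClient.GetStreamAsync(url), some of the requests appear to time out and eventually fail.

However, if I use HttpClient.GetAsync(url), the whole batch downloads without any problem.

I suspect it has something to do with ports not being released when calling .GetStreamAsync(url), but I may be talking nonsense.

Below is a code snippet that demonstrates the problem.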

async Task Main()
{   
    HttpClient httpclient = new HttpClient();
    var imageUrl = "https://tenlives.com.au/wp-content/uploads/2020/09/Found-Kitten-0-8-Weeks-Busy-scaled.jpg";
    var downloadTasks = Enumerable.Range(0, 15)
                                .Select(async u =>
                                {
                                    try
                                    {
                                        //Option 1) - this will fail
                                        var stream = await httpclient.GetStreamAsync(imageUrl);
                                        //End Option 1)

                                        //Option 2) - this will succeed
                                        //var response = await httpclient.GetAsync(imageUrl);
                                        //response.EnsureSuccessStatusCode();
                                        //var stream = await response.Content.ReadAsStreamAsync();
                                        //End Option 2)

                                        return stream;
                                    }
                                    catch (Exception e)
                                    {
                                        Console.WriteLine($"Error downloading image: {e.Message}");
                                        throw;
                                    }
                                }).ToList();
    

    try
    {
        await Task.WhenAll(downloadTasks);
    }
    catch (Exception e)
    {       
        Console.WriteLine("================ Failed to download one or more image " + e.Message);
    }
    Console.WriteLine($"Successful downloads: {downloadTasks.Where(t => t.Status == TaskStatus.RanToCompletion).Count()}");
}

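In the code block inside the LINQ Select statement, Option 1) fails as described above. If you comment out Option 1) and uncomment Option 2), everything succeeds.

Can anyone explain what might be happening here?

EDIT: This appears to work in .NET Core. I can reproduce the problem on .NET Framework 4.7.2 and below.

EDIT2: I have also observed that if I raise the default connection limit by adding ServicePointManager.DefaultConnectionLimit = 30; the error no longer occurs, but that does not explain why Option 1) fails while Option 2) succeeds.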

As @RichardDeeming explained, HttpClient.GetStreamAsync calls HttpClient.GetAsync with HttpCompletionOption.ResponseHeadersRead.

The code can be rewritten as:

async Task Main()
{   
    HttpClient httpclient = new HttpClient();
    var imageUrl = "https://tenlives.com.au/wp-content/uploads/2020/09/Found-Kitten-0-8-Weeks-Busy-scaled.jpg";
    var downloadTasks = Enumerable.Range(0, 15)
        .Select(async u =>
        {
            try
            {
                //Option 1) - this will fail
                var response = await httpclient.GetAsync(imageUrl, HttpCompletionOption.ResponseHeadersRead);
                //End Option 1)

                //Option 2) - this will succeed
                //var response = await httpclient.GetAsync(imageUrl, HttpCompletionOption.ResponseContentRead);
                //End Option 2)

                response.EnsureSuccessStatusCode();
                var stream = await response.Content.ReadAsStreamAsync();
                return stream;
            }
            catch (Exception e)
            {
                Console.WriteLine($"Error downloading image: {e.Message}");
                throw;
            }
        }).ToList();

    try
    {
        await Task.WhenAll(downloadTasks);
    }
    catch (Exception e)
    {       
        Console.WriteLine("================ Failed to download one or more image " + e.Message);
    }
    Console.WriteLine($"Successful downloads: {downloadTasks.Where(t => t.Status == TaskStatus.RanToCompletion).Count()}");
}

HttpClient.GetAsync in turn calls HttpClient.SendAsync. The code of that method can be seen on GitHub:

// Code that is not relevant to the question has been removed
public Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationToken cancellationToken)
{
    TaskCompletionSource<HttpResponseMessage> tcs = new TaskCompletionSource<HttpResponseMessage>();
    client.SendAsync(request, cancellationToken).ContinueWith(task =>
    {
        HttpResponseMessage response = task.Result;
        if(completionOption == HttpCompletionOption.ResponseHeadersRead)
        {
            tcs.TrySetResult(response);
        }
        else
        {
            response.Content.LoadIntoBufferAsync(int.MaxValue).ContinueWith(contentTask =>
            {
                tcs.TrySetResult(response);
            });
        }
    });
    return tcs.Task;
}

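With HttpClient.GetAsync (or SendAsync with HttpCompletionOption.ResponseContentRead), the content received in the network buffer is read and consolidated into a local buffer before the call completes.

I am not sure about the network buffer, but I think a buffer somewhere (network card, OS, HttpClient, ???) fills up and blocks new responses. This would be consistent with EDIT2 in the question: with ResponseHeadersRead, each response keeps its connection in use until its stream is read to the end or disposed, so a low connection limit is quickly exhausted.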
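
The blocking effect can be pictured without any network access. The following is only a minimal sketch, not HttpClient's actual implementation: a SemaphoreSlim stands in for the connection limit mentioned in EDIT2 of the question, and the hypothetical FakeConnection and PoolDemo types (both invented for this illustration) stand in for a response stream that holds its connection until it is disposed.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a response stream: it holds one "connection"
// (a semaphore slot) until it is disposed.
public sealed class FakeConnection : IDisposable
{
    private readonly SemaphoreSlim _pool;
    private bool _released;

    public FakeConnection(SemaphoreSlim pool) { _pool = pool; }

    public void Dispose()
    {
        if (_released) return;
        _released = true;
        _pool.Release(); // give the slot back to the pool
    }
}

public static class PoolDemo
{
    // Returns how many of `requests` simulated requests obtained a connection
    // before timing out, with `limit` connections available in total.
    public static async Task<int> RunAsync(int requests, int limit, bool disposeEach)
    {
        var pool = new SemaphoreSlim(limit, limit);
        var timeout = TimeSpan.FromMilliseconds(200);

        var tasks = Enumerable.Range(0, requests).Select(async _ =>
        {
            // Wait for a free connection, giving up after the timeout.
            if (!await pool.WaitAsync(timeout)) return false;
            var connection = new FakeConnection(pool);
            if (disposeEach) connection.Dispose(); // hand the connection back
            return true;
        }).ToList();

        var results = await Task.WhenAll(tasks);
        return results.Count(ok => ok);
    }
}
```

Under these assumptions, PoolDemo.RunAsync(15, 2, disposeEach: false) lets only two requests through and the rest time out, while PoolDemo.RunAsync(15, 2, disposeEach: true) lets all fifteen complete.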

You can correct the code by managing this buffer properly, for example by disposing the associated stream:

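var downloadTasks = Enumerable.Range(0, 15)
.Select(async u =>
{
    try
    {
        // Disposing the network stream frees the buffer and the connection
        using (var stream = await httpclient.GetStreamAsync(imageUrl))
        {
            // Copy the content first, so the caller never receives a disposed stream
            var buffer = new MemoryStream();
            await stream.CopyToAsync(buffer);
            buffer.Position = 0;
            return buffer;
        }
    }
    catch (Exception e)
    {
        Console.WriteLine($"Error downloading image: {e.Message}");
        throw;
    }
}).ToList();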
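
The same idea can be packaged as a reusable helper. This is a hedged sketch rather than a definitive implementation: ImageDownloader.DownloadAsync and StubHandler are hypothetical names introduced for illustration. The helper requests the response with HttpCompletionOption.ResponseHeadersRead, copies the body into memory, and disposes the response before returning; the stub handler returns three fixed bytes so the helper can be tried without network access.

```csharp
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class ImageDownloader
{
    // Hypothetical helper: streams the body into memory, then disposes the
    // response (and with it the content stream) before returning.
    public static async Task<byte[]> DownloadAsync(HttpClient client, string url)
    {
        using (var response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            using (var body = await response.Content.ReadAsStreamAsync())
            using (var buffer = new MemoryStream())
            {
                await body.CopyToAsync(buffer);
                return buffer.ToArray();
            }
        }
    }
}

// Hypothetical stub so the helper can be exercised offline.
public sealed class StubHandler : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new ByteArrayContent(new byte[] { 1, 2, 3 })
        };
        return Task.FromResult(response);
    }
}
```

In the snippet above, the body of the Select lambda could then be reduced to return await ImageDownloader.DownloadAsync(httpclient, imageUrl); and the tasks would yield byte arrays instead of open streams.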

In .NET Core the original code works without any correction. The HttpClient class has been rewritten and improved.