HttpClient throws OutOfMemory exception when TransferEncodingChunked is not set

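To support uploading large (actually very large, up to several gigabytes) files with progress reporting, we started using HttpClient with PushStreamContent, as described here. It works in a straightforward way: we copy bytes between two streams. Here is a code sample: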

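    private void PushContent(Stream src, Stream dest, long length)
    {
        const int bufferLength = 1024 * 1024 * 10; // 10 MB copy buffer
        var buffer = new byte[bufferLength];
        long pos = 0; // long, because files can exceed int.MaxValue bytes
        while (pos < length)
        {
            var toRead = (int)Math.Min(bufferLength, length - pos);
            // Read may return fewer bytes than requested, so use its return value.
            var read = src.Read(buffer, 0, toRead);
            if (read == 0)
                break; // source ended before 'length' bytes were read
            dest.Write(buffer, 0, read);
            pos += read;
            dest.Flush();
            Console.WriteLine($"Transferred {pos} bytes");
        }
        dest.Close(); // closing the stream tells PushStreamContent the body is complete
    }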

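At first, however, this code threw an OutOfMemory exception after transferring 320 MB, even though the memory consumption of the process was not very high (about 500 MB). What fixed the problem was setting the TransferEncodingChunked flag: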

    request.Headers.TransferEncodingChunked = true;

With this flag set, we were not only able to transfer huge files, but memory consumption also dropped by 90%.
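For readers who want to see where the flag goes, here is a minimal sketch of building such a request. It is not the code from the post: `StreamingUploadContent` and `UploadRequests` are hypothetical names, the custom `HttpContent` subclass merely stands in for PushStreamContent so that the example depends only on System.Net.Http, and the 80 KB buffer size is an arbitrary choice.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical stand-in for PushStreamContent: it streams the source in small
// pieces and reports that its total length is unknown.
public sealed class StreamingUploadContent : HttpContent
{
    private readonly Stream _source;

    public StreamingUploadContent(Stream source)
    {
        _source = source;
    }

    protected override async Task SerializeToStreamAsync(Stream stream, TransportContext context)
    {
        var buffer = new byte[81920]; // assumed buffer size, small on purpose
        int read;
        while ((read = await _source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            await stream.WriteAsync(buffer, 0, read);
        }
    }

    protected override bool TryComputeLength(out long length)
    {
        // No Content-Length is advertised for this content.
        length = -1;
        return false;
    }
}

public static class UploadRequests
{
    // Builds a POST request whose body is streamed with chunked transfer encoding.
    public static HttpRequestMessage Create(Uri uploadUri, Stream file)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, uploadUri)
        {
            Content = new StreamingUploadContent(file)
        };
        request.Headers.TransferEncodingChunked = true; // the flag from the post
        return request;
    }
}
```

The request would then be sent with something like `await client.SendAsync(UploadRequests.Create(uri, fileStream))`, where `client` is an `HttpClient` and `uri` is whatever upload endpoint applies; both are placeholders here.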

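I haven't found any documentation saying that TransferEncodingChunked is required; it was more a matter of trial and error, but it seems crucial in this scenario. I am still puzzled about why the exception is thrown at all: memory consumption is not very high, so what causes it?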

Chunked transfer encoding

Chunked transfer encoding is a data transfer mechanism in version 1.1 of the Hypertext Transfer Protocol (HTTP) in which data is sent in a series of "chunks". It uses the Transfer-Encoding HTTP header in place of the Content-Length header, which the earlier version of the protocol would otherwise require. Because the Content-Length header is not used, the sender does not need to know the length of the content before it starts transmitting a response to the receiver. Senders can begin transmitting dynamically-generated content before knowing the total size of that content.

The size of each chunk is sent right before the chunk itself so that the receiver can tell when it has finished receiving data for that chunk. The data transfer is terminated by a final chunk of length zero.
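To make the framing described above concrete, here is a small sketch that writes a stream in chunked form: each chunk is preceded by its size in hexadecimal, and a zero-length chunk ends the body. `ChunkedEncoder` is a hypothetical helper written only for illustration; HttpClient does this framing internally, so this is not something the upload code needs to call.

```csharp
using System;
using System.IO;
using System.Text;

public static class ChunkedEncoder
{
    // Copies src to dest using HTTP/1.1 chunked framing:
    // <size in hex> CRLF <data> CRLF, repeated, then a zero-length chunk.
    public static void Encode(Stream src, Stream dest, int chunkSize)
    {
        var buffer = new byte[chunkSize]; // only one chunk is held in memory at a time
        int read;
        while ((read = src.Read(buffer, 0, buffer.Length)) > 0)
        {
            WriteAscii(dest, read.ToString("X") + "\r\n"); // chunk size in hexadecimal
            dest.Write(buffer, 0, read);                    // chunk data
            WriteAscii(dest, "\r\n");
        }
        WriteAscii(dest, "0\r\n\r\n"); // terminating chunk, no trailer fields
    }

    private static void WriteAscii(Stream dest, string text)
    {
        var bytes = Encoding.ASCII.GetBytes(text);
        dest.Write(bytes, 0, bytes.Length);
    }
}
```

For example, encoding the nine ASCII bytes of `Wikipedia` from a `MemoryStream` with a chunk size of 4 produces the chunks `Wiki`, `pedi`, and `a`, followed by the terminating `0` chunk.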

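Thinking about it logically, the file is sent in small chunks, which means that once you are done with a chunk, it can be released from memory. In the end your memory consumption is lower because you only ever handle one small chunk at a time.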