DeflateStream.ReadAsync (.NET 4.5 System.IO.Compression) returns a different number of bytes read than the equivalent Read method?

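While converting some older code to use async in C#, I started seeing problems caused by variations in the return values of the Read() and ReadAsync() methods of DeflateStream.

I thought that converting synchronous code like

bytesRead = deflateStream.Read(buffer, 0, uncompressedSize);

to the equivalent async version

bytesRead = await deflateStream.ReadAsync(buffer, 0, uncompressedSize);

should always return the same value.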


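See the updated code added at the bottom of the question, which uses the streams the correct way and therefore makes the initial question irrelevant.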


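I found that after many iterations this does not hold true, and in my specific case it was causing random errors in the converted application.

Am I missing something?

Below is a simple repro case (in a console app), where the Assert breaks for me in the ReadAsync method on iteration #412, giving output that looks like this: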

....
ReadAsync #410 - 2055 bytes read
ReadAsync #411 - 2055 bytes read
ReadAsync #412 - 453 bytes read
---- DEBUG ASSERTION FAILED ----

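My question is: why does the DeflateStream.ReadAsync method return 453 bytes at this point?

Note: this only happens with certain input strings. The massive StringBuilder block in CreateProblemDataString was the best way I could think of to construct the string for this post.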

using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading.Tasks;

class Program
{
    static byte[] DataAsByteArray;
    static int uncompressedSize;

    static void Main(string[] args)
    {
        string problemDataString = CreateProblemDataString();
        DataAsByteArray = Encoding.ASCII.GetBytes(problemDataString);
        uncompressedSize = DataAsByteArray.Length;
        MemoryStream memoryStream = new MemoryStream();
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Compress, true))
        {
            for (int i = 0; i < 1000; i++)
            {
                deflateStream.Write(DataAsByteArray, 0, uncompressedSize);
            }
        }

        // now read it back synchronously
        Read(memoryStream);

        // now read it back asynchronously
        Task retval = ReadAsync(memoryStream);
        retval.Wait();
    }

    static void Read(MemoryStream memoryStream)
    {
        memoryStream.Position = 0;
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
        {
            byte[] buffer = new byte[uncompressedSize];
            int bytesRead = -1;
            int i = 0;
            while (bytesRead > 0 || bytesRead == -1)
            {
                bytesRead = deflateStream.Read(buffer, 0, uncompressedSize);
                System.Diagnostics.Debug.WriteLine("Read #{0} - {1} bytes read", i, bytesRead);
                System.Diagnostics.Debug.Assert(bytesRead == 0 || bytesRead == uncompressedSize);
                i++;
            }
        }
    }

    static async Task ReadAsync(MemoryStream memoryStream)
    {
        memoryStream.Position = 0;
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
        {
            byte[] buffer = new byte[uncompressedSize];
            int bytesRead = -1;
            int i = 0;
            while (bytesRead > 0 || bytesRead == -1)
            {
                bytesRead = await deflateStream.ReadAsync(buffer, 0, uncompressedSize);
                System.Diagnostics.Debug.WriteLine("ReadAsync #{0} - {1} bytes read", i, bytesRead);
                System.Diagnostics.Debug.Assert(bytesRead == 0 || bytesRead == uncompressedSize);
                i++;
            }
        }
    }

    /// <summary>
    /// This is one of the strings of data that was causing issues. 
    /// </summary>
    /// <returns></returns>
    static string CreateProblemDataString()
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("0601051081                                      ");
        sb.Append("                                                                       ");
        sb.Append("                         225021         0300420");
        sb.Append("34056064070072076361102   13115016017");
        sb.Append("5      192         230237260250   2722");
        sb.Append("73280296      326329332   34535535");
        sb.Append("7   3                                                                  ");
        sb.Append("                                                                    4");
        sb.Append("                                                                             ");
        sb.Append("                                                         50");
        sb.Append("6020009      030034045   063071076   360102   13");
        sb.Append("1152176160170   208206      23023726025825027227328");
        sb.Append("2283285   320321333335341355357   622005009      0");
        sb.Append("34053      060070      361096   130151176174178172208");
        sb.Append("210198   235237257258256275276280290293   3293");
        sb.Append("30334   344348350                                                     ");
        sb.Append("                                                         ");
        sb.Append("                                           ");
        sb.Append("                                                                                   ");
        sb.Append("                                     225020012014   046042044034061");
        sb.Append("075078   361098   131152176160170   208195210   230");
        sb.Append("231260257258271272283306      331332336   3443483");
        sb.Append("54    29                                                           ");
        sb.Append("                                                                      ");
        sb.Append("                                                   2");
        sb.Append("5      29                                                06      0");
        sb.Append("1                                                            178      17");
        sb.Append("4                                                   205                     2");
        sb.Append("05      195                                                   2");
        sb.Append("31                     231      23");
        sb.Append("7                                       01              01    0");
        sb.Append("2                                              260                     26");
        sb.Append("2                                                            274                     2");
        sb.Append("72      274                                       01              01    0");
        sb.Append("3           1   5      3 6     43 52    ");
        return sb.ToString();
    }
}

Updated code to correctly read the stream into the buffer

The output now looks like this:

...
ReadAsync #410 - 2055 bytes read
ReadAsync #411 - 2055 bytes read
ReadAsync PARTIAL #412 - 453 bytes read, offset for next read = 453
ReadAsync #412 - 1602 bytes read
ReadAsync #413 - 2055 bytes read
...


    static void Read(MemoryStream memoryStream)
    {
        memoryStream.Position = 0;
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
        {
            byte[] buffer = new byte[uncompressedSize]; // buffer to hold known fixed size record.
            int bytesRead; // number of bytes read from Read operation
            int offset = 0; // offset for writing into buffer
            int i = -1; // counter to track iteration #
            while ((bytesRead = deflateStream.Read(buffer, offset, uncompressedSize - offset)) > 0)
            {
                offset += bytesRead;  // offset in buffer for results of next reading
                System.Diagnostics.Debug.Assert(offset <= uncompressedSize, "should never happen - because would mean more bytes read than requested.");
                if (offset == uncompressedSize) // buffer full, complete fixed size record in buffer.
                {
                    offset = 0; // buffer is now filled, next read to start at beginning of buffer again.
                    i++; // increment counter that tracks iteration #
                    System.Diagnostics.Debug.WriteLine("Read #{0} - {1} bytes read", i, bytesRead);
                }
                else // buffer still not full
                {
                    System.Diagnostics.Debug.WriteLine("Read PARTIAL #{0} - {1} bytes read, offset for next read = {2}", i+1, bytesRead, offset);
                }
            }
        }
    }

    static async Task ReadAsync(MemoryStream memoryStream)
    {
        memoryStream.Position = 0;
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
        {
            byte[] buffer = new byte[uncompressedSize]; // buffer to hold known fixed size record.
            int bytesRead; // number of bytes read from Read operation
            int offset = 0; // offset for writing into buffer
            int i = -1; // counter to track iteration #
            while ((bytesRead = await deflateStream.ReadAsync(buffer, offset, uncompressedSize - offset)) > 0)
            {
                offset += bytesRead;  // offset in buffer for results of next reading
                System.Diagnostics.Debug.Assert(offset <= uncompressedSize, "should never happen - because would mean more bytes read than requested.");
                if (offset == uncompressedSize) // buffer full, complete fixed size record in buffer.
                {
                    offset = 0; // buffer is now filled, next read to start at beginning of buffer again.
                    i++; // increment counter that tracks iteration #
                    System.Diagnostics.Debug.WriteLine("ReadAsync #{0} - {1} bytes read", i, bytesRead);
                }
                else // buffer still not full
                {
                    System.Diagnostics.Debug.WriteLine("ReadAsync PARTIAL #{0} - {1} bytes read, offset for next read = {2}", i+1, bytesRead, offset);
                }
            }
        }
    }

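Damien's comment is exactly right. However, your mistake is a common one, and IMHO the question deserves an actual answer, if for no other reason than to help others who make the same mistake find the answer more easily.

So, to be clear: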

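For all stream-oriented I/O methods in .NET where you provide a byte[] buffer and the method returns the number of bytes read, the only assumptions you can make about that number are:

  1. The number will not be larger than the maximum number of bytes you asked to read (i.e. the value passed to the method as the count of bytes to read).
  2. The number will be non-negative, and it will be greater than 0 as long as there is in fact data remaining to be read (0 is returned when you reach the end of the stream).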

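When reading with any of these methods, you cannot even count on the same method always returning the same number of bytes (depending on context, this is obviously deterministic in some cases, but you still should not rely on it). Nor is there any guarantee that different methods, even ones reading from the same source, will always return the same number of bytes as each other.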

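It is up to the caller to read the bytes as a stream, taking into account the return value that specifies the number of bytes read by each call, and to reassemble those bytes in whatever way is appropriate for that particular stream of bytes.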
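
As a minimal sketch of that reassembly, a small helper can keep reading until the requested count has arrived or the stream ends. The names `StreamHelpers` and `ReadFullyAsync` are hypothetical, not part of the .NET API; the sketch relies only on `Stream.ReadAsync(byte[], int, int)`.

```csharp
using System.IO;
using System.Threading.Tasks;

static class StreamHelpers
{
    // Hypothetical helper: calls ReadAsync repeatedly until 'count' bytes
    // have been placed in 'buffer' (starting at 'offset') or the stream ends.
    // Returns the number of bytes actually read, which is less than 'count'
    // only when the end of the stream was reached first.
    public static async Task<int> ReadFullyAsync(Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            // Each call may return fewer bytes than requested.
            int bytesRead = await stream.ReadAsync(buffer, offset + total, count - total);
            if (bytesRead == 0)
            {
                break; // end of stream
            }
            total += bytesRead;
        }
        return total;
    }
}
```

In the question's loop, `await StreamHelpers.ReadFullyAsync(deflateStream, buffer, 0, uncompressedSize)` would then be expected to return `uncompressedSize` for every complete record and 0 at the end, assuming the stream contains only whole records.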

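Note that when dealing with Stream objects, you can use the Stream.CopyTo() method. Of course, it only copies to another Stream object. But in many cases the destination object can be used without treating it as a Stream. For example, you may just want to write the data to a file, or you may want to copy it to a MemoryStream and then use the MemoryStream.ToArray() method to turn it into an array of bytes (which you can then access without any concern about how many bytes were read in a given read operation; by the time you get to the array, all of them have been read :)).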
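
A rough sketch of the MemoryStream variant follows. The class name `DeflateHelpers` and the method `DecompressToArray` are hypothetical names chosen for illustration; the calls used (`DeflateStream`, `Stream.CopyTo`, `MemoryStream.ToArray`) are the standard ones mentioned above.

```csharp
using System.IO;
using System.IO.Compression;

static class DeflateHelpers
{
    // Hypothetical helper: decompresses everything from the current position
    // of 'compressed' and returns it as one byte array. CopyTo performs the
    // read loop internally, so partial reads never reach the caller.
    public static byte[] DecompressToArray(Stream compressed)
    {
        // leaveOpen: true so the caller's stream is not disposed here.
        using (var deflateStream = new DeflateStream(compressed, CompressionMode.Decompress, true))
        using (var output = new MemoryStream())
        {
            deflateStream.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

This trades memory for simplicity: the whole decompressed payload is held in memory at once, so for the fixed-size records in the question the array would then be sliced every `uncompressedSize` bytes.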