DeflateStream.ReadAsync (.NET 4.5 System.IO.Compression) returns a different number of bytes read than the equivalent Read method?
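While converting some older code to use async in C#, I started seeing discrepancies between the values returned by the Read() and ReadAsync() methods of DeflateStream.
I thought that moving from synchronous code like
bytesRead = deflateStream.Read(buffer, 0, uncompressedSize);
to its equivalent asynchronous version,
bytesRead = await deflateStream.ReadAsync(buffer, 0, uncompressedSize);
should always return the same value.
Update: see the updated code added at the bottom of the question. It uses the stream the correct way, which makes the initial question moot.
I found that after a number of iterations this did not hold true, and in my specific case it was causing random errors in the converted application.
Am I missing something here?
Below is a simple repro case (a console app) in which the Assert breaks for me in the ReadAsync method on iteration #412, giving output that looks like this: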
....
ReadAsync #410 - 2055 bytes read
ReadAsync #411 - 2055 bytes read
ReadAsync #412 - 453 bytes read
---- DEBUG ASSERTION FAILED ----
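My question is: why does the DeflateStream.ReadAsync method return 453 bytes at this point?
Note: this only happens with certain input strings. The large block of StringBuilder calls in CreateProblemDataString was the best way I could think of to construct the string for this post.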
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading.Tasks;

class Program
{
    static byte[] DataAsByteArray;
    static int uncompressedSize;

    static void Main(string[] args)
    {
        string problemDataString = CreateProblemDataString();
        DataAsByteArray = Encoding.ASCII.GetBytes(problemDataString);
        uncompressedSize = DataAsByteArray.Length;
        MemoryStream memoryStream = new MemoryStream();
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Compress, true))
        {
            for (int i = 0; i < 1000; i++)
            {
                deflateStream.Write(DataAsByteArray, 0, uncompressedSize);
            }
        }
        // now read it back synchronously
        Read(memoryStream);
        // now read it back asynchronously
        Task retval = ReadAsync(memoryStream);
        retval.Wait();
    }

    static void Read(MemoryStream memoryStream)
    {
        memoryStream.Position = 0;
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
        {
            byte[] buffer = new byte[uncompressedSize];
            int bytesRead = -1;
            int i = 0;
            while (bytesRead > 0 || bytesRead == -1)
            {
                bytesRead = deflateStream.Read(buffer, 0, uncompressedSize);
                System.Diagnostics.Debug.WriteLine("Read #{0} - {1} bytes read", i, bytesRead);
                System.Diagnostics.Debug.Assert(bytesRead == 0 || bytesRead == uncompressedSize);
                i++;
            }
        }
    }

    static async Task ReadAsync(MemoryStream memoryStream)
    {
        memoryStream.Position = 0;
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
        {
            byte[] buffer = new byte[uncompressedSize];
            int bytesRead = -1;
            int i = 0;
            while (bytesRead > 0 || bytesRead == -1)
            {
                bytesRead = await deflateStream.ReadAsync(buffer, 0, uncompressedSize);
                System.Diagnostics.Debug.WriteLine("ReadAsync #{0} - {1} bytes read", i, bytesRead);
                System.Diagnostics.Debug.Assert(bytesRead == 0 || bytesRead == uncompressedSize);
                i++;
            }
        }
    }

    /// <summary>
    /// This is one of the strings of data that was causing issues.
    /// </summary>
    /// <returns></returns>
    static string CreateProblemDataString()
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("0601051081 ");
        sb.Append(" ");
        sb.Append(" 225021 0300420");
        sb.Append("34056064070072076361102 13115016017");
        sb.Append("5 192 230237260250 2722");
        sb.Append("73280296 326329332 34535535");
        sb.Append("7 3 ");
        sb.Append(" 4");
        sb.Append(" ");
        sb.Append(" 50");
        sb.Append("6020009 030034045 063071076 360102 13");
        sb.Append("1152176160170 208206 23023726025825027227328");
        sb.Append("2283285 320321333335341355357 622005009 0");
        sb.Append("34053 060070 361096 130151176174178172208");
        sb.Append("210198 235237257258256275276280290293 3293");
        sb.Append("30334 344348350 ");
        sb.Append(" ");
        sb.Append(" ");
        sb.Append(" ");
        sb.Append(" 225020012014 046042044034061");
        sb.Append("075078 361098 131152176160170 208195210 230");
        sb.Append("231260257258271272283306 331332336 3443483");
        sb.Append("54 29 ");
        sb.Append(" ");
        sb.Append(" 2");
        sb.Append("5 29 06 0");
        sb.Append("1 178 17");
        sb.Append("4 205 2");
        sb.Append("05 195 2");
        sb.Append("31 231 23");
        sb.Append("7 01 01 0");
        sb.Append("2 260 26");
        sb.Append("2 274 2");
        sb.Append("72 274 01 01 0");
        sb.Append("3 1 5 3 6 43 52 ");
        return sb.ToString();
    }
}
Updated code that reads the stream into the buffer correctly.
The output now looks like this:
...
ReadAsync #410 - 2055 bytes read
ReadAsync #411 - 2055 bytes read
ReadAsync PARTIAL #412 - 453 bytes read, offset for next read = 453
ReadAsync #412 - 1602 bytes read
ReadAsync #413 - 2055 bytes read
...
static void Read(MemoryStream memoryStream)
{
    memoryStream.Position = 0;
    using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
    {
        byte[] buffer = new byte[uncompressedSize]; // buffer to hold known fixed size record.
        int bytesRead; // number of bytes read from Read operation
        int offset = 0; // offset for writing into buffer
        int i = -1; // counter to track iteration #
        while ((bytesRead = deflateStream.Read(buffer, offset, uncompressedSize - offset)) > 0)
        {
            offset += bytesRead; // offset in buffer for results of next reading
            System.Diagnostics.Debug.Assert(offset <= uncompressedSize, "should never happen - because would mean more bytes read than requested.");
            if (offset == uncompressedSize) // buffer full, complete fixed size record in buffer.
            {
                offset = 0; // buffer is now filled, next read to start at beginning of buffer again.
                i++; // increment counter that tracks iteration #
                System.Diagnostics.Debug.WriteLine("Read #{0} - {1} bytes read", i, bytesRead);
            }
            else // buffer still not full
            {
                System.Diagnostics.Debug.WriteLine("Read PARTIAL #{0} - {1} bytes read, offset for next read = {2}", i+1, bytesRead, offset);
            }
        }
    }
}

static async Task ReadAsync(MemoryStream memoryStream)
{
    memoryStream.Position = 0;
    using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
    {
        byte[] buffer = new byte[uncompressedSize]; // buffer to hold known fixed size record.
        int bytesRead; // number of bytes read from Read operation
        int offset = 0; // offset for writing into buffer
        int i = -1; // counter to track iteration #
        while ((bytesRead = await deflateStream.ReadAsync(buffer, offset, uncompressedSize - offset)) > 0)
        {
            offset += bytesRead; // offset in buffer for results of next reading
            System.Diagnostics.Debug.Assert(offset <= uncompressedSize, "should never happen - because would mean more bytes read than requested.");
            if (offset == uncompressedSize) // buffer full, complete fixed size record in buffer.
            {
                offset = 0; // buffer is now filled, next read to start at beginning of buffer again.
                i++; // increment counter that tracks iteration #
                System.Diagnostics.Debug.WriteLine("ReadAsync #{0} - {1} bytes read", i, bytesRead);
            }
            else // buffer still not full
            {
                System.Diagnostics.Debug.WriteLine("ReadAsync PARTIAL #{0} - {1} bytes read, offset for next read = {2}", i+1, bytesRead, offset);
            }
        }
    }
}
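Damien's comments are exactly right. However, your mistake is a common one, and IMHO the question deserves an actual answer, if for no other reason than to help others who make the same mistake find the answer more easily.
So, to be clear:
As with all stream-oriented I/O methods in .NET where the caller provides a byte[] buffer and the method returns the number of bytes read, the only assumptions you can make about that number are:
- It will not be larger than the maximum number of bytes you asked to read (i.e. the count passed to the method).
- It will be non-negative, and it will be greater than 0 as long as there is in fact data remaining to be read (0 is returned when you reach the end of the stream).
When reading with any of these methods, you cannot even count on the same method always returning the same number of bytes (depending on context; in some cases this is in fact deterministic, but you should still not rely on it), and there is no guarantee of any kind that different methods, even ones reading from the same source, will return the same number of bytes as one another.
It is up to the caller to read the bytes as a stream, taking into account the return value that gives the number of bytes read by each call, and to reassemble those bytes in whatever way is appropriate for that particular stream of bytes.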
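To make that reassembly concrete, here is a minimal sketch of a helper that keeps reading until a fixed-size record is complete. The class name StreamReadHelpers and the method name ReadFullyAsync are made up for illustration; they are not framework APIs in .NET 4.5. The only framework call used is Stream.ReadAsync(byte[], int, int).

```csharp
using System.IO;
using System.Threading.Tasks;

static class StreamReadHelpers
{
    // Hypothetical helper (not a framework API): calls ReadAsync repeatedly
    // until 'count' bytes have been stored in 'buffer' starting at 'offset',
    // or until the stream ends. Returns the number of bytes actually stored.
    public static async Task<int> ReadFullyAsync(Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            // Each call may return fewer bytes than requested, so continue
            // from where the previous call stopped.
            int n = await stream.ReadAsync(buffer, offset + total, count - total);
            if (n == 0)
            {
                break; // end of stream; the caller decides whether a short record is an error
            }
            total += n;
        }
        return total;
    }
}
```

In the question's loop this would presumably be used as `while (await StreamReadHelpers.ReadFullyAsync(deflateStream, buffer, 0, uncompressedSize) == uncompressedSize) { ... }`, so that each iteration sees one complete record regardless of how the underlying reads were split.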
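Note that when dealing with Stream objects, you can use the Stream.CopyTo() method. Of course, it only copies to another Stream object. But in many cases the destination can be used without treating it as a Stream. For example, you may just want to write the data to a file, or you may want to copy it to a MemoryStream and then use MemoryStream.ToArray() to turn it into a byte array (which you can then access without any concern about how many bytes were read in a given read operation; by the time you get to the array, all of them have been read :) ).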
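As a rough sketch of that approach applied to the DeflateStream case from the question: the class name DeflateHelpers and the method name DecompressAllAsync are invented for this example, and it uses CopyToAsync (the asynchronous counterpart of CopyTo, available since .NET 4.5) on the assumption that the whole decompressed payload fits comfortably in memory.

```csharp
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

static class DeflateHelpers
{
    // Hypothetical helper: inflates everything from the current position of
    // 'compressed' and returns the decompressed bytes as a single array.
    public static async Task<byte[]> DecompressAllAsync(Stream compressed)
    {
        // leaveOpen: true, so the caller keeps ownership of 'compressed'.
        using (var inflater = new DeflateStream(compressed, CompressionMode.Decompress, true))
        using (var result = new MemoryStream())
        {
            // CopyToAsync loops over the partial reads internally.
            await inflater.CopyToAsync(result);
            return result.ToArray();
        }
    }
}
```

With fixed-size records like those in the question, record number k would then start at index k * uncompressedSize in the returned array. For very large payloads, a read loop that processes one record at a time is likely the better fit, since this version holds everything in memory at once.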