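DATA_ERROR when reading zlib/miniz deflated data
I am writing a simple C++ wrapper around the zlib compression implementation of miniz-cpp. Deflation works, but now I have problems when inflating the data again.
Code
I have a test case which (greatly simplified) boils down to: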
ByteArray randomData = createRandomData(1024 * 1024);
ByteArray deflatedBytes = deflate(randomData);
writeToTmpFile(deflatedBytes); // for manual review
ByteArray inflatedBytes = inflate(deflatedBytes);
assert(randomData == inflatedBytes);
I am stuck on a DATA_ERROR (-3), which is returned when I inflate my data again.
This is the function where the problem occurs:
// inflates the next <size> bytes and stores them in <out[]>
// stores the actually written amount in <written>
ResultCode Inflator::inflate(uint8_t out[], size_t size, size_t& written)
{
    zStream.next_out = out;
    zStream.avail_out = static_cast<unsigned int>(size);

    // loop until output buffer is completely filled
    while (zStream.avail_out != 0) {
        if (zStream.avail_in == 0) {
            // our Inflator stores a ByteArrayInputStream from which
            // we request more data
            size_t read = iStream.read(in, BUFFER_SIZE);
            if (iStream.err()) {
                return ResultCode::STREAM_ERROR;
            }
            zStream.next_in = in;
            zStream.avail_in = static_cast<unsigned int>(read);
        }

        // THIS IS WHERE WE ACTUALLY CALL INFLATE.
        // RESULT CODE -3 (DATA_ERROR) IS RETURNED AFTER READING
        // ONLY 13 BYTES.
        ResultCode result{mz_inflate(&zStream, Flushing::NONE)};
        if (result == ResultCode::STREAM_END) {
            written = size - zStream.avail_out;
            this->eof_ = true;
            return ResultCode::OK;
        }
        else if (result != ResultCode::OK) {
            return result;
        }
    }

    written = size - zStream.avail_out;
    return ResultCode::OK;
}
My Data
I have verified in the debugger that the data I read in is correct. The next_in data in zStream, which is an mz_stream, is valid zlib-encoded data. At least it starts with 0x78.
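To go one step beyond "it starts with 0x78", the two-byte header can be checked against the rules in RFC 1950. This is only a minimal sketch: looksLikeZlibHeader, hasPresetDictionary and checkDumpHeader are hypothetical helper names of mine, not part of miniz or zlib.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of miniz or zlib.
// Checks the two-byte zlib header as described in RFC 1950:
//   CMF: low nibble  = compression method (8 = deflate)
//        high nibble = log2(window size) - 8 (at most 7)
//   FLG: chosen so that (CMF * 256 + FLG) is a multiple of 31
inline bool looksLikeZlibHeader(std::uint8_t cmf, std::uint8_t flg)
{
    const unsigned method = cmf & 0x0Fu;
    const unsigned windowInfo = cmf >> 4;
    const unsigned check = (static_cast<unsigned>(cmf) << 8) | flg;
    return method == 8 && windowInfo <= 7 && check % 31 == 0;
}

// True if FLG announces a preset dictionary (FDICT, bit 5).
inline bool hasPresetDictionary(std::uint8_t flg)
{
    return (flg & 0x20u) != 0;
}

// Usage with the first two bytes of my dump (78 01).
inline void checkDumpHeader()
{
    assert(looksLikeZlibHeader(0x78, 0x01));
    assert(!hasPresetDictionary(0x01));
}
```

My stream starts with 78 01, which passes both checks, so the header itself does not look like the culprit.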
As mentioned in the pseudo-code, I also dump the data to disk. This data can be read back just fine using:
# this command is included in the qpdf package and uncompresses zlib streams
zlib-flate -uncompress < 'mve_deflOutput.zlib' > 'mve_deflOutput.bin'
Here is also a hex dump of the first bytes:
00000000: 7801 a4dd fb7f cfe5 1b07 7072 c8a9 9632 x.........pr...2
00000010: 49ac 9043 9a4c 392c 462c 4d88 8ab0 4a7c I..C.L9,F,M...J|
00000020: 55d6 2c6d 6921 0931 34ad 4db5 8898 1c26 U.,mi!.14.M....&
00000030: 3a88 ce69 51d9 6a52 94a8 302d d252 6ba6 :..iQ.jR..0-.Rk.
00000040: 84a2 b2ef 9ff0 fce1 be7f dd63 dbe7 f37e ...........c...~
00000050: dff7 75bd aed7 eb75 5df7 11ac 4358 3763 ..u....u]...CX7c
00000060: b5c0 ea88 550f ab33 d62e aca7 b132 b116 ....U..3.....2..
00000070: 611d c73a 8935 096b 0d56 05d6 8758 a3b1 a..:.5.k.V...X..
00000080: f0ef d7b4 c5c2 bfff f027 2c3c be93 b5b1 .........',<....
00000090: cec2 1a80 7531 5612 d68d 583b b0d6 622d ....u1V...X;..b-
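For anyone reading the dump by hand: the byte right after the two-byte zlib header begins the first deflate block, and RFC 1951 packs its header into the lowest three bits (BFINAL first, then the two BTYPE bits). The sketch below decodes just that. BlockHeader, decodeFirstBlockHeader and checkDumpBlock are names I made up for illustration, and the decoding only applies to the first block, because later blocks need not start on a byte boundary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type, for illustration only.
struct BlockHeader {
    bool isFinal;   // BFINAL: set on the last block of the stream
    unsigned type;  // BTYPE: 0 = stored, 1 = fixed Huffman,
                    //        2 = dynamic Huffman, 3 = reserved (invalid)
};

// Decodes the 3-bit header of the FIRST deflate block, which starts at
// bit 0 of the byte that follows the zlib header.
inline BlockHeader decodeFirstBlockHeader(std::uint8_t firstByte)
{
    BlockHeader header;
    header.isFinal = (firstByte & 0x01u) != 0;
    header.type = (firstByte >> 1) & 0x03u;
    return header;
}

// Usage with the third byte of my dump (a4).
inline void checkDumpBlock()
{
    const BlockHeader header = decodeFirstBlockHeader(0xa4);
    assert(!header.isFinal);
    assert(header.type == 2);
}
```

In my dump that byte is a4, so the stream opens with a non-final dynamic Huffman block.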
The Error
For whatever reason, the call to mz_inflate, which emulates zlib's inflate, returns DATA_ERROR (-3). The total_in field in zStream is set to 13, so it looks like only 13 bytes were read before the error occurred.
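For context on the numeric value: my ResultCode enum mirrors the zlib-style return codes, which miniz also uses. The mapping below is a small sketch for logging. resultCodeName and checkMyError are hypothetical helpers of mine, not functions from miniz.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps zlib-style return codes to readable names.
// The numeric values are those of zlib's Z_* constants.
inline std::string resultCodeName(int code)
{
    switch (code) {
    case 0:  return "OK";
    case 1:  return "STREAM_END";
    case 2:  return "NEED_DICT";
    case -1: return "ERRNO";
    case -2: return "STREAM_ERROR";
    case -3: return "DATA_ERROR";
    case -4: return "MEM_ERROR";
    case -5: return "BUF_ERROR";
    case -6: return "VERSION_ERROR";
    default: return "UNKNOWN";
    }
}

// Usage with the code I am getting back.
inline void checkMyError()
{
    assert(resultCodeName(-3) == "DATA_ERROR");
}
```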
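To summarize: if the compressed data is fine and can be extracted with zlib-flate, why can't miniz read it? After all, miniz wrote this data itself. If something is wrong with the first 13 bytes, I can't see what it is.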
For reference, here is the full code of the Inflator and the test.