Is it necessary to use multiple gzip members for input larger than 4GB?

From idzip's own description, stating:

Features:

  • no 4GB limit

...

Idzip just uses multiple gzip members to have no file size limit.

the author of idzip seems to imply that multiple gzip members are needed to support more than 4GB of data.
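For context, here is a minimal Python sketch of what "multiple gzip members" means in practice. It only uses the standard `gzip` module and is not idzip's actual code; the payloads are made up for illustration.

```python
import gzip

# Two independent gzip members, each with its own header and trailer.
member_a = gzip.compress(b"first part, ")
member_b = gzip.compress(b"second part")

# Concatenating members yields a valid gzip file; decompressing it
# returns the concatenation of the members' payloads.
combined = member_a + member_b
restored = gzip.decompress(combined)
```

A tool that splits its input this way keeps each member's payload small, but that says nothing about whether a single member could have held everything.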

But the deflate algorithm, whose output a gzip member merely wraps with a header and footer, evidently supports input larger than 4GB.
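To make the "wraps with a header and footer" part concrete, here is a rough Python sketch that builds one gzip member by hand around a raw deflate stream. The function name `wrap_deflate_as_gzip` is hypothetical, and the header values (no optional fields, MTIME of 0, OS set to "unknown") are just assumptions chosen to keep it short.

```python
import gzip
import struct
import zlib

def wrap_deflate_as_gzip(payload: bytes) -> bytes:
    """Build a single gzip member by hand around a raw deflate stream."""
    # wbits=-15 asks zlib for raw deflate output with no zlib/gzip framing.
    compressor = zlib.compressobj(level=6, method=zlib.DEFLATED, wbits=-15)
    deflated = compressor.compress(payload) + compressor.flush()

    # 10-byte header: ID1, ID2, CM=8 (deflate), FLG=0, MTIME=0, XFL=0, OS=255.
    header = struct.pack("<BBBBIBB", 0x1F, 0x8B, 8, 0, 0, 0, 255)
    # 8-byte trailer: CRC32 and ISIZE (length modulo 2^32), little-endian.
    trailer = struct.pack(
        "<II", zlib.crc32(payload) & 0xFFFFFFFF, len(payload) & 0xFFFFFFFF
    )
    return header + deflated + trailer

data = b"hello gzip " * 1000
member = wrap_deflate_as_gzip(data)
round_trip = gzip.decompress(member)
```

Nothing in the header carries a length, and the only size-related field in the trailer is masked to 32 bits, which is why the framing itself does not look like it imposes a 4GB cap.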

So is it really necessary to use multiple gzip members to compress more than 4GB of data?

Even .NET's GZipStream, which does not support multiple members (contrary to the spec, btw), nevertheless supports gzip files larger than 4GB, now that the underlying DeflateStream supports them (since .NET 4.0).

That should settle it: multiple gzip members are not required for input larger than 4GB.

The gzip spec does not limit the size either:

  Each member has the following structure:

     +---+---+---+---+---+---+---+---+---+---+
     |ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
     +---+---+---+---+---+---+---+---+---+---+

... [omitting optional headers]

     +=======================+
     |...compressed blocks...| (more-->)
     +=======================+

       0   1   2   3   4   5   6   7
     +---+---+---+---+---+---+---+---+
     |     CRC32     |     ISIZE     |
     +---+---+---+---+---+---+---+---+

     ISIZE (Input SIZE)
        This contains the size of the original (uncompressed) input
        data modulo 2^32.

The key point here is

size of original (uncompressed) input data modulo 2^32.
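As a small illustration of the modulo, the Python sketch below reads the trailer of a gzip member and computes what ISIZE would hold for a larger input. The helper names `read_trailer` and `expected_isize` are made up for this example, and the 5 GiB figure is only arithmetic, not an actual compression run.

```python
import gzip
import struct
import zlib

def read_trailer(member: bytes) -> tuple:
    """Return (CRC32, ISIZE) from the last 8 bytes of a single gzip member."""
    return struct.unpack("<II", member[-8:])

def expected_isize(uncompressed_size: int) -> int:
    """ISIZE stores the uncompressed size modulo 2^32, so it wraps around."""
    return uncompressed_size % 2**32

payload = b"x" * 1000
crc32, isize = read_trailer(gzip.compress(payload))

# For a hypothetical 5 GiB input, ISIZE would wrap around to 1 GiB.
wrapped = expected_isize(5 * 2**30)
```

So for a member holding more than 4GB, ISIZE is no longer the true size, only its low 32 bits; the format tolerates this rather than forbidding it.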