What GZip extra field subfields exist?
RFC 1952 (GZIP file format specification), section 2.3.1.1, reads:
2.3.1.1. Extra field
If the FLG.FEXTRA bit is set, an "extra field" is present in
the header, with total length XLEN bytes. It consists of a
series of subfields, each of the form:
+---+---+---+---+==================================+
|SI1|SI2|  LEN  |... LEN bytes of subfield data ...|
+---+---+---+---+==================================+
SI1 and SI2 provide a subfield ID, typically two ASCII letters
with some mnemonic value. Jean-Loup Gailly
<email@hidden> is maintaining a registry of subfield
IDs; please send him any subfield ID you wish to use. Subfield
IDs with SI2 = 0 are reserved for future use. The following
IDs are currently defined:
SI1 SI2 Data
---------- ---------- ----
0x41 ('A') 0x70 ('P') Apollo file type information
LEN gives the length of the subfield data, excluding the 4
initial bytes.
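To make the layout quoted above concrete, here is a minimal Python sketch that splits an extra field into its subfields. It assumes the caller has already extracted the XLEN bytes of the extra field from the gzip header; the function name `parse_extra_subfields` and the sample payload are hypothetical and chosen only for illustration. LEN is read as a 16-bit little-endian value, since RFC 1952 stores multi-byte numbers least-significant byte first.

```python
import struct


def parse_extra_subfields(extra: bytes):
    """Split a gzip extra field into (subfield_id, data) pairs.

    `extra` is assumed to be exactly the XLEN bytes that follow the
    XLEN count in the gzip header (hypothetical helper, not part of
    any library).
    """
    subfields = []
    pos = 0
    while pos < len(extra):
        if pos + 4 > len(extra):
            raise ValueError("truncated subfield header")
        # SI1, SI2 are single bytes; LEN is a 16-bit little-endian integer.
        si1, si2, length = struct.unpack_from("<BBH", extra, pos)
        pos += 4
        if pos + length > len(extra):
            raise ValueError("subfield data runs past the end of the extra field")
        subfields.append((bytes([si1, si2]), extra[pos:pos + length]))
        pos += length
    return subfields


# Illustrative input: an 'Ap' subfield carrying three made-up data bytes.
example = b"Ap" + struct.pack("<H", 3) + b"\x01\x02\x03"
for subfield_id, data in parse_extra_subfields(example):
    print(subfield_id, data)
```

Returning the ID as a two-byte `bytes` object keeps unknown IDs intact, which matters here because no complete list of registered IDs is available.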
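Are there any subfield types other than the AP one given in the RFC? A web search turns up no list, and there is no mention of one on the Wikipedia page for GZip, the GNU home page, in the gzip source code, or on Stack Overflow.
As far as I know, no such registry is maintained. Jean-loup no longer works on gzip.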
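Here is another subfield that is in use:
The BGZF format, developed for use in bioinformatics (and conforming to the gzip format), uses the subfield type "BC" to indicate the size of the current block. This makes parallel decompression easy.
From the specification at http://samtools.github.io/hts-specs/SAMv1.pdf: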
Each BGZF block contains a standard gzip file header with the following standard-compliant extensions:
- The F.EXTRA bit in the header is set to indicate that extra fields are present.
- The extra field used by BGZF uses the two subfield ID values 66 and 67 (ASCII ‘BC’).
- The length of the BGZF extra field payload (field LEN in the gzip specification) is 2 (two bytes of payload).
- The payload of the BGZF extra field is a 16-bit unsigned integer in little endian format. This integer
gives the size of the containing BGZF block minus one.
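As a rough illustration of how a reader could use the "BC" subfield, the Python sketch below reads the fixed 10-byte gzip header, walks the extra field, and returns the total block size (the stored value plus one). The function name `bgzf_block_size` is hypothetical, and the sample bytes are a hand-built header for illustration rather than output from any tool.

```python
import struct

FEXTRA = 0x04  # FLG bit indicating that an extra field is present


def bgzf_block_size(block: bytes) -> int:
    """Return the total size of a BGZF block from its 'BC' subfield.

    `block` must start at the gzip header of the block
    (hypothetical helper, not part of any library).
    """
    if len(block) < 12:
        raise ValueError("too short to hold a gzip header with XLEN")
    id1, id2, _cm, flg = struct.unpack_from("<BBBB", block, 0)
    if (id1, id2) != (0x1F, 0x8B):
        raise ValueError("not a gzip member")
    if not flg & FEXTRA:
        raise ValueError("FEXTRA bit is not set")
    # XLEN follows the 10 fixed header bytes (ID1, ID2, CM, FLG, MTIME, XFL, OS).
    (xlen,) = struct.unpack_from("<H", block, 10)
    extra = block[12:12 + xlen]
    if len(extra) < xlen:
        raise ValueError("extra field is truncated")
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, length = struct.unpack_from("<BBH", extra, pos)
        data = extra[pos + 4:pos + 4 + length]
        # BGZF uses subfield ID values 66 and 67 ('BC') with a 2-byte payload.
        if (si1, si2) == (66, 67) and length == 2 and len(data) == 2:
            (bsize,) = struct.unpack("<H", data)
            return bsize + 1  # payload stores the block size minus one
        pos += 4 + length
    raise ValueError("no 'BC' subfield found")


# Illustrative header: FLG = FEXTRA, XLEN = 6, one 'BC' subfield with value 27.
header = (
    b"\x1f\x8b\x08\x04" + b"\x00" * 4 + b"\x00\xff"
    + struct.pack("<H", 6) + b"BC" + struct.pack("<HH", 2, 27)
)
print(bgzf_block_size(header))
```

Because each block announces its own length, a reader can jump from one block header to the next without inflating anything, which is what lets blocks be handed to separate workers.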