Lua: How to gzip a string (gzip, not zlib) in memory?

Given a string, how can I compress it in memory with gzip? I'm using Lua.


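This sounds like an easy problem, but there is a huge number of libraries. Everything I have tried so far either did not work or produced only zlib-compressed strings. My use case needs gzip, because that is what the receiver expects.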

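As a test: if you dump the compressed string to a file, zcat should be able to decompress it.

I'm using OpenResty, so any Lua library should be fine.

(The only solution that has worked so far is to dump the string to a file, call os.execute("gzip /tmp/example.txt"), and read the result back. Unfortunately, that is not practical.)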

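It turns out that zlib is not far from gzip. Both wrap the same deflate-compressed data; the difference is that gzip uses its own header (and a CRC-32 trailer) instead of the zlib one.

To get this header, you can use lua-zlib like this: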

local zlib = require "zlib"

-- input:  string
-- output: string compressed with gzip
local function compress(str)
   local level = 5
   -- 15 is the largest window (2^15 bytes); adding 16 asks zlib for a gzip wrapper
   local windowSize = 15 + 16
   return zlib.deflate(level, windowSize)(str, "finish")
end

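Explanation:

  • The second parameter of deflate is the window size. Adding 16 to it makes zlib write a gzip header (and trailer). If you omit the parameter, you get a zlib-compressed string.
  • level is the gzip compression level (1 = fastest, weakest compression; 9 = slowest, best compression)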
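A quick way to tell which wrapper you ended up with is to look at the first bytes of the output. The sketch below is plain Lua and does not need lua-zlib; looks_like_gzip is a hypothetical helper name, not part of any library. It relies on the gzip format starting with the bytes 0x1f 0x8b, followed by compression method 8 (deflate).

```lua
-- Hypothetical helper (not part of lua-zlib): returns true if s starts with
-- the gzip magic bytes 0x1f 0x8b followed by compression method 8 (deflate).
local function looks_like_gzip(s)
   if type(s) ~= "string" or #s < 3 then
      return false
   end
   local id1, id2, method = s:byte(1, 3)
   return id1 == 0x1f and id2 == 0x8b and method == 8
end
```

Calling looks_like_gzip on the result of the compress function above should return true when the window size is 15+16. A zlib-wrapped string typically starts with the byte 0x78 instead, so the check returns false for it.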

Here is the documentation for deflate (source: lua-zlib documentation):

function stream = zlib.deflate([ int compression_level ], [ int window_size ])

If no compression_level is provided uses Z_DEFAULT_COMPRESSION (6),
compression level is a number from 1-9 where zlib.BEST_SPEED is 1
and zlib.BEST_COMPRESSION is 9.

Returns a "stream" function that compresses (or deflates) all
strings passed in.  Specifically, use it as such:

string deflated, bool eof, int bytes_in, int bytes_out =
        stream(string input [, 'sync' | 'full' | 'finish'])

    Takes input and deflates and returns a portion of it,
    optionally forcing a flush.

    A 'sync' flush will force all pending output to be flushed to
    the return value and the output is aligned on a byte boundary,
    so that the decompressor can get all input data available so
    far.  Flushing may degrade compression for some compression
    algorithms and so it should be used only when necessary.

    A 'full' flush will flush all output as with 'sync', and the
    compression state is reset so that decompression can restart
    from this point if previous compressed data has been damaged
    or if random access is desired. Using Z_FULL_FLUSH too often
    can seriously degrade the compression. 

    A 'finish' flush will force all pending output to be processed
    and results in the stream become unusable.  Any future
    attempts to print anything other than the empty string will
    result in an error that begins with IllegalState.

    The eof result is true if 'finish' was specified, otherwise
    it is false.

    The bytes_in is how many bytes of input have been passed to
    stream, and bytes_out is the number of bytes returned in
    deflated string chunks.
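Going by the description above, the same stream function can be fed several pieces of input before it is finished, which is handy when the data arrives in chunks. The following is only a sketch under that reading of the documentation; gzip_chunks is a hypothetical helper name, and it assumes lua-zlib is installed and loadable as "zlib".

```lua
local zlib = require "zlib"

-- Hypothetical helper: gzip a list of string chunks with a single deflate stream.
-- level is optional and defaults to 6 here.
local function gzip_chunks(chunks, level)
   -- 15 + 16: maximum window size plus the flag that selects the gzip wrapper
   local stream = zlib.deflate(level or 6, 15 + 16)
   local out = {}
   for _, chunk in ipairs(chunks) do
      -- without a flush option the stream may return only part of the output
      -- (possibly an empty string) while zlib buffers the rest
      out[#out + 1] = stream(chunk)
   end
   -- "finish" flushes whatever is pending and writes the gzip trailer
   out[#out + 1] = stream("", "finish")
   return table.concat(out)
end
```

For example, gzip_chunks({"hello, ", "world"}) should produce one gzip member containing "hello, world". Collecting the pieces in a table and joining them once with table.concat avoids repeated string concatenation in the loop.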