Decompress zlib stream in Clojure
I have a binary file whose contents were created with zlib.compress in Python. Is there an easy way to open and decompress it in Clojure?
import zlib
import json
with open('data.json.zlib', 'wb') as f:
    f.write(zlib.compress(json.dumps(data).encode('utf-8')))
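Basically it is not a gzip file, it is just the bytes representing the deflated data.
I could only find these references, but none is quite what I'm looking for (I think the first two are the most relevant):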
- deflateclj_hatemogi_clojure/deflate.clj
- funcool/buddy-core/deflate.clj
- Compressing / Decompressing strings in clojure
- Reading and Writing Compressed Files
- clj-http
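Do I really have to implement this multi-line wrapper around java.util.zip, or is there a nice library out there? Actually I'm not even sure whether these byte streams are compatible across libraries, or whether I'm just trying to mix and match the wrong libraries.
Steps in Python: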
>>> '{"hello": "world"}'.encode('utf-8')
b'{"hello": "world"}'
>>> zlib.compress(b'{"hello": "world"}')
b'x\x9c\xabV\xcaH\xcd\xc9\xc9W\xb2RP*\xcf/\xcaIQ\xaa\x05\x009\x99\x06\x17'
>>> [int(i) for i in zlib.compress(b'{"hello": "world"}')]
[120, 156, 171, 86, 202, 72, 205, 201, 201, 87, 178, 82, 80, 42, 207, 47, 202, 73, 81, 170, 5, 0, 57, 153, 6, 23]
>>> import numpy
>>> [numpy.int8(i) for i in zlib.compress(b'{"hello": "world"}')]
[120, -100, -85, 86, -54, 72, -51, -55, -55, 87, -78, 82, 80, 42, -49, 47, -54, 73, 81, -86, 5, 0, 57, -103, 6, 23]
>>> zlib.decompress(bytes([120, 156, 171, 86, 202, 72, 205, 201, 201, 87, 178, 82, 80, 42, 207, 47, 202, 73, 81, 170, 5, 0, 57, 153, 6, 23])).decode('utf-8')
'{"hello": "world"}'
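For reference, the signed list is the same byte sequence seen the way a JVM byte array stores it (-128..127). A minimal sketch of that mapping without numpy, using only the standard library; the helper names to_signed and to_unsigned are illustrative and not part of any library:

```python
import zlib

def to_signed(data):
    """Map each byte (0..255) to the signed range (-128..127) used by JVM bytes."""
    return [b - 256 if b > 127 else b for b in data]

def to_unsigned(values):
    """Inverse mapping: signed values back to a bytes object."""
    return bytes(v & 0xFF for v in values)

compressed = zlib.compress(b'{"hello": "world"}')
signed = to_signed(compressed)

# The round trip is lossless, so both lists describe the same zlib stream.
restored = to_unsigned(signed)
print(zlib.decompress(restored).decode('utf-8'))
```

Since the mapping is lossless, the signed representation by itself should not be what makes decoding fail on the Clojure side.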
Decoding attempt in Clojure:
; https://github.com/funcool/buddy-core/blob/master/src/buddy/util/deflate.clj#L40 without try-catch
(ns so.core
  (:import java.io.ByteArrayInputStream
           java.io.ByteArrayOutputStream
           java.util.zip.Deflater
           java.util.zip.DeflaterOutputStream
           java.util.zip.InflaterInputStream
           java.util.zip.Inflater
           java.util.zip.ZipException)
  (:gen-class))

(defn uncompress
  "Given a compressed data as byte-array, uncompress it and return as an other byte array."
  ([^bytes input] (uncompress input nil))
  ([^bytes input {:keys [nowrap buffer-size]
                  :or {nowrap true buffer-size 2048}
                  :as opts}]
   (let [buf (byte-array (int buffer-size))
         os (ByteArrayOutputStream.)
         inf (Inflater. ^Boolean nowrap)]
     (with-open [is (ByteArrayInputStream. input)
                 iis (InflaterInputStream. is inf)]
       (loop []
         (let [readed (.read iis buf)]
           (when (pos? readed)
             (.write os buf 0 readed)
             (recur)))))
     (.toByteArray os))))

(uncompress (byte-array [120, -100, -85, 86, -54, 72, -51, -55, -55, 87, -78, 82, 80, 42, -49, 47, -54, 73, 81, -86, 5, 0, 57, -103, 6, 23]))
ZipException invalid stored block lengths java.util.zip.InflaterInputStream.read (InflaterInputStream.java:164)
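Any help would be appreciated. I don't want to use zip or gzip files, because I only care about the raw content, not file names or modification dates. If it is the only option, though, using a different compression algorithm on the Python side is possible.
Here is a simple way to do it using gzip:
Python code: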
import gzip
content = "the quick brown fox"
with gzip.open('fox.txt.gz', 'wb') as f:
    f.write(content.encode('utf-8'))  # binary mode needs bytes in Python 3
Clojure code:
(with-open [in (java.util.zip.GZIPInputStream.
                 (clojure.java.io/input-stream
                   "fox.txt.gz"))]
  (println "result:" (slurp in)))
;=> result: the quick brown fox
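Keep in mind that "gzip" here means the compression format (a deflate stream with a small header and a checksum), not the "gzip" command-line tool; you do not need the tool to use the format.
Note that the input on the Clojure side does not have to be a file. You can send the gzip-compressed data as raw bytes over a socket and still decompress it in Clojure. Full details: https://clojuredocs.org/clojure.java.io/input-stream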
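To illustrate that no file is required, here is a minimal sketch of the producing side working purely in memory. It assumes Python 3 and only the standard gzip module; the variable names are illustrative:

```python
import gzip

payload = "the quick brown fox".encode("utf-8")

# gzip.compress returns the complete gzip stream as bytes; no file is involved.
blob = gzip.compress(payload)

# Every gzip stream starts with the magic bytes 0x1f 0x8b.
print(blob[:2] == b"\x1f\x8b")

# These bytes could be sent over a socket and decoded on the receiving side.
print(gzip.decompress(blob).decode("utf-8"))
```

On the Clojure side the same bytes could be wrapped in a ByteArrayInputStream (or read from the socket's input stream) before being handed to GZIPInputStream.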
Update
If you need to use the plain zlib format instead of gzip, the result is very similar:
Python code:
import zlib
with open('balloon.txt.z', 'wb') as fp:
    # zlib.compress needs bytes, not str, in Python 3
    fp.write(zlib.compress(b'the big red baloon'))
Clojure code:
(with-open [in (java.util.zip.InflaterInputStream.
                 (clojure.java.io/input-stream
                   "balloon.txt.z"))]
  (println "result:" (slurp in)))
;=> result: the big red baloon
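As for why the attempt in the question fails: zlib.compress emits a zlib container (a 2-byte header, the deflate body, and a 4-byte Adler-32 trailer), while an Inflater created with nowrap set to true expects a raw deflate body with no header. The sketch below reproduces the same mismatch in Python with the bytes from the question, on the assumption that a negative wbits value in Python's zlib module is a fair stand-in for nowrap:

```python
import zlib

# The exact bytes from the question (the output of zlib.compress).
data = bytes([120, 156, 171, 86, 202, 72, 205, 201, 201, 87, 178, 82, 80,
              42, 207, 47, 202, 73, 81, 170, 5, 0, 57, 153, 6, 23])

# zlib container: 2-byte header, deflate body, 4-byte Adler-32 trailer.
cmf, flg = data[0], data[1]
header_ok = (cmf & 0x0F) == 8 and (cmf * 256 + flg) % 31 == 0
print("zlib header present:", header_ok)

# The default wbits expects that container, so this succeeds.
text = zlib.decompress(data).decode('utf-8')
print(text)

# Negative wbits means raw deflate with no header, comparable to nowrap=true.
try:
    zlib.decompress(data, -15)
    raw_failed = False
except zlib.error as exc:
    raw_failed = True
    print("raw decode failed:", exc)

# Stripping header and trailer by hand leaves a valid raw deflate body.
body_text = zlib.decompress(data[2:-4], -15).decode('utf-8')
print(body_text)
```

The InflaterInputStream constructor used above takes a default Inflater, which expects the zlib header, and that is why the short version works. Calling the question's function with {:nowrap false} should behave the same way, although that variant is untested here.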