Compress in Java, decompress in Python - snappy/redis-py-cluster
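I am writing a cron script in Python for a Redis cluster, using redis-py-cluster only to read data from a production server. A separate Java application writes to the Redis cluster using Snappy compression and the Java string codec with UTF-8.
I can read the data, but I cannot decode it.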
from rediscluster import RedisCluster
import snappy
host, port = "127.0.0.1", "30001"
startup_nodes = [{"host": host, "port": port}]
print("Trying connecting to redis cluster host=" + host + ", port=" + str(port))
rc = RedisCluster(startup_nodes=startup_nodes, max_connections=32, decode_responses=True)
print("Connected", rc)
print("Reading all keys, value ...\n\n")
for key in rc.scan_iter("uidx:*"):
    value = rc.get(key)
    #uncompress = snappy.uncompress(value, decoding="utf-8")
    print(key, value)
    print('\n')
print("Done. exit()")
exit()
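decode_responses=False works fine with the uncompress line commented out. Changing to decode_responses=True, however, throws an error. My guess is that it is not able to pick the correct decoder.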
Traceback (most recent call last):
  File "splooks_cron.py", line 22, in <module>
    print(key, rc.get(key))
  File "/Library/Python/2.7/site-packages/redis/client.py", line 1207, in get
    return self.execute_command('GET', name)
  File "/Library/Python/2.7/site-packages/rediscluster/utils.py", line 101, in inner
    return func(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/rediscluster/client.py", line 410, in execute_command
    return self.parse_response(r, command, **kwargs)
  File "/Library/Python/2.7/site-packages/redis/client.py", line 768, in parse_response
    response = connection.read_response()
  File "/Library/Python/2.7/site-packages/redis/connection.py", line 636, in read_response
    raise e
UnicodeDecodeError: 'utf8' codec can't decode byte 0x82 in position 0: invalid start byte
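The error above can be reproduced without Redis. The sketch below assumes only what the message states, namely that the stored value starts with byte 0x82; the rest of the payload is a made-up placeholder. It suggests the client is trying to decode compressed binary data as text, which is why reading raw bytes (decode_responses=False) and decoding only after decompression is the safer order.

```python
# Placeholder value: only the leading 0x82 byte comes from the error message,
# the remainder is arbitrary filler for illustration.
raw = b"\x82" + b"compressed payload"

try:
    raw.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    # 0x82 falls in the UTF-8 continuation-byte range (0x80-0xBF),
    # so it can never be the first byte of a character.
    decodable = False

print(decodable)
```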
PS: Uncommenting the line uncompress = snappy.uncompress(value, decoding="utf-8") breaks with this error:
Traceback (most recent call last):
  File "splooks_cron.py", line 27, in <module>
    uncompress = snappy.uncompress(value, decoding="utf-8")
  File "/Library/Python/2.7/site-packages/snappy/snappy.py", line 91, in uncompress
    return _uncompress(data).decode(decoding)
snappy.UncompressError: Error while decompressing: invalid input
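After a few hours of debugging, I was finally able to solve this.
I am using the xerial/snappy-java compressor in the Java code that writes to the Redis cluster. Interestingly, during compression xerial's SnappyOutputStream adds some offset (a header) at the beginning of the compressed data. In my case it looks something like this: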
"\x82SNAPPY\x00\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x01\xb6\x8b\x06\******actual data here*****
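Because of this, the decompressor could not make sense of the input. I modified the code as follows to strip the offset from the value, and now everything works fine.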
# rc must return raw bytes here, i.e. be created with decode_responses=False
for key in rc.scan_iter("uidx:*"):
    value = rc.get(key)
    # in my case the offset was 20, and utf-8 is the default encoder/decoder for snappy
    # https://github.com/andrix/python-snappy/blob/master/snappy/snappy.py
    uncompress_value = snappy.decompress(value[20:])
    print(key, uncompress_value)
    print('\n')
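The hard-coded 20 can be derived from the bytes shown above rather than guessed. The sketch below assumes the layout those bytes suggest: an 8-byte magic ("\x82SNAPPY\x00"), two 4-byte integers (both 1 in the example, presumably version fields), then one or more chunks that are each prefixed by a 4-byte big-endian length (0x000001b6 in the example). That gives 8 + 4 + 4 + 4 = 20 bytes before the first raw Snappy block. The function names are hypothetical, and the layout is inferred from the sample rather than taken from a specification, so treat it as a starting point.

```python
import struct

# Layout assumed from the sample bytes (not from a spec):
# 8-byte magic, 4-byte int, 4-byte int, then repeated
# [4-byte big-endian length][raw snappy block] chunks.
MAGIC = b"\x82SNAPPY\x00"
HEADER_SIZE = len(MAGIC) + 4 + 4  # 16 bytes before the first chunk length


def split_xerial_stream(blob):
    """Return the raw snappy blocks found in a SnappyOutputStream-style blob."""
    if blob[:len(MAGIC)] != MAGIC:
        raise ValueError("missing SNAPPY magic header")
    chunks = []
    pos = HEADER_SIZE
    while pos < len(blob):
        if pos + 4 > len(blob):
            raise ValueError("truncated chunk length")
        (size,) = struct.unpack(">i", blob[pos:pos + 4])
        pos += 4
        if size < 0 or pos + size > len(blob):
            raise ValueError("chunk length does not fit the data")
        chunks.append(blob[pos:pos + size])
        pos += size
    return chunks


def decompress_xerial(blob, decompress):
    """Decompress each block with the supplied function and join the results."""
    return b"".join(decompress(chunk) for chunk in split_xerial_stream(blob))
```

With python-snappy installed, the loop body could then call decompress_xerial(value, snappy.decompress).decode("utf-8") instead of slicing at a fixed offset. Unlike value[20:], this also checks the magic bytes and copes with a value that was written as more than one chunk.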