Django: Store a dictionary in a compressed binary format
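I'm trying to store a dictionary in PostgreSQL in a compressed binary form. The reason is that this dictionary object may one day grow very large (hundreds of millions of values).
I found this snippet: https://djangosnippets.org/snippets/2014/ and changed it to: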
class CompressedBinaryField(models.BinaryField):
    __metaclass__ = models.SubfieldBase

    def to_python(self, value):
        if not value:
            return value
        try:
            return value.decode('bz2').decode('utf-8')
        except Exception:
            return value

    def get_prep_value(self, value):
        if not value:
            return value
        try:
            value.decode('bz2')
            return value
        except Exception:
            try:
                tmp = value.encode('utf-8').encode('bz2')
            except Exception:
                return value
            else:
                if len(tmp) > len(value):
                    return value
                return tmp
I store the dictionary like this:
_dict = {}
MyModel.objects.create(
    user=user,
    data=repr(_dict),  # data is a CompressedBinaryField.
)
This works.
However, when I retrieve the object and try to use it like this:
item = MyModel.objects.get(user=user)
curr_dict = eval(item.data)
I get this error:
TypeError('eval() arg 1 must be a string or code object',)
For some reason, item.data appears to be a buffer-type object.
What am I doing wrong?
I encourage you to take a look at pickle for object serialization in Python. It converts your data into a byte stream, which you can then store.
The pickle module implements a fundamental, but powerful algorithm for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream is converted back into an object hierarchy. Pickling (and unpickling) is alternatively known as “serialization”, “marshalling,” or “flattening”; however, to avoid confusion, the terms used here are “pickling” and “unpickling”.
To retrieve the data, you need to unpickle it.
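As a rough illustration of the pickle round trip combined with compression, here is a minimal sketch using only the standard library. The helper names `pack_dict` and `unpack_dict` are made up for this example, and the `bytes(...)` call assumes the database driver may hand back a buffer or memoryview rather than plain bytes, as the question suggests.

```python
import bz2
import pickle


def pack_dict(data):
    """Pickle a dict, then compress the resulting byte stream with bz2."""
    raw = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
    return bz2.compress(raw)


def unpack_dict(blob):
    """Reverse pack_dict: decompress, then unpickle.

    bytes() normalizes buffer-like inputs (memoryview, bytearray) first.
    """
    return pickle.loads(bz2.decompress(bytes(blob)))


original = {"a": 1, "b": [1, 2, 3], "c": {"nested": True}}
blob = pack_dict(original)

# Simulate a driver returning a buffer-like object instead of bytes.
restored = unpack_dict(memoryview(blob))
```

The bytes returned by `pack_dict` are what you would hand to a binary column. Keep in mind that unpickling data from an untrusted source can execute arbitrary code, so this only suits data your own application wrote.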
There is also a widely used Django app called django-picklefield.
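The downside of your approach is that you have to take care of many details yourself. In addition, using eval is not recommended, for security reasons among others.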
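If you do want to keep storing the repr() of the dictionary, a safer stand-in for eval is `ast.literal_eval` from the standard library, which only accepts Python literals. The sketch below is an illustration, not part of the original snippet: the helper name `parse_literal` is hypothetical, and it assumes the stored value is UTF-8 text that may arrive as a buffer-like object, which would also explain the TypeError in the question.

```python
import ast


def parse_literal(value):
    """Turn stored repr() text back into a Python object without eval().

    Buffer-like values (bytes, bytearray, memoryview) are decoded to str
    first, since literal_eval, like eval, expects text here.
    """
    if isinstance(value, (bytes, bytearray, memoryview)):
        value = bytes(value).decode("utf-8")
    return ast.literal_eval(value)


stored = repr({"a": 1, "b": [1, 2]}).encode("utf-8")

# Simulate the field returning a buffer-like object.
restored = parse_literal(memoryview(stored))
```

Unlike eval, `ast.literal_eval` rejects expressions such as function calls, so a stored string cannot run arbitrary code when it is read back.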