Django: Store a dictionary in a compressed binary format

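I am trying to store a dictionary in PostgreSQL in compressed binary form. The reason is that this dictionary object may one day become very large (hundreds of millions of values).

I found this snippet: https://djangosnippets.org/snippets/2014/ and changed it to: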

class CompressedBinaryField(models.BinaryField):
    __metaclass__ = models.SubfieldBase

    def to_python(self, value):
        if not value:
            return value

        try:
            return value.decode('bz2').decode('utf-8')
        except Exception:
            return value

    def get_prep_value(self, value):
        if not value:
            return value
        try:
            value.decode('bz2')
            return value
        except Exception:
            try:
                tmp = value.encode('utf-8').encode('bz2')
            except Exception:
                return value
            else:
                if len(tmp) > len(value):
                    return value

                return tmp

I store the dictionary like this:

_dict = {}
MyModel.objects.create(
    user=user,
    data=repr(_dict),  # data is a CompressedBinaryField.
)

This works.

However, when I retrieve the object and try to use it like this:

item = MyModel.objects.get(user=user)
curr_dict = eval(item.data)

I get this error:

TypeError('eval() arg 1 must be a string or code object',)

For some reason, item.data appears to be a buffer-type object rather than a string.

What am I doing wrong?

I'd encourage you to look at pickle for object serialization in Python, where the data is converted into a byte stream that you can then store.

The pickle module implements a fundamental, but powerful algorithm for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream is converted back into an object hierarchy. Pickling (and unpickling) is alternatively known as “serialization”, “marshalling,” or “flattening”, however, to avoid confusion, the terms used here are “pickling” and “unpickling”.

To retrieve the data, you need to "unpickle" it.
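As a rough illustration of the pickle round trip combined with bz2 compression, here is a minimal sketch. The helper names pack and unpack are hypothetical, and it assumes the database field hands back a bytes-like object (for example a memoryview) when the row is read. Note that unpickling is only safe for data you wrote yourself.

```python
import bz2
import pickle


def pack(obj):
    """Pickle obj and compress the result; returns bytes suitable for a binary column."""
    return bz2.compress(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def unpack(blob):
    """Reverse of pack(). Accepts bytes, bytearray or memoryview."""
    # bytes() normalizes buffer-like values that a database driver may return.
    return pickle.loads(bz2.decompress(bytes(blob)))


data = {"a": 1, "b": [1, 2, 3]}
blob = pack(data)

# Simulate a driver that returns a memoryview instead of bytes.
restored = unpack(memoryview(blob))
```

In a custom field, something like pack would be called when preparing the value for the database and unpack when loading it back, so the model attribute is always a real dict.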

More information is available in the Python documentation for the pickle module.

There is also a widely used Django app called django-picklefield.

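The downside of your approach is that you have to handle many cases yourself. In addition, using eval is discouraged, for security reasons among others.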
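If you do want to keep storing repr() text, a safer alternative to eval is ast.literal_eval, which only accepts Python literals. The sketch below is an illustration, not a drop-in fix: the helper name parse_repr is hypothetical, and it assumes the field returns the decompressed UTF-8 text as a bytes-like object, which would also explain the TypeError in the question.

```python
import ast


def parse_repr(raw):
    """Turn the stored repr() text back into a Python literal without eval()."""
    # Binary fields may return bytes, bytearray or memoryview rather than str.
    if isinstance(raw, (bytes, bytearray, memoryview)):
        raw = bytes(raw).decode("utf-8")
    # literal_eval only accepts literals (dicts, lists, strings, numbers, ...),
    # so arbitrary expressions such as function calls are rejected.
    return ast.literal_eval(raw)


stored = repr({"x": 1, "y": [2, 3]}).encode("utf-8")
parsed = parse_repr(memoryview(stored))
```

This only works for dictionaries made of plain literals; anything more complex is a reason to prefer pickle or a dedicated field.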