Unpickling a python 2 object with python 3

Question

I'm wondering if there is a way to load an object that was pickled in Python 2.4 with Python 3.4.

I've been running 2to3 on a large amount of company legacy code to bring it up to date.

Having done this, I get the following error when running the file:

  File "H:\fixers - 3.4\addressfixer - 3.4\trunk\lib\address\address_generic.py"
, line 382, in read_ref_files
    d = pickle.load(open(mshelffile, 'rb'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1: ordinal
not in range(128)

Looking at the pickled object in question, it's a dict in a dict, containing keys and values of type str.

So my question is: is there a way to load an object, originally pickled in Python 2.4, with Python 3.4?

Answer 1

You'll have to tell pickle.load() how to convert Python 2 bytestring data to Python 3 strings, or you can tell pickle to leave them as bytes.

The default is to try to decode all string data as ASCII, and that decoding fails here. See the pickle.load() documentation:

Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is true, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects.
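To make the quoted behaviour concrete, here is a minimal sketch comparing the encoding options. The byte string PY2_PICKLE is a hand-written stand-in for a Python 2 pickle of a single 8-bit str (it is not the data from the question), and try_load is a hypothetical helper introduced only for this illustration.

```python
import pickle

# Hand-written Python 2 style pickle (protocol 2) of the 8-bit str
# 'caf\xc3\xa9', i.e. the UTF-8 bytes for "café". 'U' is SHORT_BINSTRING,
# the opcode Python 2 uses for short str objects in binary protocols.
PY2_PICKLE = b'\x80\x02U\x05caf\xc3\xa9.'


def try_load(data, **kwargs):
    """Return the unpickled value, or the UnicodeDecodeError that was raised."""
    try:
        return pickle.loads(data, **kwargs)
    except UnicodeDecodeError as exc:
        return exc


default_result = try_load(PY2_PICKLE)                    # ASCII + strict: fails
latin1_result = try_load(PY2_PICKLE, encoding='latin1')  # never fails, may be wrong text
bytes_result = try_load(PY2_PICKLE, encoding='bytes')    # raw bytes, nothing decoded
utf8_result = try_load(PY2_PICKLE, encoding='utf-8')     # correct only if data is UTF-8
```

In this sketch the default call reproduces the kind of UnicodeDecodeError shown in the question, while the other three calls succeed and differ only in how the 8-bit data is interpreted.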

Setting the encoding to latin1 allows you to import the data directly:

import pickle
# mshelffile is the path to the Python 2 pickle, as in the question
with open(mshelffile, 'rb') as f:
    d = pickle.load(f, encoding='latin1')

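but you'll need to verify that none of your strings are decoded using the wrong codec; Latin-1 works for any input because it maps the byte values 0-255 directly to the first 256 Unicode code points.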
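Because Latin-1 maps every byte to exactly one code point, a string that was decoded with the wrong codec can be turned back into the original bytes and decoded again. The following is a minimal sketch of that idea; redecode is a hypothetical helper, and it assumes the Python 2 strings were really UTF-8 and that the pickle contains only str data in dicts, lists and tuples.

```python
def redecode(obj, codec='utf-8'):
    """Recursively undo a Latin-1 decode and decode again with `codec`.

    Assumption: every str in `obj` came from an 8-bit Python 2 str that
    was loaded with encoding='latin1'.
    """
    if isinstance(obj, str):
        # encode('latin1') restores the exact original bytes
        return obj.encode('latin1').decode(codec)
    if isinstance(obj, dict):
        return {redecode(k, codec): redecode(v, codec) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(redecode(item, codec) for item in obj)
    return obj  # numbers, None, etc. pass through unchanged


# What encoding='latin1' would produce for UTF-8 encoded "naïve" and "café"
loaded = {'na\xc3\xafve': {'drink': 'caf\xc3\xa9'}}
fixed = redecode(loaded)
```

Pure ASCII strings survive this round trip unchanged, so it is only the non-ASCII values that need checking against the expected text.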

The alternative is to load the data with encoding='bytes', and decode all bytes keys and values afterwards.
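A rough sketch of that second step, for a nested dict like the one described in the question, could look as follows. decode_bytes is a hypothetical helper, the sample data is made up, and the choice of UTF-8 is an assumption about how the original Python 2 strings were encoded.

```python
def decode_bytes(obj, codec='utf-8'):
    """Recursively decode bytes keys and values into str using `codec`."""
    if isinstance(obj, bytes):
        return obj.decode(codec)
    if isinstance(obj, dict):
        return {decode_bytes(k, codec): decode_bytes(v, codec)
                for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(decode_bytes(item, codec) for item in obj)
    return obj  # leave non-bytes leaves (ints, floats, None) untouched


# Shape of what pickle.load(f, encoding='bytes') returns for a dict of dicts
# whose Python 2 keys and values were 8-bit str objects (made-up sample data).
raw = {b'prices': {b'currency': b'\xe2\x82\xac', b'count': 3}}
decoded = decode_bytes(raw)
```

Decoding explicitly like this fails loudly with a UnicodeDecodeError if the assumed codec is wrong for some value, instead of silently producing incorrect text.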

Note that in Python versions before 3.6.8, 3.7.2 and 3.8.0, unpickling of Python 2 datetime object data is broken unless you use encoding='bytes'.

Answer 2

Using encoding='latin1' causes some issues when your object contains numpy arrays.

Using encoding='bytes' will be better.

For a complete explanation of using encoding='bytes', please see this answer.

