How to replace objects causing import errors with None during pickle load?

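I have a pickled structure made up of nested built-in primitives (lists, dicts) and instances of classes that no longer exist in the project, so unpickling it raises errors. I don't really care about those objects; I just want to extract the numeric values stored in the nested structure. Is there a way to unpickle from the file and replace everything that is broken because of import problems with, say, None?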

I tried to inherit from Unpickler and override find_class(self, module, name) to return Dummy if the class cannot be found, but for some reason I keep getting TypeError: 'NoneType' object is not callable in load_reduce afterwards.

class Dummy(object):
    def __init__(self, *argv, **kwargs):
        pass

I tried:

class RobustJoblibUnpickle(Unpickler):
    def find_class(self, _module, name):
        try:
            super(RobustJoblibUnpickle, self).find_class(_module, name)
        except ImportError:
            return Dummy

Maybe you could catch the exception in a try block and then do what you want (for example, stand in for the missing objects with a Dummy class)?
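A likely cause of the TypeError in the question: the overridden find_class calls super().find_class(...) but never returns its result, so for every class that does exist the method returns None, and load_reduce then tries to call None. Below is a minimal sketch of a corrected subclass. It assumes the standard-library pickle.Unpickler rather than the joblib one from the question, and RobustUnpickler and robust_loads are illustrative names only:

```python
import io
import pickle


class Dummy:
    """Stand-in for classes that can no longer be imported."""

    def __init__(self, *args, **kwargs):
        # Accept anything so pickles that call the class with arguments still load
        pass


class RobustUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        try:
            # The return is essential: without it the method yields None
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            # Missing module or missing attribute: substitute the placeholder
            return Dummy


def robust_loads(data):
    """Unpickle a bytes object, tolerating classes that cannot be found."""
    return RobustUnpickler(io.BytesIO(data)).load()
```

Catching both ImportError and AttributeError covers a module that has disappeared as well as a class that was removed from a module that still exists.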

Edit:

Take a look at this. I'm not sure it's the right way to do it, but it seems to work fine:

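import sys
import pickle
import _compat_pickle  # stdlib module holding the Python 2 -> 3 name mappings

class Dummy:
    # Accept any constructor arguments so pickles that call the class still load
    def __init__(self, *args, **kwargs):
        pass

class MyUnpickler(pickle._Unpickler):
    def find_class(self, module, name): # based on the pickle module code, wrapped in a try
        # Subclasses may override this, which is what we are doing here
        try:
            if self.proto < 3 and self.fix_imports:
                if (module, name) in _compat_pickle.NAME_MAPPING:
                    module, name = _compat_pickle.NAME_MAPPING[(module, name)]
                elif module in _compat_pickle.IMPORT_MAPPING:
                    module = _compat_pickle.IMPORT_MAPPING[module]
            __import__(module, level=0)
            obj = sys.modules[module]
            # protocol 4+ can store dotted qualified names, so resolve each part
            # (plain getattr is used here instead of pickle's private helper)
            for part in name.split('.'):
                obj = getattr(obj, part)
            return obj
        except (ImportError, AttributeError):
            # missing module or missing class: fall back to the placeholder
            return Dummy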

# edit: following Ben's suggestion, an even simpler subclass can be used
# instead of the one above

class MyUnpickler2(pickle._Unpickler):
    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            # ImportError is caught too, so a missing module is handled as well
            return Dummy

class C:
    pass

c1 = C()

with open('data1.dat', 'wb') as f:
    pickle.dump(c1, f)

del C # simulate the missing class

with open('data1.dat', 'rb') as f:
    unpickler = MyUnpickler(f) # or MyUnpickler2(f)
    c1 = unpickler.load()

print(c1) # got a Dummy object because of missing class
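
The question asks for None rather than a placeholder object. One way to get there is to walk the loaded result and swap every Dummy instance for None. This is a sketch that assumes the only containers in the data are lists, tuples, and dicts, and that the structure has no reference cycles. replace_dummies is a hypothetical helper, not part of pickle:

```python
class Dummy:
    """Placeholder class returned by the custom find_class for missing classes."""

    def __init__(self, *args, **kwargs):
        pass


def replace_dummies(obj):
    """Return a copy of obj with every Dummy instance replaced by None."""
    if isinstance(obj, Dummy):
        return None
    if isinstance(obj, dict):
        # Only values are cleaned; Dummy instances used as keys are left alone
        return {key: replace_dummies(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [replace_dummies(item) for item in obj]
    if isinstance(obj, tuple):
        return tuple(replace_dummies(item) for item in obj)
    return obj


nested = {"values": [1, 2.5, Dummy()], "meta": (Dummy(), {"n": 3})}
cleaned = replace_dummies(nested)
print(cleaned)  # {'values': [1, 2.5, None], 'meta': (None, {'n': 3})}
```

In practice the argument would be the object returned by unpickler.load(), with the same Dummy class that find_class hands back.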