Nested dictionary that acts as defaultdict when setting items but not when getting items

I want to implement a dict-like data structure with the following properties:

from collections import UserDict

class TestDict(UserDict):
    pass

test_dict = TestDict()

# Create empty dictionaries at 'level_1' and 'level_2' and insert 'Hello' at the 'level_3' key.
test_dict['level_1']['level_2']['level_3'] = 'Hello'

>>> test_dict
{
    'level_1': {
        'level_2': {
            'level_3': 'Hello'
        }
    }
}

# However, this should not return an empty dictionary but raise a KeyError.
>>> test_dict['unknown_key']
KeyError: 'unknown_key'

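As far as I can tell, the problem is that Python does not know whether __getitem__ is being called in the context of setting an item (the first example) or of getting an item (the second example).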
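To make the problem concrete, here is a minimal sketch using a recursive `defaultdict`. The helper name `nested_dict` is only an illustration and not part of my actual code. Nested assignment works, but a plain read of a missing key also creates an entry instead of raising:

```python
from collections import defaultdict

def nested_dict():
    """Return a defaultdict whose missing values are themselves nested defaultdicts."""
    return defaultdict(nested_dict)

d = nested_dict()
d['level_1']['level_2']['level_3'] = 'Hello'  # works as desired

# A plain read of a missing key does not raise; it silently inserts an empty dict.
value = d['unknown_key']
print('unknown_key' in d)  # True
```

This is exactly the behaviour I want to avoid when getting items.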

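I have already seen Python `defaultdict`: Use default when setting, but not when getting, but I do not think this question is a duplicate of it, or that it answers my question.

Please let me know if you have any ideas.

Thanks in advance.

EDIT:

A similar effect can be achieved with the following: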

from typing import Union

def set_nested_item(dict_in: Union[dict, TestDict], value, keys):
    """Set dict_in[keys[0]]...[keys[-1]] = value, creating intermediate levels as needed."""
    current = dict_in
    for key in keys[:-1]:
        # Create the intermediate level, replacing any non-dict value found there.
        if key not in current or not isinstance(current[key], (dict, TestDict)):
            current[key] = {}
        current = current[key]
    current[keys[-1]] = value
    return dict_in


class TestDict(UserDict):
    def __init__(self):
        super().__init__()

    def __setitem__(self, key, value):
        if isinstance(key, list):
            self.update(set_nested_item(self, value, key))
        else:
            super().__setitem__(key, value)

test_dict = TestDict()
test_dict[['level_1', 'level_2', 'level_3']] = 'Hello'
>>> test_dict
{
    'level_1': {
        'level_2': {
            'level_3': 'Hello'
        }
    }
}



This is not possible.

test_dict['level_1']['level_2']['level_3'] = 'Hello'

is semantically equivalent to:

temp1 = test_dict['level_1'] # Should this line fail?
temp1['level_2']['level_3'] = 'Hello'
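A small sketch makes this visible. `TracingDict` is a hypothetical helper introduced only for this illustration; it records which special method is called for each key in the chained statement:

```python
from collections import UserDict

class TracingDict(UserDict):
    """Hypothetical helper: records every __getitem__/__setitem__ call."""
    calls = []

    def __getitem__(self, key):
        TracingDict.calls.append(('get', key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        TracingDict.calls.append(('set', key))
        super().__setitem__(key, value)

# Build the intermediate levels up front, then discard the setup calls.
inner = TracingDict()
middle = TracingDict(level_2=inner)
outer = TracingDict(level_1=middle)
TracingDict.calls.clear()

outer['level_1']['level_2']['level_3'] = 'Hello'
print(TracingDict.calls)
# [('get', 'level_1'), ('get', 'level_2'), ('set', 'level_3')]
```

Only the last key goes through `__setitem__`. The earlier keys are ordinary `__getitem__` calls that receive nothing but the key, so they cannot tell whether an assignment follows.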

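But... if you are determined to implement it anyway, you could inspect the Python stack to grab and parse the calling line of code, and then vary the behaviour depending on whether that line contains an assignment! Unfortunately, the calling source is sometimes not available in the stack trace (e.g. when called interactively), in which case you need to work with the Python bytecode instead.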

import dis
import inspect
from collections import UserDict

def get_opcodes(code_object, lineno):
    """Utility function to extract Python VM opcodes for line of code"""
    line_ops = []
    current_line = None
    for instruction in dis.get_instructions(code_object):
        if hasattr(instruction, 'line_number'):
            # Python 3.13+: starts_line is a bool, the line is in line_number
            current_line = instruction.line_number
        elif instruction.starts_line is not None:
            # Python <= 3.12: starts_line holds the line number of a new line
            current_line = instruction.starts_line
        if current_line == lineno:
            line_ops.append(instruction.opcode)
    return line_ops

class TestDict(UserDict):
    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            # inspect the stack to get calling line of code
            frame = inspect.stack()[1].frame
            opcodes = get_opcodes(frame.f_code, frame.f_lineno)
            # STORE_SUBSCR is Python opcode for TOS1[TOS] = TOS2
            if dis.opmap['STORE_SUBSCR'] in opcodes:
                # calling line of code contains a dict/array assignment
                default = TestDict()
                super().__setitem__(key, default)
                return default
            else:
                raise

test_dict = TestDict()
test_dict['level_1']['level_2']['level_3'] = 'Hello'
print(test_dict)
# {'level_1': {'level_2': {'level_3': 'Hello'}}}

test_dict['unknown_key']
# KeyError: 'unknown_key'

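The above is only a partial solution. It can still be fooled if there are other dictionary/array assignments on the same line, e.g. other['key'] = test_dict['unknown_key']. A more complete solution would need to actually parse the line of code to work out where the variable appears in the assignment.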
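The limitation can be seen directly in the bytecode. The following is a minimal sketch; the helper `opnames` and the variable names `d` and `other` are made up for the illustration. It compiles three one-line statements without running them and checks for `STORE_SUBSCR`:

```python
import dis

def opnames(source):
    """Compile one line of source and return the names of its opcodes."""
    code = compile(source, '<example>', 'exec')
    return [instruction.opname for instruction in dis.get_instructions(code)]

nested_set = opnames("d['a']['b'] = 1")    # the case we want to support
plain_get = opnames("d['a']")              # the case that should raise KeyError
fooled = opnames("other['k'] = d['a']")    # a read of d, but the line still stores

print('STORE_SUBSCR' in nested_set)  # True
print('STORE_SUBSCR' in plain_get)   # False
print('STORE_SUBSCR' in fooled)      # True
```

The third statement only reads from `d`, yet the line contains a `STORE_SUBSCR`, so a check based purely on the presence of that opcode treats it like a nested assignment.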