Nested dictionary that acts as defaultdict when setting items but not when getting items
I would like to implement a dict-like data structure with the following properties:
from collections import UserDict

class TestDict(UserDict):
    pass

test_dict = TestDict()
# Create empty dictionaries at 'level_1' and 'level_2' and insert 'Hello' at the 'level_3' key.
test_dict['level_1']['level_2']['level_3'] = 'Hello'
>>> test_dict
{
    'level_1': {
        'level_2': {
            'level_3': 'Hello'
        }
    }
}
# However, this should not return an empty dictionary but raise a KeyError.
>>> test_dict['unknown_key']
KeyError: 'unknown_key'
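As far as I can tell, the problem is that Python does not know whether __getitem__ is being called in the context of setting an item (the first example) or in the context of getting an item (the second example).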
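To make the problem concrete, here is a minimal sketch that records which special methods run. `TracingDict` and its `calls` list are hypothetical names introduced only for this illustration. It suggests that a chained assignment reaches every intermediate level through __getitem__, exactly like a plain lookup, and that only the last key goes through __setitem__.

```python
from collections import UserDict


class TracingDict(UserDict):
    """Hypothetical helper that records which special methods are invoked."""

    calls = []  # shared log of (operation, key) tuples

    def __getitem__(self, key):
        TracingDict.calls.append(("get", key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        TracingDict.calls.append(("set", key))
        super().__setitem__(key, value)


outer = TracingDict()
outer["level_1"] = TracingDict()

# Chained assignment: the intermediate level is fetched with __getitem__.
TracingDict.calls.clear()
outer["level_1"]["level_2"] = "Hello"
chained_calls = list(TracingDict.calls)
print(chained_calls)  # [('get', 'level_1'), ('set', 'level_2')]

# Plain lookup: the same __getitem__ call, with nothing to tell the two apart.
TracingDict.calls.clear()
outer["level_1"]
lookup_calls = list(TracingDict.calls)
print(lookup_calls)  # [('get', 'level_1')]
```

In both cases `outer` only ever sees `('get', 'level_1')`, which is why it cannot decide on its own whether to create a default or raise a KeyError.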
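I have already looked at Python `defaultdict`: Use default when setting, but not when getting, but I don't think this question is a duplicate of it, or that it answers my question.
Please let me know if you have any ideas.
Thanks in advance.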
EDIT:
A similar effect can be achieved as follows:
from collections import UserDict
from typing import Union


def set_nested_item(dict_in: Union[dict, "TestDict"], value, keys):
    for i, key in enumerate(keys):
        is_last = i == (len(keys) - 1)
        if is_last:
            dict_in[key] = value
        else:
            if key not in dict_in:
                dict_in[key] = {}
            else:
                if not isinstance(dict_in[key], (dict, TestDict)):
                    dict_in[key] = {}
            dict_in[key] = set_nested_item(dict_in[key], value, keys[(i + 1):])
        # Return after the first key; the recursive call handles the remaining keys.
        return dict_in


class TestDict(UserDict):
    def __init__(self):
        super().__init__()

    def __setitem__(self, key, value):
        if isinstance(key, list):
            # A list key is treated as a path of nested keys.
            self.update(set_nested_item(self, value, key))
        else:
            super().__setitem__(key, value)


test_dict = TestDict()
test_dict[['level_1', 'level_2', 'level_3']] = 'Hello'
>>> test_dict
{
    'level_1': {
        'level_2': {
            'level_3': 'Hello'
        }
    }
}
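For comparison, the same path-based idea can be written without recursion. This is only a sketch: `set_path` is a hypothetical helper name, not part of the code above, and it assumes `keys` is a non-empty sequence. Like the version above, it replaces any intermediate value that is not a mapping with a fresh dict.

```python
from collections.abc import MutableMapping


def set_path(mapping, keys, value):
    """Hypothetical helper: set mapping[k0][k1]...[kn] = value, creating levels."""
    *parents, last = keys  # assumes at least one key
    node = mapping
    for key in parents:
        child = node.get(key)
        if not isinstance(child, MutableMapping):
            # Missing or non-mapping intermediate value: replace it with a dict.
            child = {}
            node[key] = child
        node = child
    node[last] = value
    return mapping


data = {'level_1': 'not a dict'}
set_path(data, ['level_1', 'level_2', 'level_3'], 'Hello')
print(data)  # {'level_1': {'level_2': {'level_3': 'Hello'}}}
```

Because the helper only relies on `get` and item assignment, it should work on a plain dict as well as on a UserDict subclass.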
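This is not possible.
test_dict['level_1']['level_2']['level_3'] = 'Hello'
is semantically equivalent to:
temp1 = test_dict['level_1'] # Should this line fail?
temp1['level_2']['level_3'] = 'Hello'
But... if you are determined to implement it anyway, you can inspect the Python stack to grab/parse the calling line of code, and then vary the behaviour depending on whether the calling line of code contains an assignment! Unfortunately, the calling code is sometimes not available in the stack trace (e.g. when called interactively), in which case you need to work with the Python bytecode.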
import dis
import inspect
from collections import UserDict


def line_started_by(instruction):
    """Return the source line an instruction starts, or None if it starts none."""
    # On Python 3.13+, starts_line is a bool and the number is in line_number;
    # on earlier versions, starts_line is the line number itself (or None).
    if isinstance(instruction.starts_line, bool):
        return instruction.line_number if instruction.starts_line else None
    return instruction.starts_line


def get_opcodes(code_object, lineno):
    """Utility function to extract Python VM opcodes for line of code"""
    line_ops = []
    instructions = iter(dis.get_instructions(code_object))
    for instruction in instructions:
        if line_started_by(instruction) == lineno:
            # found start of our line
            line_ops.append(instruction.opcode)
            break
    for instruction in instructions:
        if not instruction.starts_line:
            line_ops.append(instruction.opcode)
        else:
            # start of next line
            break
    return line_ops


class TestDict(UserDict):
    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            # inspect the stack to get calling line of code
            frame = inspect.stack()[1].frame
            opcodes = get_opcodes(frame.f_code, frame.f_lineno)
            # STORE_SUBSCR is Python opcode for TOS1[TOS] = TOS2
            if dis.opmap['STORE_SUBSCR'] in opcodes:
                # calling line of code contains a dict/array assignment
                default = TestDict()
                super().__setitem__(key, default)
                return default
            else:
                raise


test_dict = TestDict()
test_dict['level_1']['level_2']['level_3'] = 'Hello'
print(test_dict)
# {'level_1': {'level_2': {'level_3': 'Hello'}}}
test_dict['unknown_key']
# KeyError: 'unknown_key'
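The above is only a partial solution. It can still be fooled if there are other dictionary/array assignments on the same line, e.g. other['key'] = test_dict['unknown_key']. A more complete solution would need to actually parse the line of code to figure out where the variable appears in the assignment.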
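The limitation can be seen directly in the bytecode. The sketch below compiles three one-line statements and checks each for a STORE_SUBSCR instruction, which is the same test the __getitem__ override relies on. `line_has_store_subscr` is a hypothetical helper written for this illustration, and the result is specific to CPython bytecode.

```python
import dis


def line_has_store_subscr(source):
    """Hypothetical helper: compile one statement and look for STORE_SUBSCR."""
    code = compile(source, "<example>", "exec")
    return any(ins.opname == "STORE_SUBSCR" for ins in dis.get_instructions(code))


# The names inside the strings are never evaluated; the code is only compiled.
nested_set = line_has_store_subscr("test_dict['level_1']['level_2'] = 'Hello'")
plain_get = line_has_store_subscr("test_dict['unknown_key']")
fooled = line_has_store_subscr("other['key'] = test_dict['unknown_key']")

print(nested_set, plain_get, fooled)  # True False True
```

The third statement only reads from test_dict, yet it looks the same as the first one to an opcode-presence check, so the lookup would silently create 'unknown_key' instead of raising a KeyError.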