Python nested defaultdict with mixed data types

So, how do I create a defaultdict for this:

{
    'branch': {
        'count': 23,
        'leaf': {
            'tag1': 30,
            'tag2': 10
        }
    },
}

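so that count, tag1 and tag2 all default to zero? I want to populate the dict dynamically as I read the input. When I see a new branch, I want to create a dict with count set to zero and an empty dict as leaf. When I get a leaf, I want to create a key with its name and set the value to zero.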

Update: I accepted Martijn's answer because it has more upvotes, but the other answers are equally good.

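An object has a __dict__ that stores its data and lets you set default values programmatically. There is also an object called Counter, which I think you should delegate your leaf counting to.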

So I suggest you use an object that holds a collections.Counter:

import collections

class Branch(object):
    def __init__(self, leafs=(), count=0):
        self.leafs = collections.Counter(leafs)
        self.count = count
    def __repr__(self):
        return 'Branch(leafs={0}, count={1})'.format(self.leafs, self.count)

BRANCHES = [Branch(['leaf1', 'leaf2']),
            Branch(['leaf3', 'leaf4', 'leaf3']),
            Branch(['leaf6', 'leaf7']),
           ]

And usage:

>>> import pprint
>>> pprint.pprint(BRANCHES)
[Branch(leafs=Counter({'leaf1': 1, 'leaf2': 1}), count=0),
 Branch(leafs=Counter({'leaf3': 2, 'leaf4': 1}), count=0),
 Branch(leafs=Counter({'leaf7': 1, 'leaf6': 1}), count=0)]
>>> first_branch = BRANCHES[0]
>>> first_branch.count += 23
>>> first_branch
Branch(leafs=Counter({'leaf1': 1, 'leaf2': 1}), count=23)
>>> first_branch.leafs['leaf that does not exist']
0
>>> first_branch.leafs.update(['new leaf'])
>>> first_branch
Branch(leafs=Counter({'new leaf': 1, 'leaf1': 1, 'leaf2': 1}), count=23)
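
To create branches on the fly while reading input, one option is to wrap the class in a defaultdict, since Branch() can be called with no arguments. This is only a sketch: it assumes the input arrives as (branch name, leaf tag) pairs, and the `records` sample data is made up for illustration.

```python
import collections

class Branch(object):
    def __init__(self, leafs=(), count=0):
        self.leafs = collections.Counter(leafs)
        self.count = count

# Hypothetical input: (branch name, leaf tag) pairs.
records = [('branch', 'tag1'), ('branch', 'tag2'), ('branch', 'tag1')]

tree = collections.defaultdict(Branch)
for branch_name, tag in records:
    branch = tree[branch_name]   # created with count=0 on first access
    branch.count += 1
    branch.leafs[tag] += 1       # Counter treats a missing tag as 0

print(tree['branch'].count)          # 3
print(tree['branch'].leafs['tag1'])  # 2
```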

You cannot do this with defaultdict, because the factory does not have access to the key.
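
To see the limitation concretely: defaultdict calls its default_factory with no arguments, so the factory cannot tell which key was missing. A minimal sketch (the `make_default` helper is an illustrative name, not part of any library):

```python
from collections import defaultdict

calls = []

def make_default(*args):
    # Record whatever arguments defaultdict passes to the factory.
    calls.append(args)
    return 0

d = defaultdict(make_default)
d['count']
d['leaf']

# Both lookups invoked the factory with an empty argument tuple, so the
# same default (0) is produced regardless of the key.
print(calls)    # [(), ()]
print(dict(d))  # {'count': 0, 'leaf': 0}
```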

However, you can simply subclass dict to create your own 'smart' defaultdict-like class. Provide your own __missing__ method that adds a value based on the key:

class KeyBasedDefaultDict(dict):
    def __init__(self, default_factories, *args, **kw):
        self._default_factories = default_factories
        super(KeyBasedDefaultDict, self).__init__(*args, **kw)

    def __missing__(self, key):
        factory = self._default_factories.get(key)
        if factory is None:
            raise KeyError(key)
        new_value = factory()
        self[key] = new_value
        return new_value

Now you can provide your own mapping:

mapping = {'count': int, 'leaf': dict}
mapping['branch'] = lambda: KeyBasedDefaultDict(mapping)

tree = KeyBasedDefaultDict(mapping)

Demo:

>>> mapping = {'count': int, 'leaf': dict}
>>> mapping['branch'] = lambda: KeyBasedDefaultDict(mapping)
>>> tree = KeyBasedDefaultDict(mapping)
>>> tree['branch']['count'] += 23
>>> tree['branch']['leaf']['tag1'] = 30
>>> tree['branch']['leaf']['tag2'] = 10
>>> tree
{'branch': {'count': 23, 'leaf': {'tag1': 30, 'tag2': 10}}}
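
Note that with 'leaf': dict the tags themselves have no default, so tree['branch']['leaf']['tag1'] += 1 would raise a KeyError. If zero defaults for tags are wanted, one possible variation (an assumption on my part, not something the demo above does) is to use collections.Counter as the leaf factory:

```python
import collections

class KeyBasedDefaultDict(dict):
    def __init__(self, default_factories, *args, **kw):
        self._default_factories = default_factories
        super(KeyBasedDefaultDict, self).__init__(*args, **kw)

    def __missing__(self, key):
        factory = self._default_factories.get(key)
        if factory is None:
            raise KeyError(key)
        new_value = factory()
        self[key] = new_value
        return new_value

# Variation: Counter instead of dict, so unseen tags count from zero.
mapping = {'count': int, 'leaf': collections.Counter}
mapping['branch'] = lambda: KeyBasedDefaultDict(mapping)

tree = KeyBasedDefaultDict(mapping)
tree['branch']['count'] += 1
tree['branch']['leaf']['tag1'] += 1

print(tree['branch']['count'])         # 1
print(tree['branch']['leaf']['tag1'])  # 1
```

Keys that have no entry in the mapping still raise KeyError, as in the original class.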

Answering my own question, but I think this will also work:

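import json
from collections import defaultdict

def branch():
    # A new branch starts with a zero count and leaves that default to 0.
    return {
        'count': 0,
        'leaf': defaultdict(int)
    }

tree = defaultdict(branch)
tree['first_branch']['leaf']['cat2'] = 2
print(json.dumps(tree, indent=2))

# {
#   "first_branch": {
#     "count": 0,
#     "leaf": {
#       "cat2": 2
#     }
#   }
# }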
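
One thing to keep in mind with this approach: merely indexing a missing key on a defaultdict inserts it. The sketch below shows the fill loop plus that side effect; the (branch name, leaf tag) input pairs are made up for illustration.

```python
from collections import defaultdict

def branch():
    return {'count': 0, 'leaf': defaultdict(int)}

tree = defaultdict(branch)

# Hypothetical input: (branch name, leaf tag) pairs.
for name, tag in [('b1', 'tag1'), ('b1', 'tag1'), ('b2', 'tag2')]:
    tree[name]['count'] += 1
    tree[name]['leaf'][tag] += 1

print(tree['b1']['count'])         # 2
print(tree['b1']['leaf']['tag1'])  # 2

# Membership tests and .get() do not create entries...
print('b3' in tree)                # False
print(tree.get('b3'))              # None
# ...but indexing does.
tree['b3']
print('b3' in tree)                # True
```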