Nested `defaultdict of defaultdict of defaultdict`, each with a backreference

Using tree = lambda: defaultdict(tree), I can replace the following code:

from collections import defaultdict

END = '$'
words = ['hi', 'hello', 'hiya', 'hey']

root = {}
for word in words:
  node = root
  for ch in word:
    node = node.setdefault(ch, {}) # <---- Code that can be replaced
  node[END] = None

with:

from collections import defaultdict

END = '$'
words = ['hi', 'hello', 'hiya', 'hey']

tree = lambda: defaultdict(tree)
root = tree()
for word in words:
  node = root
  for ch in word:
    node = node[ch] # <------ Replaced code
  node[END] = None

What I really want is for every dict node to have a backreference to its parent dict node. I can do that like this:

from collections import defaultdict

BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']

root = {}
for word in words:
  node = root
  for ch in word:
    node = node.setdefault(ch, {BACKREF: node}) # <---- Code I want to replace
  node[END] = None

(Proof that this works: link)

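So, given that I was able to use tree = lambda: defaultdict(tree) to replace

node = node.setdefault(ch, {})

is there a way to use a modified version of tree = lambda: defaultdict(tree) to replace

node = node.setdefault(ch, {BACKREF: node})

?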


I tried something like:

def tree():
  _ = defaultdict(tree)
  _[BACKREF] = ?
  return _
root = tree()
h = root['h']

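But this requires tree to know which dict triggered the call to tree. For example, in h = root['h'], the lookup root['h'] triggers a call to tree because 'h' is not yet in root. tree would have to know that it was invoked via root['h'] so that it could do h[BACKREF] = root. Is there a way to do this? And even if it can be done, is it a bad idea?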

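I know that backreferences technically mean the trie will contain cycles (and so is not truly a tree), but given the way I plan to traverse the trie, that won't be a problem. The reason I want backreferences is that they are useful when I want to remove a word from the trie. For example, say I have the following trie: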

and I am at root['h']['e']['l']['l']['o'] and want to remove 'hello' from the trie. I can do this by backtracking up the trie from root['h']['e']['l']['l']['o'] to root['h']['e']['l']['l'] to root['h']['e']['l'] to root['h']['e'] (I stop here because len(set(root['h']['e'].keys()) - {BACKREF}) > 1). Then I can simply do del root['h']['e']['l'], which chops 'llo$' off of 'he', so the trie still contains 'hey'. There are alternatives, but backtracking up the trie with backreferences would be very easy.
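A minimal sketch of that deletion, assuming the plain-dict trie built with setdefault above. The helper name remove_word is hypothetical, and it prunes one node at a time on the way up rather than cutting the whole branch with a single del, which is a slight variation on the steps just described:

```python
BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']

# Build the trie with backreferences, as in the setdefault version above.
root = {}
for word in words:
    node = root
    for ch in word:
        node = node.setdefault(ch, {BACKREF: node})
    node[END] = None

def remove_word(root, word):
    """Hypothetical helper: remove `word`, pruning upward via backreferences."""
    # Walk down to the node for the last character without creating anything.
    node = root
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    if END not in node:
        return False  # `word` is only a prefix of other words
    del node[END]
    # Walk back up; stop at the first node that still has other content.
    for ch in reversed(word):
        if set(node) - {BACKREF}:
            break
        parent = node[BACKREF]
        del parent[ch]
        node = parent
    return True

removed = remove_word(root, 'hello')
print(removed, 'l' in root['h']['e'], 'y' in root['h']['e'])
```

The loop never reads BACKREF on the root, so it does not matter that the root has no backreference in this version.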


Context for tree = lambda: defaultdict(tree)

Using:

from collections import defaultdict

tree = lambda: defaultdict(tree)
root = tree()

one can create arbitrarily nested dicts. For example, after:

root['h']['i']
root['h']['e']['l']['l']['o']
root['h']['i']['y']['a']
root['h']['e']['y']

root looks like:

{'h': {'i': {'y': {'a': {}}}, 'e': {'y': {}, 'l': {'l': {'o': {}}}}}}

This represents a tree that looks like this (visualized using https://www.cs.usfca.edu/~galles/visualization/Trie.html).
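Printing root directly shows the defaultdict wrappers around every level, so the compact nested-dict form shown above needs a conversion step. A small sketch of one way to get it; the helper name to_plain is hypothetical, and it assumes every value in the structure is itself a nested dict:

```python
from collections import defaultdict

tree = lambda: defaultdict(tree)
root = tree()

# Merely looking up a missing key creates it, so these lines build the trie.
root['h']['i']
root['h']['e']['l']['l']['o']
root['h']['i']['y']['a']
root['h']['e']['y']

def to_plain(node):
    """Hypothetical helper: recursively copy nested defaultdicts into plain dicts."""
    return {key: to_plain(child) for key, child in node.items()}

print(to_plain(root))
```

The key order in the printed result follows insertion order, so it may differ from the listing above.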

The behavior you're trying to implement seems easier to write as a class than as a function.

from collections import defaultdict

class Tree(defaultdict):
    def __init__(self, backref=None):
        super().__init__(self.make_child)
        self.backref = backref
    def make_child(self):
        return Tree(backref=self)

Usage:

>>> root = Tree()
>>> root['h'].backref is root
True
>>> root['h']['e'].backref is root['h']
True
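As a rough illustration of how this class fits the trie from the question, the sketch below builds the same word list with it. The END marker and word list are taken from the question. Because backref is stored as an attribute and not as a dict key, the keys of a node are just its children (plus any END marker), so no BACKREF entry has to be subtracted when counting them:

```python
from collections import defaultdict

END = '$'
words = ['hi', 'hello', 'hiya', 'hey']

class Tree(defaultdict):
    def __init__(self, backref=None):
        # Each node's factory is its own bound method, so children know their parent.
        super().__init__(self.make_child)
        self.backref = backref
    def make_child(self):
        return Tree(backref=self)

root = Tree()
for word in words:
    node = root
    for ch in word:
        node = node[ch]
    node[END] = None

he = root['h']['e']
print(sorted(he))               # ['l', 'y']
print(he.backref is root['h'])  # True
```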

Solution 1

Give the same {BACKREF: node} to the defaultdict:

from collections import defaultdict

BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']

tree = lambda: defaultdict(tree, {BACKREF: node})
node = None
root = tree()
for word in words:
  node = root
  for ch in word:
    node = node[ch]
  node[END] = None

The root node has a backref of None; you can delete it if it bothers you.
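Note that the lambda reads the global node at the moment a child is created, so the backreference is only right when node is the dict being indexed. A small sketch of what happens otherwise (the keys 'a', 'b', 'x', 'y' are arbitrary illustrations):

```python
from collections import defaultdict

BACKREF = 'BACKREF'

tree = lambda: defaultdict(tree, {BACKREF: node})
node = None
root = tree()

# Chained access that never updates the global `node`:
leaf = root['a']['b']
print(root['a'][BACKREF])   # None, not root
print(leaf[BACKREF])        # None, not root['a']

# Rebinding `node` before each step, as the loop above does, gives the intended links:
node = root
node = node['x']
node = node['y']
print(node[BACKREF] is root['x'])   # True
print(root['x'][BACKREF] is root)   # True
```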

解决方案 2

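The code above works fine if it is the only code that creates tree nodes (which seems likely to me, judging from the times I've built such trees myself). Otherwise, you need to make sure node points to the correct parent node. If that is a problem, here is an alternative: a dict (not defaultdict) subclass that implements __missing__ to automatically create children with backrefs when needed: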

BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']

class Tree(dict):
    def __missing__(self, key):
        child = self[key] = Tree({BACKREF: self})
        return child

root = Tree()
for word in words:
  node = root
  for ch in word:
    node = node[ch]
  node[END] = None

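This also doesn't give the root a backreference, and because it is a dict, its string representation is far less cluttered than the defaultdict's and therefore much more readable: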

>>> import pprint
>>> pprint.pp(root)
{'h': {'BACKREF': <Recursion on Tree with id=2494556270320>,
       'i': {'BACKREF': <Recursion on Tree with id=2494556270400>,
             '$': None,
             'y': {'BACKREF': <Recursion on Tree with id=2494556270480>,
                   'a': {'BACKREF': <Recursion on Tree with id=2494556340608>,
                         '$': None}}},
       'e': {'BACKREF': <Recursion on Tree with id=2494556270400>,
             'l': {'BACKREF': <Recursion on Tree with id=2494556340288>,
                   'l': {'BACKREF': <Recursion on Tree with id=2494556340368>,
                         'o': {'BACKREF': <Recursion on Tree with id=2494556340448>,
                               '$': None}}},
             'y': {'BACKREF': <Recursion on Tree with id=2494556340288>,
                   '$': None}}}}

The defaultdict result, for comparison:

>>> pprint.pp(root)
defaultdict(<function <lambda> at 0x000001A13760BE50>,
            {'BACKREF': None,
             'h': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                              {'BACKREF': <Recursion on defaultdict with id=1791930855152>,
                               'i': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                {'BACKREF': <Recursion on defaultdict with id=1791930855312>,
                                                 '$': None,
                                                 'y': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                  {'BACKREF': <Recursion on defaultdict with id=1791930912832>,
                                                                   'a': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                                    {'BACKREF': <Recursion on defaultdict with id=1791930913232>,
                                                                                     '$': None})})}),
                               'e': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                {'BACKREF': <Recursion on defaultdict with id=1791930855312>,
                                                 'l': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                  {'BACKREF': <Recursion on defaultdict with id=1791930912912>,
                                                                   'l': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                                    {'BACKREF': <Recursion on defaultdict with id=1791930912992>,
                                                                                     'o': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                                                      {'BACKREF': <Recursion on defaultdict with id=1791930913072>,
                                                                                                       '$': None})})}),
                                                 'y': defaultdict(<function <lambda> at 0x000001A13760BE50>,
                                                                  {'BACKREF': <Recursion on defaultdict with id=1791930912912>,
                                                                   '$': None})})})})
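One thing to keep in mind with either auto-creating variant is that indexing a missing key inserts a new node, so a plain lookup with node[ch] can grow the trie. Below is a sketch of a read-only membership test for the __missing__-based Tree from Solution 2. The helper name contains is hypothetical, and it relies on dict.get not invoking __missing__:

```python
BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']

class Tree(dict):
    def __missing__(self, key):
        child = self[key] = Tree({BACKREF: self})
        return child

root = Tree()
for word in words:
    node = root
    for ch in word:
        node = node[ch]
    node[END] = None

def contains(root, word):
    """Hypothetical helper: membership test that never creates nodes."""
    node = root
    for ch in word:
        # dict.get bypasses __missing__, so absent characters are not inserted.
        node = node.get(ch)
        if node is None:
            return False
    return END in node

print(contains(root, 'hey'), contains(root, 'he'), contains(root, 'xyz'))
print('x' in root)  # the failed lookup above did not add an 'x' node
```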