如何从邻接列表构建嵌套树结构?

How to build a nested tree structure from a list of adjacencies?

考虑到我有:


A = [(61, 66), (50, 61), (68, 61), (33, 61), (57, 66), (72, 66), (37, 68), (71, 33), (6, 50), (11, 37), (5, 37)]

class Tree:
    def __init__(self, node, *children):
        self.node = node
        if children: self.children = children
        else: self.children = []
    
    def __str__(self): 
        return "%s" % (self.node)
    def __repr__(self):
        return "%s" % (self.node)

    def __getitem__(self, k):
        if isinstance(k, int) or isinstance(k, slice): 
            return self.children[k]
        if isinstance(k, str):
            for child in self.children:
                if child.node == k: return child

    def __iter__(self): return self.children.__iter__()

    def __len__(self): return len(self.children)

如何构建一个 Tree 对象,使其根据邻接关系封装所有内部树?(如下所示)

t = Tree(66, 
        Tree(72), 
        Tree(57), 
        Tree(61, 
            Tree(33,
                Tree(71)), 
            Tree(50, 
                Tree(6)), 
            Tree(68, 
                Tree(37, 
                    Tree(11), Tree(5)))))

我正在考虑以递归方式创建树,但我不知道如何正确地遍历和填充它。这是我失败的尝试:

from collections import defaultdict

# Create a dictionary: key = parent, values = children
d = defaultdict(list)
for child, parent in A:
    d[parent].append(child)

# Failed attempt
def build_tree(k):    
    if k in d:
        tree = Tree(k, d[k]) #1st issue: should input a Tree() as 2nd parameter
        for child in d[k]:
            build_tree(child) #2nd issue: should populate tree, not iterate recursively over children keys

#I know that the root node is 66.
full_tree = build_tree(66)
        

你很接近。关键是return 将新节点返回父节点并将其追加到父节点的子节点列表中。如果您的父列表在初始化时是固定的,只需使用一个临时列表,然后在访问并创建子项后创建父项。

这是一个最小的例子:

from collections import defaultdict, namedtuple

def build_tree(tree, root):
    if root:
        return Node(root, [build_tree(tree, x) for x in tree.get(root, [])])

def print_tree(root, indent=0):
    if root:
        print(" " * indent + str(root.val))
        
        for child in root.children:
            print_tree(child, indent + 2)

if __name__ == "__main__":
    A = [(61, 66), (50, 61), (68, 61), (33, 61), (57, 66), (72, 66), 
         (37, 68), (71, 33), (6, 50), (11, 37), (5, 37)]
    Node = namedtuple("Node", "val children")
    nodes = defaultdict(list)
    
    for child, parent in A:
        nodes[parent].append(child)

    print_tree(build_tree(nodes, 66))

输出:

66
  61
    50
      6
    68
      37
        11
        5
    33
      71
  57
  72

你在这段代码中提到了两个问题:

    tree = Tree(k, d[k]) #1st issue: should input a Tree() as 2nd parameter
    for child in d[k]:
        build_tree(child) #2nd issue: should populate tree, not iterate recursively over children keys

您可以通过将 for 循环移动到第二个参数来解决它们,以列表理解的形式并展开该列表,使它们成为参数。然后确保你的递归函数 returns 创建的树:

    return Tree(k, 
        *[build_tree(child) for child in d[k]]
    )

更多想法

与您的问题无关,但这里还有一些您可以使用的想法。

  • 最好让你的代码成为一个函数,你可以将 A 作为参数传递给它,这样字典的范围也只是该函数的本地范围,并且不会乱扔垃圾全局范围。

  • 由于此功能与 Tree class 密切相关,最好将其定义为 [=] 中的静态或 class 方法42=].

  • 当您拥有树的(子、父)元组时,这些元组将隐式定义哪个节点是根节点,因此您可以省略将文字 66 传递给您的函数。该函数应该能够自己找出哪个是根。在创建字典时,它还可以收集哪些节点有父节点。根就是不在该集合中的节点。

所以把所有这些放在一起你会得到这个:

from collections import defaultdict

class Tree:
    def __init__(self, node, *children):
        self.node = node
        self.children = children if children else []
    
    def __str__(self): 
        return "%s" % (self.node)
    
    def __repr__(self):
        return "%s" % (self.node)

    def __getitem__(self, k):
        if isinstance(k, int) or isinstance(k, slice): 
            return self.children[k]
        if isinstance(k, str):
            for child in self.children:
                if child.node == k:
                    return child

    def __iter__(self):
        return self.children.__iter__()

    def __len__(self):
        return len(self.children)

    @classmethod
    def from_pairs(Cls, pairs):
        # Turn pairs into nested dictionary
        d = defaultdict(list)
        children = set()
        for child, parent in pairs:
            d[parent].append(child)
            # collect nodes that have a parent
            children.add(child)
        
        # Find root: it does not have a parent
        root = next(parent for parent in d if parent not in children)

        # Build nested Tree instances recursively from the dictionary
        def subtree(k):
            return Cls(k, *[subtree(child) for child in d[k]])

        return subtree(root)

# Sample run
A = [(61, 66), (50, 61), (68, 61), (33, 61), (57, 66), (72, 66), (37, 68), (71, 33), (6, 50), (11, 37), (5, 37)]

tree = Tree.from_pairs(A)

这是一个了解可重用模块和相互递归的机会。此答案中的解决方案解决了您的特定问题,而无需对 1 中编写的模块进行任何修改。指出这一点很重要,因为它展示了泛型函数如何促进代码重用并减少错误进入您的程序的机会。

首先,我们将定义特定于您的 (id, parent) 输入结构形状的函数 -

# main.py

def id(node):
  return node[0]

def parent(node):
  return node[1]

n = (12,34)

id(n)     # => 12
parent(n) # => 34

也许知道根节点是66,但这对我们的程序来说很难推断但对我们来说很容易定义。让我们在输入数据中明确包含 (66, None),其中 parent=None 表示 root 节点 -

A = \
  [ (61, 66), (50, 61), (68, 61), (33, 61)
  , (57, 66), (72, 66), (37, 68), (71, 33)
  , (6, 50), (11, 37), (5, 37), (66, None) # don't forget root node, 66
  ]

现在我们可以使用 tree 模块轻松构建我们的树 -

# main.py

from tree import tree

def id #...
def parent #...

A = [ ... ]

B = tree \
  ( A                                # list of nodes
  , parent                           # foreign key
  , lambda node, children:           # node reconstructor
      (id(node), children(id(node))) # primary key 
  )

print(B)
# [(66, [(61, [(50, [(6, [])]), (68, [(37, [(11, []), (5, [])])]), (33, [(71, [])])]), (57, []), (72, [])])]

注意 tree 如何不关心输入的形状; 任何节点结构都可以使用。 tree 函数很灵活,允许我们构造与输入节点完全不同形状的树节点 -

# main.py

from tree import tree
from json import dumps

def id #...
def parent #...

A = [ ... ]

C = tree \
  ( A
  , parent
  , lambda node, children:
      dict([("id", id(node)), ("children", children(id(node)))])
  )

print(dumps(C))
[ { "id": 66
  , "children":
      [ { "id": 61
        , "children":
            [ { "id": 50
              , "children":
                  [ { "id": 6, "children": [] }
                  ]
              }
            , { "id": 68
              , "children":
                [ { "id": 37
                  , "children":
                      [ { "id": 11, "children": [] }
                      , { "id": 5, "children": [] }
                      ]
                  }
                ]
              }
            , { "id": 33
              , "children":
                  [ { "id": 71, "children": [] }
                  ]
              }
            ]
        }
      , { "id": 57, "children": [] }
      , { "id": 72, "children": [] }
      ]
  }
]

现在我们可以看看tree的实现。请注意 tree 如何不对输入节点的形状做出任何假设 -

# tree.py

from index import index, get

def empty():
  return []

def tree (all, indexer, maker, root = None):
  mem = index(all, indexer)

  def many(all):
    return list(map(one, all))
  
  def one(single):
    return maker(single, lambda r: many(get(mem, r, empty())))
  
  return many(get(mem, root))

我们对 tree 的实现依赖于另一个模块 index。分组数据结构,如 index,以及对这些数据结构进行操作的函数是在模块之间划定界限的好方法。这里也没有关于输入形状的假设 -

# index.py

from functools import reduce

def empty():
  return {}

def update(t, k, f):
  if k in t:
    return { **t, k: f(get(t, k)) }
  else:
    return { **t, k: f() }

def get(t, k, default = None):
  if k in t:
    return t[k]
  else:
    return default

def append(t, k, v):
  return update(t, k, lambda r = []: [ *r, v ])

def index(ls, indexer):
  return reduce \
    ( lambda t, v: append(t, indexer(v), v)
    , ls
    , empty()
    )

通过 运行 在您的浏览器中验证我们的结果:run this program on repl.it


1 个模块移植到 Python。原程序写在JavaScript.