如何使用 networkx 计算与有向图的最低根的总距离

How to calculate overall distances from lowest root(s) of a directed graph with networkx

如果你看一下这个 DAG(有向无环图):

我想创建一个字典,它映射从最低节点到所有其他节点的距离,这类似于渲染图中每个节点从底部开始的 x 位置(高度)。 对于给定的图表,它将是:

distance_nodes_map: {
  0: {'base-zero', 'base-one'}, 
  1: {'low-b', 'low-a', 'low-c'}, 
  3: {'high-x', 'high-z', 'high-y'}, 
  2: {'mid-r', 'mid-q', 'mid-p'}, 
  4: {'super'}
}

我写了一个适用于上面那个图的算法,但后来我测试了另一个图,但它不再起作用了。我尝试了一些算法和函数,例如最短路径或 descendants_at_distance,但我认为它们作为计算距离的输入并没有真正的帮助。

我的算法不适用于此图:

https://gist.github.com/timaschew/3b08a07243fa6f43773014ef5e705c96

这是要点,其中包含:

未经测试的伪代码,因为我的午休时间快结束了:

你有一棵多根树,其中有一个主根。

  1. 为每个根创建一个子图,其中包含该根的所有可达节点。

  2. 从主根(根a)开始,计算distance/shortest到相应子图A中所有节点的根的路径长度。

  3. 找到所有与主子图共享至少一个节点的子图,select与主子图距离最短的节点(节点x)的子图(子图B)主根。

  4. 计算子图B中所有节点到根b的距离。加上距离d(节点x,根a)。减去距离 d(node x, root b).

  5. 创建子图 A 和 B 的并集。重复步骤 3-5 直到没有根。

  6. 减去最大距离&反转符号使得主根具有最大的distance/order值。

编辑:

我的伪代码有效 (*)。我责怪用户错误。 ;-)

#!/usr/bin/env python
"""

"""
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx


def hierarchical_layout(graph):

    longest_path = nx.algorithms.dag.dag_longest_path(graph)
    principal_root = longest_path[0]

    roots = [node for node, degree in list(graph.in_degree) if degree==0]
    subgraphs = {root : create_subgraph(graph, root) for root in roots}

    # Starting with the principal root (root a), compute the
    # longest path length to the root for all nodes in the
    # corresponding subgraph A.
    node_to_level = single_source_longest_dag_path_length(subgraphs[principal_root], principal_root)

    explored = subgraphs[principal_root]
    del subgraphs[principal_root]

    while len(explored) < len(graph):

        # Find all subgraphs that share at least one node with the
        # principal subgraph, and select the subgraph (subgraph B) that
        # has the node (node x) with the smallest distance to the
        # principal root.
        minimum_cost = np.inf
        minimum_cost_node = None
        minimum_cost_root = None
        for root, subgraph in subgraphs.items():
            for node in subgraph.nodes:
                if node in node_to_level:
                    if node_to_level[node] < minimum_cost:
                        minimum_cost = node_to_level[node]
                        minimum_cost_node = node
                        minimum_cost_root = root

        assert minimum_cost_node, "Could not find a connected subgraph."

        # Compute the distance to the root b for all nodes in subgraph
        # B. Add the distance d(node x, root a). Subtract the distance
        # d(node x, root b).
        path_lengths = [len(path) for path in nx.all_simple_paths(subgraphs[minimum_cost_root], minimum_cost_root, minimum_cost_node)]
        offset = np.max(path_lengths) - 1
        for node, distance in single_source_longest_dag_path_length(subgraphs[minimum_cost_root], minimum_cost_root).items():
            if not node in node_to_level:
                node_to_level[node] = distance + minimum_cost - offset

        # Create the union of subgraph A and B.
        explored = nx.compose(explored, subgraphs[minimum_cost_root])
        del subgraphs[minimum_cost_root]

    return node_to_level


def create_subgraph(G, node):
    # 
    nodes = nx.single_source_shortest_path(G,node).keys()
    return G.subgraph(nodes)


def single_source_longest_dag_path_length(graph, s):
    # from AlaskaJoslin's comment to 
    dist = dict.fromkeys(graph.nodes, -float('inf'))
    dist[s] = 0
    topo_order = nx.topological_sort(graph)
    for n in topo_order:
        for s in graph.successors(n):
            if dist[s] < dist[n] + 1:
                dist[s] = dist[n] + 1
    return dist


if __name__ == '__main__':

    # edge_list = [
    #     ("n10", "n11"),
    #     ("n11", "n12"),
    #     ("n12", "n13"),
    #     ("n13", "n14"),
    #     ("n20", "n21"),
    #     ("n20", "n14"),
    #     ("n21", "n22"),
    #     ("n22", "n23"),
    #     ("n30", "n23"),
    #     ("n30", "n31"),
    #     ("n31", "n32"),
    # ]

    edge_list = [
        ("low-a", "base-one"),
        ("low-c", "base-zero"),
        ("low-c", "base-one"),
        ("mid-p", "low-b"),
        ("mid-p", "low-c"),
        ("mid-q", "low-a"),
        ("high-x", "mid-p"),
        ("high-y", "mid-p"),
        ("high-y", "base-zero"),
        ("high-z", "mid-q"),
        ("high-z", "mid-r"),
        ("high-z", "base-one"),
        ("super", "high-x"),
    ]

    graph = nx.DiGraph()
    graph.add_edges_from(edge_list)

    node_to_level = hierarchical_layout(graph)

    # reverse output format
    distance_nodes_map = dict()
    max_distance = np.max(list(node_to_level.values()))
    for node, distance in node_to_level.items():
        reversed_distance = max_distance - distance
        if reversed_distance in distance_nodes_map:
            distance_nodes_map[reversed_distance].add(node)
        else:
            distance_nodes_map[reversed_distance] = set([node])

    # print(distance_nodes_map)
    for ii, nodes in sorted(distance_nodes_map.items())[::-1]:
        print(f"{ii} : {nodes}")

产量:

    # 4 : {'super'}
    # 3 : {'high-x', 'high-y', 'high-z'}
    # 2 : {'mid-p', 'mid-r', 'mid-q'}
    # 1 : {'low-a', 'low-b', 'low-c'}
    # 0 : {'base-one', 'base-zero'}

(*)“减去距离d(节点x,根b)”自然意味着节点x和根b之间的最长路径长度。很明显。

您正在寻找一种绘制分层图的算法。有许多不同的算法,您应该选择最适合您需要的算法(例如,查看 Gansner 等人 的以下论文 A Technique for Drawing Directed Graphs)。

其中许多算法已经在 Graphviz(非常著名且功能强大的图形可视化软件)中实现。安装后,计算您要查找的结果非常简单(G 是使用 networkx.DiGraph 构建的有向无环图):

from networkx.drawing.nx_agraph import graphviz_layout

def get_distance_nodes_map(G):
    pos = graphviz_layout(G, prog='dot')
    coor = sorted({y for k, (x, y) in pos.items()})
    kmap = dict(zip(coor, range(len(coor))))
    distance_nodes_map = {level: set() for level in kmap.values()}
    for k, (x, y) in pos.items():
        distance_nodes_map[kmap[y]].add(k)
    return distance_nodes_map

以下是使用您提供的数据的几个示例:

>>> from networkx import DiGraph
>>> from pprint import PrettyPrinter
>>> pp = PrettyPrinter()
>>> G1 = DiGraph()
>>> G1.add_edges_from([('super', 'high-x'), ('high-x', 'mid-p'),
...                    ('mid-p', 'low-b'), ('mid-p', 'low-c'),
...                    ('low-c', 'base-zero'), ('low-c', 'base-one'),
...                    ('high-y', 'mid-p'), ('high-y', 'base-zero'),
...                    ('high-z', 'base-one'), ('high-z', 'mid-r'),
...                    ('high-z', 'mid-q'), ('mid-q', 'low-a'),
...                    ('low-a', 'base-one')])
>>> pp.pprint(get_distance_nodes_map(G1))
{0: {'base-one', 'base-zero'},
 1: {'low-a', 'low-b', 'low-c'},
 2: {'mid-p', 'mid-r', 'mid-q'},
 3: {'high-y', 'high-x', 'high-z'},
 4: {'super'}}
>>> G2 = DiGraph()
>>> G2.add_edges_from([('n10', 'n11'), ('n11', 'n12'), ('n12', 'n13'),
...                    ('n13', 'n14'), ('n20', 'n14'), ('n20', 'n21'),
...                    ('n21', 'n22'), ('n22', 'n23'), ('n30', 'n23'),
...                    ('n30', 'n31'), ('n31', 'n32')])
>>> pp.pprint(get_distance_nodes_map(G2))
{0: {'n32'},
 1: {'n31', 'n23'},
 2: {'n30', 'n22'},
 3: {'n21', 'n14'},
 4: {'n13', 'n20'},
 5: {'n12'},
 6: {'n11'},
 7: {'n10'}}