从 shortest_distance 函数返回的距离图缺少某些顶点的条目

Distance map returned from shortest_distance function misses entries of certain vertices

我在 postgres 数据库中有一个网络,我可以在其中使用 pgrouting 扩展进行路由。我已经读入内存,现在想计算0.1小时内所有节点距特定起始节点的距离:

dm = G.new_vp("double", np.inf)
gt.shortest_distance(G, source=nd[102481678], weights=wgts, dist_map = dm, max_dist=0.1)

其中 wgts 是包含每条边权重的 EdgePropertyMap,nd 是从外部 id 获取顶点索引的反向映射。

在 pgRouting 中,这提供了 349 个可达节点,仅使用图形工具 328 个。结果或多或少相同(例如,最远的节点相同,成本完全相同,两个列表中的节点具有相同的距离), 但图形工具距离图似乎只是错过了某些节点。奇怪的是,我发现了一个标有距离的死胡同节点(下数第二个),但是连接死胡同与外界的节点却不见了。看起来很奇怪,因为如果连接节点不可达,死胡同也将不可达。

我编译了一个MWE:https://gofile.io/d/YpgjSw

下面是python代码:

import graph_tool.all as gt
import numpy as np
import time

# construct list of source, target, edge-id (edge-id not really used in this example)
l = []
with open('nw.txt') as f:
    rows = f.readlines()
    for row in rows:
        id = int(row.split('\t')[0])
        source = int(row.split('\t')[1])
        target = int(row.split('\t')[2])
        l.append([source, target, id])
        l.append([target, source, id])

print len(l)

# construct graph
G = gt.Graph(directed=True)
G.ep["edge_id"] = G.new_edge_property("int")
n = G.add_edge_list(l, hashed=True, eprops=G.ep["edge_id"])

# construct a dict for mapping outside node-id's to internal id's (node indexes)
nd = {}
i = 0
for x in n:
    nd[x] = i
    i = i + 1

# construct a dict for mapping (source, target) combis to a cost and reverse cost
db_wgts = {}
with open('costs.txt') as f:
    rows = f.readlines()
    for row in rows:
        source = int(row.split('\t')[0])
        target = int(row.split('\t')[1])
        cost = float(row.split('\t')[2])
        reverse_cost = float(row.split('\t')[3])
        db_wgts[(source, target)] = cost
        db_wgts[(target, source)] = reverse_cost

# construct an edge property and fill it according to previous dict
wgts = G.new_edge_property("double")

i = 0
for e in G.edges():
    i = i + 1
    print i
    print e
    s = n[int(e.source())]
    t = n[int(e.target())]
    try:
        wgts[e] = db_wgts[(s, t)]
    except KeyError:
        # this was necessary
        wgts[e] = 1000000


# calculate shortest distance to all nodes within 0.1 total cost from source-node with outside-id of 102481678
dm = G.new_vp("double", np.inf)
gt.shortest_distance(G, source=nd[102481678], weights=wgts, dist_map = dm, max_dist=0.1)

# some mumbo-jumbo for getting the result in a nice node-id: cost format
ar = dm.get_array()
idxs = np.where(dm.get_array() < 0.1)
vals = ar[ar < 0.1]
final_res = [(i, v) for (i,v) in zip(list(idxs[0]), list(vals))]
final_res.sort(key=lambda tup: tup[1])  
for x in final_res:
    print n[x[0]], x[1]
# output saved in result_missing_nodes.txt
# 328 records, should be 349

为了说明(其中一个)缺失的节点:

>>> dm[nd[63447311]]
0.0696234786274957
>>> dm[nd[106448775]]
0.06165528930577409
>>> dm[nd[127601733]]
inf
>>> dm[nd[100428293]]
0.0819900275163846
>>> 

这似乎不可能,因为这是网络的本地布局,标签是上面引用的 id:

这是一个数值精度问题。您具有非常低的边缘权重 (1e-6) 和非常大的值 (1000000),这会导致差异丢失到有限精度。如果将所有值 1000000(我假设是无限权重)替换为 numpy.inf,您实际上会得到更稳定的计算,并且在您的示例中没有丢失节点。

一个更好的选择是实际删除 "infinite weight" 使用边缘过滤器的边缘:

u = GraphView(G, efilt=wgts.fa < 1000000)

并计算距离。