从 shortest_distance 函数返回的距离图缺少某些顶点的条目
Distance map returned from shortest_distance function misses entries of certain vertices
我在 postgres 数据库中有一个网络,我可以在其中使用 pgrouting 扩展进行路由。我已经读入内存,现在想计算0.1小时内所有节点距特定起始节点的距离:
dm = G.new_vp("double", np.inf)
gt.shortest_distance(G, source=nd[102481678], weights=wgts, dist_map = dm, max_dist=0.1)
其中 wgts 是包含每条边权重的 EdgePropertyMap,nd 是从外部 id 获取顶点索引的反向映射。
在 pgRouting 中,这提供了 349 个可达节点,仅使用图形工具 328 个。结果或多或少相同(例如,最远的节点相同,成本完全相同,两个列表中的节点具有相同的距离), 但图形工具距离图似乎只是错过了某些节点。奇怪的是,我发现了一个标有距离的死胡同节点(下数第二个),但是连接死胡同与外界的节点却不见了。看起来很奇怪,因为如果连接节点不可达,死胡同也将不可达。
我编译了一个MWE:https://gofile.io/d/YpgjSw
下面是python代码:
import graph_tool.all as gt
import numpy as np
import time
# construct list of source, target, edge-id (edge-id not really used in this example)
l = []
with open('nw.txt') as f:
rows = f.readlines()
for row in rows:
id = int(row.split('\t')[0])
source = int(row.split('\t')[1])
target = int(row.split('\t')[2])
l.append([source, target, id])
l.append([target, source, id])
print len(l)
# construct graph
G = gt.Graph(directed=True)
G.ep["edge_id"] = G.new_edge_property("int")
n = G.add_edge_list(l, hashed=True, eprops=G.ep["edge_id"])
# construct a dict for mapping outside node-id's to internal id's (node indexes)
nd = {}
i = 0
for x in n:
nd[x] = i
i = i + 1
# construct a dict for mapping (source, target) combis to a cost and reverse cost
db_wgts = {}
with open('costs.txt') as f:
rows = f.readlines()
for row in rows:
source = int(row.split('\t')[0])
target = int(row.split('\t')[1])
cost = float(row.split('\t')[2])
reverse_cost = float(row.split('\t')[3])
db_wgts[(source, target)] = cost
db_wgts[(target, source)] = reverse_cost
# construct an edge property and fill it according to previous dict
wgts = G.new_edge_property("double")
i = 0
for e in G.edges():
i = i + 1
print i
print e
s = n[int(e.source())]
t = n[int(e.target())]
try:
wgts[e] = db_wgts[(s, t)]
except KeyError:
# this was necessary
wgts[e] = 1000000
# calculate shortest distance to all nodes within 0.1 total cost from source-node with outside-id of 102481678
dm = G.new_vp("double", np.inf)
gt.shortest_distance(G, source=nd[102481678], weights=wgts, dist_map = dm, max_dist=0.1)
# some mumbo-jumbo for getting the result in a nice node-id: cost format
ar = dm.get_array()
idxs = np.where(dm.get_array() < 0.1)
vals = ar[ar < 0.1]
final_res = [(i, v) for (i,v) in zip(list(idxs[0]), list(vals))]
final_res.sort(key=lambda tup: tup[1])
for x in final_res:
print n[x[0]], x[1]
# output saved in result_missing_nodes.txt
# 328 records, should be 349
为了说明(其中一个)缺失的节点:
>>> dm[nd[63447311]]
0.0696234786274957
>>> dm[nd[106448775]]
0.06165528930577409
>>> dm[nd[127601733]]
inf
>>> dm[nd[100428293]]
0.0819900275163846
>>>
这似乎不可能,因为这是网络的本地布局,标签是上面引用的 id:
这是一个数值精度问题。您具有非常低的边缘权重 (1e-6) 和非常大的值 (1000000),这会导致差异丢失到有限精度。如果将所有值 1000000(我假设是无限权重)替换为 numpy.inf
,您实际上会得到更稳定的计算,并且在您的示例中没有丢失节点。
一个更好的选择是实际删除 "infinite weight"
使用边缘过滤器的边缘:
u = GraphView(G, efilt=wgts.fa < 1000000)
并计算距离。
我在 postgres 数据库中有一个网络,我可以在其中使用 pgrouting 扩展进行路由。我已经读入内存,现在想计算0.1小时内所有节点距特定起始节点的距离:
dm = G.new_vp("double", np.inf)
gt.shortest_distance(G, source=nd[102481678], weights=wgts, dist_map = dm, max_dist=0.1)
其中 wgts 是包含每条边权重的 EdgePropertyMap,nd 是从外部 id 获取顶点索引的反向映射。
在 pgRouting 中,这提供了 349 个可达节点,仅使用图形工具 328 个。结果或多或少相同(例如,最远的节点相同,成本完全相同,两个列表中的节点具有相同的距离), 但图形工具距离图似乎只是错过了某些节点。奇怪的是,我发现了一个标有距离的死胡同节点(下数第二个),但是连接死胡同与外界的节点却不见了。看起来很奇怪,因为如果连接节点不可达,死胡同也将不可达。
我编译了一个MWE:https://gofile.io/d/YpgjSw
下面是python代码:
import graph_tool.all as gt
import numpy as np
import time
# construct list of source, target, edge-id (edge-id not really used in this example)
l = []
with open('nw.txt') as f:
rows = f.readlines()
for row in rows:
id = int(row.split('\t')[0])
source = int(row.split('\t')[1])
target = int(row.split('\t')[2])
l.append([source, target, id])
l.append([target, source, id])
print len(l)
# construct graph
G = gt.Graph(directed=True)
G.ep["edge_id"] = G.new_edge_property("int")
n = G.add_edge_list(l, hashed=True, eprops=G.ep["edge_id"])
# construct a dict for mapping outside node-id's to internal id's (node indexes)
nd = {}
i = 0
for x in n:
nd[x] = i
i = i + 1
# construct a dict for mapping (source, target) combis to a cost and reverse cost
db_wgts = {}
with open('costs.txt') as f:
rows = f.readlines()
for row in rows:
source = int(row.split('\t')[0])
target = int(row.split('\t')[1])
cost = float(row.split('\t')[2])
reverse_cost = float(row.split('\t')[3])
db_wgts[(source, target)] = cost
db_wgts[(target, source)] = reverse_cost
# construct an edge property and fill it according to previous dict
wgts = G.new_edge_property("double")
i = 0
for e in G.edges():
i = i + 1
print i
print e
s = n[int(e.source())]
t = n[int(e.target())]
try:
wgts[e] = db_wgts[(s, t)]
except KeyError:
# this was necessary
wgts[e] = 1000000
# calculate shortest distance to all nodes within 0.1 total cost from source-node with outside-id of 102481678
dm = G.new_vp("double", np.inf)
gt.shortest_distance(G, source=nd[102481678], weights=wgts, dist_map = dm, max_dist=0.1)
# some mumbo-jumbo for getting the result in a nice node-id: cost format
ar = dm.get_array()
idxs = np.where(dm.get_array() < 0.1)
vals = ar[ar < 0.1]
final_res = [(i, v) for (i,v) in zip(list(idxs[0]), list(vals))]
final_res.sort(key=lambda tup: tup[1])
for x in final_res:
print n[x[0]], x[1]
# output saved in result_missing_nodes.txt
# 328 records, should be 349
为了说明(其中一个)缺失的节点:
>>> dm[nd[63447311]]
0.0696234786274957
>>> dm[nd[106448775]]
0.06165528930577409
>>> dm[nd[127601733]]
inf
>>> dm[nd[100428293]]
0.0819900275163846
>>>
这似乎不可能,因为这是网络的本地布局,标签是上面引用的 id:
这是一个数值精度问题。您具有非常低的边缘权重 (1e-6) 和非常大的值 (1000000),这会导致差异丢失到有限精度。如果将所有值 1000000(我假设是无限权重)替换为 numpy.inf
,您实际上会得到更稳定的计算,并且在您的示例中没有丢失节点。
一个更好的选择是实际删除 "infinite weight" 使用边缘过滤器的边缘:
u = GraphView(G, efilt=wgts.fa < 1000000)
并计算距离。