从节点列表和边列表中查找连通性
Finding connectedness from a list of nodes and a list of edges
(tl;dr)
Given a collection of nodes defined as a dictionary of points, and a collection of edges defined as a dictionary of key tuples, is there an algorithm in python to easily find consecutive segments?
(上下文:)
我有两个模拟路网路段的文件。
1;-5226574;-3118329 # latitude and longitude as integers (multiplied by 1e5)
2;-5226702;-3118330
3;-5226750;-3118332
4;-5226793;-3118338
...
1;1;2
2;3;5
3;23;345
4;23;11
...
每条边代表两个节点之间的(索引)link,由节点索引。
生成的网络的一个子部分绘制如下:
如你所见,绝大多数节点是"simple"节点,这意味着它位于路段的中间并且属于恰好两条边。另一方面,有 "special" 个节点,表示它们代表分叉点或十字路口,因为它属于三个或更多边。
目前,我有一组孤立的路段,但我想将两个特殊节点之间的每个路段定义为节点的序列。它使绘制、测量距离等一切变得更快,它还允许我将每个节点序列表示为 "super edge" linking 两个特殊节点,从而简化拓扑。
我可以很容易地想象出一些蛮力的方法来做到这一点,但是节点的数量相对较多,而且我没有理论背景来指出我解决这个问题的方法。
更新:
我有 created a gist 原始数据。每条线将一条道路表示为一系列点(纬度、经度),并且这些道路有很多重叠。我的目标是从文件中的 "list of roads" 生成节点和 links 的字典。
您可以使用以下 python 脚本访问内容:
with open('RawRoads.txt') as roadsFile:
for line in roadsFile.readlines():
road = [tuple(map(lambda x:int(float(x)*1e5), coord.split(','))) for coord in line.strip().split(' ')]
否则:
import urllib
url = "https://gist.githubusercontent.com/heltonbiker/ca043f8ee191db5bf8349b1b7af0394c/raw/RawRoads.txt"
lines = urllib.urlopen(url).readlines()
for line in lines:
# you got the idea
让我们不要太野蛮。我认为我们可以通过构建一个简单的列表列表来做得很好,例如 edge[i] 是一个最多三个元素的列表,节点 i 已连接。如果你的节点数很密集并且从 0 附近开始,你可以使用列表;如果他们不是,我将使用一个目录。
我从 edges.txt 以
的形式构建了一个列表
edge_list =
[(1,2),
(2,3),
(3,5),
(2, 23),
(23,345),
(23,11), ...]
现在构建双向边参考目录:
接下来,识别特殊节点,即顺序不是 2 的节点:交叉点和地图边缘。然后我们选择一个并构建一个段,直到我们击中另一个。
# Dictionary of edges, indexed in both directions by node number.
edge = {}
# Ingest the data and build teh dictionary
with open("edges.txt") as efile:
for line in efile:
eid, src, dst = line.strip().split(';')
src = int(src)
dst = int(dst)
for key, val in [(src, dst), (dst, src)] :
if key in edge:
edge[key].append(val)
else:
edge[key] = ([val])
print "edge dictionary has entries:", len(edge)
# Identify endpoint nodes: order other than 2
end_ct = 0
print "Endpoint Nodes"
endpoint = []
for src, dst in edge.iteritems():
if len(dst) != 2:
print len(dst), src, dst
endpoint.append(src)
end_ct += len(dst)
print end_ct, "road ends"
atlas = [] # List of roads, each a list of nodes
# Build roads between the identified endpoints
# Pick the first endpoint in the remaining list.
# Move to the first-listed adjacent node.
# Keep going until we hit another node on the endpoint list.
while len(endpoint) > 0:
here = endpoint[0]
# print "Road starting at", here, edge[here]
# Pick a first step and consume the edge
next = edge[here].pop(0)
edge[next].remove(here)
road = [here, next]
# If that's the last connection to the node, remove that node from the endpoints list.
if len(edge[here]) == 0:
del endpoint[0]
del edge[here]
# Special case for a one-segment road; endpoint entry of "next" is removed after the loop
if len(edge[next]) == 0:
del edge[next]
# Consume edges until we reach another endpoint.
debug = False
while next not in endpoint:
here = next
next = edge[here].pop(0)
edge[next].remove(here)
road.append(next)
if len(edge[next]) == 0:
del edge[next]
# print "removing node", next
if next not in edge:
endpoint.remove(next)
# print "removing endpoint", next
print "\nRoad from", road[0], "to", road[-1], ':\n\t', road
atlas.append(road)
print "\n", len(atlas), "roads built"
# print "edge dictionary still has entries:", len(edge)
根据 OP 编辑:
它有效,快速且正确,我发现它值得可视化:
import matplotlib.pyplot as plt
for road in atlas:
path = [nodesdict[i] for i in road]
lons, lats = zip(*path)
plt.plot(lons, lats)
plt.grid()
plt.axis('equal')
plt.show()
(tl;dr)
Given a collection of nodes defined as a dictionary of points, and a collection of edges defined as a dictionary of key tuples, is there an algorithm in python to easily find consecutive segments?
(上下文:)
我有两个模拟路网路段的文件。
1;-5226574;-3118329 # latitude and longitude as integers (multiplied by 1e5)
2;-5226702;-3118330
3;-5226750;-3118332
4;-5226793;-3118338
...
1;1;2
2;3;5
3;23;345
4;23;11
...
每条边代表两个节点之间的(索引)link,由节点索引。
生成的网络的一个子部分绘制如下:
如你所见,绝大多数节点是"simple"节点,这意味着它位于路段的中间并且属于恰好两条边。另一方面,有 "special" 个节点,表示它们代表分叉点或十字路口,因为它属于三个或更多边。
目前,我有一组孤立的路段,但我想将两个特殊节点之间的每个路段定义为节点的序列。它使绘制、测量距离等一切变得更快,它还允许我将每个节点序列表示为 "super edge" linking 两个特殊节点,从而简化拓扑。
我可以很容易地想象出一些蛮力的方法来做到这一点,但是节点的数量相对较多,而且我没有理论背景来指出我解决这个问题的方法。
更新:
我有 created a gist 原始数据。每条线将一条道路表示为一系列点(纬度、经度),并且这些道路有很多重叠。我的目标是从文件中的 "list of roads" 生成节点和 links 的字典。
您可以使用以下 python 脚本访问内容:
with open('RawRoads.txt') as roadsFile:
for line in roadsFile.readlines():
road = [tuple(map(lambda x:int(float(x)*1e5), coord.split(','))) for coord in line.strip().split(' ')]
否则:
import urllib
url = "https://gist.githubusercontent.com/heltonbiker/ca043f8ee191db5bf8349b1b7af0394c/raw/RawRoads.txt"
lines = urllib.urlopen(url).readlines()
for line in lines:
# you got the idea
让我们不要太野蛮。我认为我们可以通过构建一个简单的列表列表来做得很好,例如 edge[i] 是一个最多三个元素的列表,节点 i 已连接。如果你的节点数很密集并且从 0 附近开始,你可以使用列表;如果他们不是,我将使用一个目录。
我从 edges.txt 以
的形式构建了一个列表edge_list = [(1,2), (2,3), (3,5), (2, 23), (23,345), (23,11), ...]
现在构建双向边参考目录:
接下来,识别特殊节点,即顺序不是 2 的节点:交叉点和地图边缘。然后我们选择一个并构建一个段,直到我们击中另一个。
# Dictionary of edges, indexed in both directions by node number.
edge = {}
# Ingest the data and build teh dictionary
with open("edges.txt") as efile:
for line in efile:
eid, src, dst = line.strip().split(';')
src = int(src)
dst = int(dst)
for key, val in [(src, dst), (dst, src)] :
if key in edge:
edge[key].append(val)
else:
edge[key] = ([val])
print "edge dictionary has entries:", len(edge)
# Identify endpoint nodes: order other than 2
end_ct = 0
print "Endpoint Nodes"
endpoint = []
for src, dst in edge.iteritems():
if len(dst) != 2:
print len(dst), src, dst
endpoint.append(src)
end_ct += len(dst)
print end_ct, "road ends"
atlas = [] # List of roads, each a list of nodes
# Build roads between the identified endpoints
# Pick the first endpoint in the remaining list.
# Move to the first-listed adjacent node.
# Keep going until we hit another node on the endpoint list.
while len(endpoint) > 0:
here = endpoint[0]
# print "Road starting at", here, edge[here]
# Pick a first step and consume the edge
next = edge[here].pop(0)
edge[next].remove(here)
road = [here, next]
# If that's the last connection to the node, remove that node from the endpoints list.
if len(edge[here]) == 0:
del endpoint[0]
del edge[here]
# Special case for a one-segment road; endpoint entry of "next" is removed after the loop
if len(edge[next]) == 0:
del edge[next]
# Consume edges until we reach another endpoint.
debug = False
while next not in endpoint:
here = next
next = edge[here].pop(0)
edge[next].remove(here)
road.append(next)
if len(edge[next]) == 0:
del edge[next]
# print "removing node", next
if next not in edge:
endpoint.remove(next)
# print "removing endpoint", next
print "\nRoad from", road[0], "to", road[-1], ':\n\t', road
atlas.append(road)
print "\n", len(atlas), "roads built"
# print "edge dictionary still has entries:", len(edge)
根据 OP 编辑:
它有效,快速且正确,我发现它值得可视化:
import matplotlib.pyplot as plt
for road in atlas:
path = [nodesdict[i] for i in road]
lons, lats = zip(*path)
plt.plot(lons, lats)
plt.grid()
plt.axis('equal')
plt.show()