What is the optimal way to create a graph with the add_edge_list() method?

Question

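I am trying to create a large graph (close to 10^6 - 10^7 vertices) with the graph-tool library, and either fill a vertex property with the vertex names or use the names instead of vertex indices. I have: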

A list of names:
```
['50', '56', '568']
```

A set of edges that contains vertex names rather than vertex indices:

edge_list = {frozenset({'568', '56'}), frozenset({'56', '50'}), frozenset({'50', '568'})}

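Since add_edge_list() can create vertices that are not yet in the graph, I am trying to use it to fill an empty graph. It works fine, but when I try to get a vertex by its name, I get an error saying that there is no vertex with such an index.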

Here is the code of my program:

import graph_tool as grt  # assumed import; the post does not show how grt is defined
g = grt.Graph(directed=False)
edge_list = {frozenset({'568', '56'}), frozenset({'56', '50'}), frozenset({'50', '568'})}
ids = ['50', '56', '568']
g.add_edge_list(edge_list, hashed=True, string_vals=True)
print(g.vertex('50'))

Error message from print(g.vertex('50')):

ValueError: Invalid vertex index: 50

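I want to create the graph:

using only edge_list;
with fast access to a vertex by its name;
optimized for time (and RAM, if possible).

Is there a good way to do this?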

EDIT: current code:

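g = grt.Graph(directed=False)
g.add_vertex(len(ids))
vprop = g.new_vertex_property("string", vals=ids)
g.vp.user_id = vprop
# ids_dict is not shown in the post; assumed to map each name to its position in ids
ids_dict = {name: i for i, name in enumerate(ids)}
for vert1, vert2 in edge_list:
    g.add_edge(g.vertex(ids_dict[vert1]), g.vertex(ids_dict[vert2]))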

Answer 1

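If you have a dense graph with 10^6 - 10^7 vertices (is it medical data or a social graph? That can change everything), you should not use networkx: it is written in pure Python, so it is roughly 10-100 times slower than graph-tool or igraph. For your case I recommend graph-tool. It is the fastest Python graph-processing library, roughly on par with igraph.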

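graph-tool behaves differently from networkx. When you create a networkx node, its identifier is whatever you passed to the node constructor, so you can get the node by that ID. In graph-tool, every vertex ID is an integer from 0 to GRAPH_SIZE - 1: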

Each vertex in a graph has an unique index, which is always between 0 and N−1, where N is the number of vertices. This index can be obtained by using the vertex_index attribute of the graph (which is a property map, see Property maps), or by converting the vertex descriptor to an int.
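
To make the index rule concrete, here is a minimal pure-Python sketch of the idea. It is not graph-tool's internal implementation, and the helper name `index_edges` is made up for this illustration. It gives each name a contiguous index in order of first appearance and rewrites a name-based edge list as integer pairs:

```python
def index_edges(named_edges):
    """Map vertex names to contiguous indices 0..N-1 in first-seen order.

    Returns (name_to_index, index_to_name, int_edges).
    `index_edges` is a hypothetical helper, not part of graph-tool.
    """
    name_to_index = {}
    index_to_name = []
    int_edges = []
    for source, target in named_edges:
        pair = []
        for name in (source, target):
            if name not in name_to_index:
                # A name seen for the first time gets the next free index.
                name_to_index[name] = len(index_to_name)
                index_to_name.append(name)
            pair.append(name_to_index[name])
        int_edges.append(tuple(pair))
    return name_to_index, index_to_name, int_edges


# Tuples keep the example deterministic; a frozenset iterates in arbitrary order.
edges = [("568", "56"), ("56", "50"), ("50", "568")]
name_to_index, index_to_name, int_edges = index_edges(edges)

print(name_to_index)  # {'568': 0, '56': 1, '50': 2}
print(int_edges)      # [(0, 1), (1, 2), (2, 0)]
```

In this sketch the name '50' sits at index 2, so asking a three-vertex graph for index 50 has nothing to find.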

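All additional information about the graph, its vertices, or its edges is stored in property maps. When you call .add_edge_list() with hashed=True, it returns a new property map that holds the original names. So in your case you should handle your vertices like this: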

# Create graph
g = grt.Graph(directed=False)

# Create edge list
# Why frozensets? You don't really need them. You can use ordinary sets or tuples
edge_list = {
    frozenset({'568', '56'}),
    frozenset({'56', '50'}),
    frozenset({'50', '568'})
}

# Write returned PropertyMap to a variable!
vertex_ids = g.add_edge_list(edge_list, hashed=True, string_vals=True)

g.vertex(1)

Out [...]: <Vertex object with index '1' at 0x7f3b5edde4b0>

vertex_ids[1]

Out [...]: '56'

If you want to get a vertex by its ID, you have to build the mapping dict manually (well, I'm not a graph-tool guru, but I couldn't find a simpler solution):

very_important_mapping_dict = {vertex_ids[i]: i for i in range(g.num_vertices())}

Now you can easily get a vertex index:

very_important_mapping_dict['568']

Out [...]: 0

vertex_ids[0]

Out [...]: '568'
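
If the lookup is needed in several places, the mapping dict can be wrapped in a small helper. The following is only a sketch under assumptions: `build_name_lookup` is a hypothetical name, and a plain list stands in for the property map returned by add_edge_list(), since all the helper needs is something indexable by vertex index.

```python
def build_name_lookup(vertex_ids, num_vertices):
    """Invert an index -> name mapping into a name -> index dict.

    `vertex_ids` can be anything indexable by an integer vertex index.
    `build_name_lookup` is a hypothetical helper, not part of graph-tool.
    """
    lookup = {}
    for index in range(num_vertices):
        name = vertex_ids[index]
        if name in lookup:
            # Two vertices sharing a name would make the lookup ambiguous.
            raise ValueError(f"duplicate vertex name: {name!r}")
        lookup[name] = index
    return lookup


# Stand-in for the index -> name property map from the session above.
vertex_ids = ["568", "56", "50"]
lookup = build_name_lookup(vertex_ids, len(vertex_ids))

print(lookup["568"])  # 0
print(lookup["50"])   # 2
```

With graph-tool, the call would presumably be build_name_lookup(vertex_ids, g.num_vertices()), and the resulting index can be passed to g.vertex(). Building the dict is a single pass over the vertices and each lookup is constant time on average, at the cost of keeping every name in memory a second time.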
