select 在二分网络中共享公共邻居的所有顶点对的有效方法

Efficient way to select all pairs of vertices that share common neighbors in a bipartite network

我需要 select 在二分网络中共享公共邻居的所有相同类型的顶点对

例如:

在这张图中,我有:(A,B)、(A,C)、(B,C)、(C,D)、(1,2) 和 (2,3)

我可以用二次解法求解:

import igraph
import time

def common_neighbors(adjlist, i, j):
    return len(adjlist[i].intersection(adjlist[j]))

def matching_1(graph, row, column):
    adjlist = map(set, graph.get_adjlist())
    matching = []
    dict_edges = dict()
    for i in range(row):
        for j in range(i+1, row):
            if common_neighbors(adjlist, i, j) > 0:
                matching.append((i, j))
    dict_edges = dict()
    for i in range(row, row+column):
        for j in range(i+1, row+column):
            if common_neighbors(adjlist, i, j) > 0:
                matching.append((i, j))
    return matching

def matching_2(graph, row, column):
    adjlist = map(set, graph.get_adjlist())
    matching = []
    for vertex in range(row):
        twohops = set((twohop for onehop in adjlist[vertex] for twohop in adjlist[onehop])) -set([vertex])
        for twohop in twohops:
            matching.append((vertex, twohop))
    for vertex in range(row, row+column):
        twohops = set((twohop for onehop in adjlist[vertex] for twohop in adjlist[onehop])) -set([vertex])
        for twohop in twohops:
            matching.append((vertex, twohop))
    return matching

if __name__ == "__main__":

    row, column = 500, 500
    graph = igraph.Graph.Full_Bipartite(row, column)

    tp_start = time.time()
    m = matching_1(graph, row, column)
    print "%.4f" % (time.time()-tp_start)

    tp_start = time.time()
    m = matching_2(graph, row, column)
    print "%.4f" % (time.time()-tp_start)

这是我能想到的最符合逻辑的方法。如果有人知道更有效的方法,我会洗耳恭听。任何帮助将不胜感激。

您基本上是在计算原始网络的两个二分投影的边列表。 igraph 有一个内置的方法,假设你的顶点有一个名为 type 的顶点属性,它描述了顶点属于图的哪一部分:

>>> g=Graph.Formula("A--1--B--2--C--3--D, A--2")
>>> g.vs["type"] = [name in "ABCD" for name in g.vs["name"]]
>>> g1, g2 = g.bipartite_projection()
>>> [tuple(g1.vs[pair]["name"]) for pair in g1.get_edgelist()]
[('1', '2'), ('2', '3')]
>>> [tuple(g2.vs[pair]["name"]) for pair in g2.get_edgelist()]
[('A', 'B'), ('A', 'C'), ('B', 'C'), ('C', 'D')]