通过避免 for 循环从节点列表构建邻接矩阵

Question

我需要解决什么？

从索引列表构建二进制矩阵。这是我如何进行，但我想获得一种有效的方法来避免循环

输入：

list_indices =[
[0,3,4],
[2,1,0],
[3,5]
]

预期输出：

results=[
[0,1,1,1,1,0],
[1,0,1,0,0,0],
[0,0,0,0,0,0],
[1,0,0,0,1,1],
[1,0,0,1,0,0],
[0,0,1,0,0,0],
]

结果对应于从索引列表构造的二进制邻接（对称）矩阵。 results中的1对应的是属于list_indices.

同一行的一对索引

来自list_indices的对是：

row 1 : (0,3), (3,0), (0,4),(4,0), (3,4), (4,3)
row 2 : (0,1), (1,0), (2,0), (0,2),(1,2), (2,1)
row 3 : (3,5), (5,3)


number of column and number of rows in results = np.max(list_indices)+1=6

我试过什么？

results=np.zeros((np.max(list_indices)+1,np.max(list_indices)+1))

for pair in itertools.combinations(list_indices, r=2) :
                      
         results[pair[0],pair[1]]=results[pair[1],pair[0]]=1.0

构建它的有效方法是什么？ （避免循环）

itertools.combinations returns 用于填充矩阵结果的对列表。由于矩阵是对称的，itertools.combinations 提供了对应于上对角矩阵的对列表。对角线设置为零

Answer 1

这个问题与我 10 天前的调查密切相关，所以我将 post 在这里总结最重要的发现。

将社区存储为长度不平衡的列表会强制使用效率不高的迭代或串联。相反，您可以使用单个数组并像这样计数：
```
    flow = [0,3,4,2,1,0,3,5]
    counts = [3,3,2]
```

单个组合的更快方法是 np.triu_indices 方法而不是 itertools.combinations:

def combs(t):
     x, y = np.triu_indices(len(t), 1)
     return np.transpose([t[x], t[y]])

我的解决方案中指出您正在寻找如何避免连接和列表理解：

np.concatenate([combs(n) for n in list_indices])

或者，或者 (from itertools import*):

np.array(list(chain(*[combinations(n,2) for n in list_indices])))

我找到了几种根据您的输入对其进行矢量化的方法：

def repeat_blocks(a, sizes, repeats):
    #Thanks @Divakar: 
    r1 = np.repeat(np.arange(len(sizes)), repeats)
    N = (sizes*repeats).sum() # or np.dot(sizes, repeats)
    id_ar = np.ones(N, dtype=int)
    id_ar[0] = 0
    insert_index = sizes[r1[:-1]].cumsum()
    insert_val = (1-sizes)[r1[:-1]]
    insert_val[r1[1:] != r1[:-1]] = 1
    id_ar[insert_index] = insert_val
    out = a[id_ar.cumsum()] 
    return out

def concat_combs1(flow, counts):
    #way 1 based on repetition of blocks of consecutive items
    col1 = repeat_blocks(flow, counts, counts)
    col2 = np.repeat(flow, np.repeat(counts, counts))
    return np.transpose([col1, col2])[col1 < col2]

def concat_combs2(targets, counts):
    #way 2 based on repetition of blocks dissociated from each other
    counts = list(map(len, targets))
    col1 = np.concatenate(np.repeat(targets, counts, axis=0))
    col2 = np.repeat(np.concatenate(targets), np.repeat(counts, counts))
    return np.transpose([col1, col2])[col1 < col2]

测试：

list_indices = [np.array([0,3,4]), np.array([2,1,0]), np.array([3,5])]
flow = np.array([0,3,4,2,1,0,3,5])
counts = np.array([3, 3, 2])
# Usage:
np.concatenate([combs(n) for n in list_indices])
concat_combs1(flow, counts)
concat_combs2(list_indices)

输出：

array([[0, 3],
       [0, 4],
       [3, 4],
       [1, 2],
       [0, 2],
       [0, 1],
       [3, 5]])

结论

perfplotigraph.Graph.Barabasi(n = x, m = 3) 上有四种方法，包括 itertools.combinations 和 np.triu_indices。该图的每个顶点平均有 3 个邻居。总之，效果最好。这次 numpy 数组的串联比链接组合慢，因为正在串联非常多的小列表。

最终解决方案

为了以最快的方式构建关联矩阵，您需要应用 concat_combs1 方法的一个小变体：

flow = np.array([0,3,4,2,1,0,3,5])
counts = np.array([3,3,2])
results = np.zeros((np.max(flow)+1, np.max(flow)+1), dtype=int)
col1 = repeat_blocks(flow, counts, counts)
col2 = np.repeat(flow, np.repeat(counts, counts))
results[col1, col2] = 1
np.fill_diagonal(results, 0)

输出

[[0 1 1 1 1 0]
 [1 0 1 0 0 0]
 [1 1 0 0 0 0]
 [1 0 0 0 1 1]
 [1 0 0 1 0 0]
 [0 0 0 1 0 0]]

通过避免 for 循环从节点列表构建邻接矩阵

Build adjacency matrix from a list of nodes by avoiding for loops

python

numpy

list

itertools

adjacency-matrix

结论

最终解决方案

输出