Numpy-从二维数组中获取邻居矩阵
Numpy- get neighbors matrix from 2D array
我有一个无向图:
A,B
A,C
A,F
B,D
C,D
C,F
E,D
E,F
我需要将其转换为矩阵 N,其中
N[x,y] = 1 if x is neighbor of y (like A and B) and 0 if not
最快的方法是什么?
NOTA 该图存储为 numpy 字符串数组:
array([['A', 'B'],
['A', 'C'],
['A', 'F'],
['B', 'D'],
['C', 'D'],
['C', 'F'],
['E', 'D'],
['E', 'F']],
dtype='|S4')
预期输出:
A B C D E F
A 1 1 1 0 0 1
B 1 1 0 1 0 0
C 1 0 1 1 0 1
D 0 1 1 1 1 0
E 0 0 0 1 1 1
F 1 0 1 0 1 1
谢谢
使用networkx:
A=nx.Graph()
for x in a:
A.add_edge(x[0],x[1])
ad =nx.adjacency_matrix(A)
给予
matrix([[ 0., 1., 1., 0., 0., 1.],
[ 1., 0., 0., 0., 1., 1.],
[ 1., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 1., 1.],
[ 0., 1., 1., 1., 0., 0.],
[ 1., 1., 0., 1., 0., 0.]])
建议fill_diagonal
获取您预期输出中的自链接:
np.fill_diagonal(ad,1)
matrix([[ 1., 1., 1., 0., 0., 1.],
[ 1., 1., 0., 1., 0., 0.],
[ 1., 0., 1., 1., 0., 1.],
[ 0., 1., 1., 1., 1., 0.],
[ 0., 0., 0., 1., 1., 1.],
[ 1., 0., 1., 0., 1., 1.]])
纯粹的numpy
我会做类似
的事情
import numpy as np
data = np.array([['A', 'B'],
['A', 'C'],
['A', 'F'],
['B', 'D'],
['C', 'D'],
['C', 'F'],
['E', 'D'],
['E', 'F']], dtype='|S4')
# define vectorized mapping from letters to integers
ord_u = np.vectorize(lambda x: ord(x)-65)
# map entities of data to integer
s = ord_u(data)
# initialize the matrix
res = np.zeros((s.max()+1,s.max()+1))
# use advanced indexing for calculating the matrix
res[s[:,0],s[:,1]] = 1
# symmetrize the matrix
res_sym = res + res.T
np.fill_diagonal(res_sym,1)
res_sym
#array([[ 1., 1., 1., 0., 0., 1.],
# [ 1., 1., 0., 1., 0., 0.],
# [ 1., 0., 1., 1., 0., 1.],
# [ 0., 1., 1., 1., 1., 0.],
# [ 0., 0., 0., 1., 1., 1.],
# [ 1., 0., 1., 0., 1., 1.]])
这是一种 NumPythonic 方法 -
# Tag each string with a numeric ID based on the uniqueness among other strings
_,ID = np.unique(graph,return_inverse=True)
M = ID.reshape(graph.shape)
# Consider each row of numeric IDs as an indexing tuple of a 2D array.
# Calculate the linear indices corresponding to each tuple.
n = M.max()+1
idx1 = M[:,0]*(n) + M[:,1]
idx2 = M[:,1]*(n) + M[:,0]
# Setup output array with 1s on the diagonal.
out = np.eye(n,dtype=int)
# Put 1s at places specified by the earlier computed pairs of linear indices
np.put(out,[idx1,idx2],1)
样本输入、输出-
In [93]: graph
Out[93]:
array([['A', 'B'],
['A', 'C'],
['A', 'F'],
['B', 'D'],
['C', 'D'],
['C', 'F'],
['E', 'D'],
['E', 'F']],
dtype='|S4')
In [94]: out
Out[94]:
array([[1, 1, 1, 0, 0, 1],
[1, 1, 0, 1, 0, 0],
[1, 0, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 0],
[0, 0, 0, 1, 1, 1],
[1, 0, 1, 0, 1, 1]])
运行时测试
本节基本上比较了迄今为止列出的所有方法(在此 post 和其他 post 中)解决案例的性能。
定义函数-
def networkx_based(a): #@atomh33ls's solution code
A=nx.Graph()
for x in a:
A.add_edge(x[0],x[1])
return nx.adjacency_matrix(A)
def vectorize_based(data): # @plonser's solution code
ord_u = np.vectorize(lambda x: ord(x)-65)
s = ord_u(data)
res = np.zeros((s.max()+1,s.max()+1),dtype=int)
res[s[:,0],s[:,1]] = 1
res_sym = res + res.T
np.fill_diagonal(res_sym,1)
return res_sym
def unique_based(graph): #solution from this post
_,ID = np.unique(graph,return_inverse=True)
M = ID.reshape(graph.shape)
n = M.max()+1
idx1 = M[:,0]*(n) + M[:,1]
idx2 = M[:,1]*(n) + M[:,0]
out = np.eye(n,dtype=int)
np.put(out,[idx1,idx2],1)
return out
计时 -
In [321]: N = 1000
In [322]: arr = np.random.randint(65,90,(N,2))
In [323]: graph = np.array([map(chr,a) for a in arr],dtype='S4')
In [324]: %timeit networkx_based(graph)
100 loops, best of 3: 3.79 ms per loop
In [325]: %timeit vectorize_based(graph)
1000 loops, best of 3: 760 µs per loop
In [326]: %timeit unique_based(graph)
1000 loops, best of 3: 405 µs per loop
我有一个无向图:
A,B
A,C
A,F
B,D
C,D
C,F
E,D
E,F
我需要将其转换为矩阵 N,其中
N[x,y] = 1 if x is neighbor of y (like A and B) and 0 if not
最快的方法是什么?
NOTA 该图存储为 numpy 字符串数组:
array([['A', 'B'],
['A', 'C'],
['A', 'F'],
['B', 'D'],
['C', 'D'],
['C', 'F'],
['E', 'D'],
['E', 'F']],
dtype='|S4')
预期输出:
A B C D E F
A 1 1 1 0 0 1
B 1 1 0 1 0 0
C 1 0 1 1 0 1
D 0 1 1 1 1 0
E 0 0 0 1 1 1
F 1 0 1 0 1 1
谢谢
使用networkx:
A=nx.Graph()
for x in a:
A.add_edge(x[0],x[1])
ad =nx.adjacency_matrix(A)
给予
matrix([[ 0., 1., 1., 0., 0., 1.],
[ 1., 0., 0., 0., 1., 1.],
[ 1., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 1., 1.],
[ 0., 1., 1., 1., 0., 0.],
[ 1., 1., 0., 1., 0., 0.]])
建议fill_diagonal
获取您预期输出中的自链接:
np.fill_diagonal(ad,1)
matrix([[ 1., 1., 1., 0., 0., 1.],
[ 1., 1., 0., 1., 0., 0.],
[ 1., 0., 1., 1., 0., 1.],
[ 0., 1., 1., 1., 1., 0.],
[ 0., 0., 0., 1., 1., 1.],
[ 1., 0., 1., 0., 1., 1.]])
纯粹的numpy
我会做类似
import numpy as np
data = np.array([['A', 'B'],
['A', 'C'],
['A', 'F'],
['B', 'D'],
['C', 'D'],
['C', 'F'],
['E', 'D'],
['E', 'F']], dtype='|S4')
# define vectorized mapping from letters to integers
ord_u = np.vectorize(lambda x: ord(x)-65)
# map entities of data to integer
s = ord_u(data)
# initialize the matrix
res = np.zeros((s.max()+1,s.max()+1))
# use advanced indexing for calculating the matrix
res[s[:,0],s[:,1]] = 1
# symmetrize the matrix
res_sym = res + res.T
np.fill_diagonal(res_sym,1)
res_sym
#array([[ 1., 1., 1., 0., 0., 1.],
# [ 1., 1., 0., 1., 0., 0.],
# [ 1., 0., 1., 1., 0., 1.],
# [ 0., 1., 1., 1., 1., 0.],
# [ 0., 0., 0., 1., 1., 1.],
# [ 1., 0., 1., 0., 1., 1.]])
这是一种 NumPythonic 方法 -
# Tag each string with a numeric ID based on the uniqueness among other strings
_,ID = np.unique(graph,return_inverse=True)
M = ID.reshape(graph.shape)
# Consider each row of numeric IDs as an indexing tuple of a 2D array.
# Calculate the linear indices corresponding to each tuple.
n = M.max()+1
idx1 = M[:,0]*(n) + M[:,1]
idx2 = M[:,1]*(n) + M[:,0]
# Setup output array with 1s on the diagonal.
out = np.eye(n,dtype=int)
# Put 1s at places specified by the earlier computed pairs of linear indices
np.put(out,[idx1,idx2],1)
样本输入、输出-
In [93]: graph
Out[93]:
array([['A', 'B'],
['A', 'C'],
['A', 'F'],
['B', 'D'],
['C', 'D'],
['C', 'F'],
['E', 'D'],
['E', 'F']],
dtype='|S4')
In [94]: out
Out[94]:
array([[1, 1, 1, 0, 0, 1],
[1, 1, 0, 1, 0, 0],
[1, 0, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 0],
[0, 0, 0, 1, 1, 1],
[1, 0, 1, 0, 1, 1]])
运行时测试
本节基本上比较了迄今为止列出的所有方法(在此 post 和其他 post 中)解决案例的性能。
定义函数-
def networkx_based(a): #@atomh33ls's solution code
A=nx.Graph()
for x in a:
A.add_edge(x[0],x[1])
return nx.adjacency_matrix(A)
def vectorize_based(data): # @plonser's solution code
ord_u = np.vectorize(lambda x: ord(x)-65)
s = ord_u(data)
res = np.zeros((s.max()+1,s.max()+1),dtype=int)
res[s[:,0],s[:,1]] = 1
res_sym = res + res.T
np.fill_diagonal(res_sym,1)
return res_sym
def unique_based(graph): #solution from this post
_,ID = np.unique(graph,return_inverse=True)
M = ID.reshape(graph.shape)
n = M.max()+1
idx1 = M[:,0]*(n) + M[:,1]
idx2 = M[:,1]*(n) + M[:,0]
out = np.eye(n,dtype=int)
np.put(out,[idx1,idx2],1)
return out
计时 -
In [321]: N = 1000
In [322]: arr = np.random.randint(65,90,(N,2))
In [323]: graph = np.array([map(chr,a) for a in arr],dtype='S4')
In [324]: %timeit networkx_based(graph)
100 loops, best of 3: 3.79 ms per loop
In [325]: %timeit vectorize_based(graph)
1000 loops, best of 3: 760 µs per loop
In [326]: %timeit unique_based(graph)
1000 loops, best of 3: 405 µs per loop