einsum and distance calculations
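I have searched for a solution that uses einsum to compute distances between numpy arrays with an unequal number of rows but an equal number of columns. I have tried various combinations, but the only way I could get it to work is with the following code. I am obviously missing something, and the literature and numerous threads have not brought me any closer to a solution. I would appreciate a generalization that handles any number of origins against any number of destinations. I am only working with 2D arrays and have no intention of extending this to other dimensions. I am also familiar with pdist and cdist and other ways of reaching the result I want; however, I am only interested in einsum, since I want to round out my collection of examples. Any help would be appreciated.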
import numpy as np
origs = np.array([[0.,0.],[1.,0.],[0.,1.],[1.,1.]])
dests = np.asarray([[4.,0.],[1.,1.],[2.,2.],[2.,3.],[0.,5.]])
for i in origs:
    d = np.sqrt(np.einsum("ij,ij->i", i-dests, i-dests))
    print("orig {}...dist: {}".format(i,d))
The following results are what I am looking for...
orig [ 0. 0.]...dist: [ 4. 1.41421356 2.82842712 3.60555128 5. ]
orig [ 1. 0.]...dist: [ 3. 1. 2.23606798 3.16227766 5.09901951]
orig [ 0. 1.]...dist: [ 4.12310563 1. 2.23606798 2.82842712 4. ]
orig [ 1. 1.]...dist: [ 3.16227766 0. 1.41421356 2.23606798 4.12310563]
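If I understand the question correctly, the for-loop code you posted already looks generic to me as long as only 2D arrays are involved. Now, if you want a generic vectorized solution with a single call to np.einsum, you could bring in broadcasting, like so -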
d_all = np.sqrt(np.einsum('ijk->ij',(origs[:,None,:] - dests)**2))
Sample run -
In [85]: origs = np.array([[0.,0.],[1.,0.],[0.,1.],[1.,1.]])
    ...: dests = np.asarray([[4.,0.],[1.,1.],[2.,2.],[2.,3.],[0.,5.]])
    ...:
In [86]: for i in origs:
    ...:     d = np.sqrt(np.einsum("ij,ij->i", i-dests, i-dests))
    ...:     print(d)
    ...:
[ 4. 1.41421356 2.82842712 3.60555128 5. ]
[ 3. 1. 2.23606798 3.16227766 5.09901951]
[ 4.12310563 1. 2.23606798 2.82842712 4. ]
[ 3.16227766 0. 1.41421356 2.23606798 4.12310563]
In [87]: np.sqrt(np.einsum('ijk->ij',(origs[:,None,:] - dests)**2))
Out[87]:
array([[ 4. , 1.41421356, 2.82842712, 3.60555128, 5. ],
[ 3. , 1. , 2.23606798, 3.16227766, 5.09901951],
[ 4.12310563, 1. , 2.23606798, 2.82842712, 4. ],
[ 3.16227766, 0. , 1.41421356, 2.23606798, 4.12310563]])
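To make the broadcasting step easier to follow, here is a minimal sketch that prints the intermediate shapes for the same sample data. The names expanded and diffs are only illustrative and do not appear in the code above.

```python
import numpy as np

origs = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
dests = np.array([[4., 0.], [1., 1.], [2., 2.], [2., 3.], [0., 5.]])

# Insert a length-1 axis so the shapes line up as (4, 1, 2) and (5, 2).
expanded = origs[:, None, :]
# Broadcasting stretches the length-1 axis over the 5 destinations.
diffs = expanded - dests

print(expanded.shape)  # (4, 1, 2)
print(diffs.shape)     # (4, 5, 2)

# 'ijk->ij' sums over the last axis k (the coordinates), leaving one
# squared distance per (orig, dest) pair.
d_all = np.sqrt(np.einsum('ijk->ij', diffs**2))
print(d_all.shape)     # (4, 5)
```

Row i of d_all corresponds to origs[i] and column j to dests[j], so the result scales to any number of origins and destinations as long as the column counts match.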
You could also let np.einsum perform the squaring itself, like so -
subts = origs[:,None,:] - dests
d_all = np.sqrt(np.einsum('ijk,ijk->ij',subts,subts))
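As a quick sanity check, the sketch below compares the two einsum forms against the original row-by-row loop on the sample data. The variable names d_outside, d_inside and d_loop are made up for this illustration.

```python
import numpy as np

origs = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
dests = np.array([[4., 0.], [1., 1.], [2., 2.], [2., 3.], [0., 5.]])

subts = origs[:, None, :] - dests

# Squaring outside einsum, then summing over the coordinate axis k.
d_outside = np.sqrt(np.einsum('ijk->ij', subts**2))

# Letting einsum multiply subts by itself element-wise and sum over k.
d_inside = np.sqrt(np.einsum('ijk,ijk->ij', subts, subts))

# The original loop over origins, stacked into a 2D array.
d_loop = np.array([np.sqrt(np.einsum('ij,ij->i', o - dests, o - dests))
                   for o in origs])

print(np.allclose(d_outside, d_inside))  # True
print(np.allclose(d_inside, d_loop))     # True
```

Repeating the operand with identical subscripts ('ijk,ijk') multiplies matching elements, and dropping k from the output sums those products, which is the sum of squares along the last axis.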
Here's a runtime test comparing it against the earlier approach, in which the squaring was done outside np.einsum -
In [7]: def all_einsum(origs,dests):
   ...:     subts = origs[:,None,:] - dests
   ...:     return np.sqrt(np.einsum('ijk,ijk->ij',subts,subts))
   ...:
   ...: def partial_einsum(origs,dests):
   ...:     return np.sqrt(np.einsum('ijk->ij',(origs[:,None,:] - dests)**2))
   ...:
In [8]: origs = np.random.rand(400,100)
In [9]: dests = np.random.rand(500,100)
In [10]: %timeit all_einsum(origs,dests)
10 loops, best of 3: 139 ms per loop
In [11]: %timeit partial_einsum(origs,dests)
1 loops, best of 3: 251 ms per loop
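For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard-library timeit module. The array sizes are an assumption, chosen smaller than in the session above to keep the run short, and the numbers will vary with the machine and the NumPy version.

```python
import timeit

import numpy as np


def all_einsum(origs, dests):
    # Squaring and summing both happen inside einsum.
    subts = origs[:, None, :] - dests
    return np.sqrt(np.einsum('ijk,ijk->ij', subts, subts))


def partial_einsum(origs, dests):
    # Squaring happens outside einsum; einsum only sums over k.
    return np.sqrt(np.einsum('ijk->ij', (origs[:, None, :] - dests)**2))


# Smaller than the (400, 100) and (500, 100) arrays timed above.
rng = np.random.default_rng(0)
origs = rng.random((100, 50))
dests = rng.random((120, 50))

for fn in (all_einsum, partial_einsum):
    # Best of 3 repeats, 5 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(origs, dests), number=5, repeat=3)) / 5
    print("{}: {:.3f} ms per call".format(fn.__name__, best * 1e3))
```

Both functions build the full (n_origs, n_dests, n_cols) difference array, so memory use grows with the product of all three sizes.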