如何使用 numpy 配对 (x,y) 对

Question

我有 2 个 2D NumPy 数组，每个数组由大约 300,000 (x,y) 对组成。给定数组 A 中的 i^th (x, y) 对，我需要找到对应的 j^th (x,y) 对数组 B 这样 ([x_i - x_j ]² + [y_i - y_j]²)^½，两个 (x,y) 对之间的距离最小化。

我目前正在做的是 argmin 像这样：

thickness = []
for i in range(len(A)):
    xi = A[i][0]
    yi = A[i][1]
    idx = (np.sqrt(np.power(B[:, 0] - xi, 2) + np.power(B[:, 1] - yi, 2))).argmin()
    thickness.append([(xi + B[idx][0]) / 2, (yi + B[idx][1]) / 2,
                      A[i][2] + B[idx][2]])

有没有更快的方法来完成这个？

Answer 1

您可以使用 numpy broadcast 来查找 B 中具有最小距离的索引，如下所示。广播距离是在A和B的每一行之间计算的

A = np.random.rand(100,3)
B = np.random.rand(90,3)

diff = A[:, np.newaxis, :2] - B[:,:2]
distance = np.sum(diff**2, axis=2)
indx = np.argmin(distance, axis=1)
result = (A+B[indx])/2
result

Answer 2

最快的解决方案将要求您根据数组 B 到原点的距离对数组 B 进行排序。您需要将数组 B 分成每个象限的 4 个 hastable。您将需要找到 A[Xi-Nx] [Yi-Ny] 的索引，该索引距离原点到我们的点 B(X,Y) 最近点的每个 quadrant.Most在 A 中到我们在 B 中的点将位于同一象限。对于最近点是不同象限的情况，您需要按照到当前象限的距离顺序遍历点。如果您只需要找到到一个点的距离，您可以根据到 A 中初始点的距离从最小到最大对数组 B 进行排序。这意味着最小距离将始终位于索引 0。此解决方案将运行的 N(LOG(N)) 复杂度，这基本上是排序的成本。

import numpy as np
def quadrant(point): 
    x= point[0]
    y = point [1]
    if (x > 0 and y > 0): 
        return 1

    elif (x < 0 and y > 0): 
        return 2

    elif (x < 0 and y < 0): 
        return 3

    elif (x > 0 and y < 0): 
       return 4



    else: 
       return 0


def calculateDistance(point1,point2):
    #calculates the distance sqr
    x1 = point1[0]  
    y1 = point1[1]
    x2 = point2[0]  
    y2 = point2[1]
    dist = ((x2 - x1)**2 + (y2 - y1)**2)  
    return dist  

A = np.random.rand(10,2)
B = np.random.rand(10,2)

#sort b By distance to the origin and put it in sub_Quadran array -- 1 array per quadran
origin = [0,0]     
q1= {}  
q2= {}  
q3= {}  
q4= {}    


def quadrant(point): 
    x= point[0]
    y = point [1]
    if (x > 0 and y > 0): 
       q1 [calculateDistance(origin,point)] = point


    elif (x < 0 and y > 0): 
        q2 [calculateDistance(origin,point)] = point

    elif (x < 0 and y < 0): 
      q3  [calculateDistance(origin,point)] = point

    elif (x > 0 and y < 0): 
       q4  [calculateDistance(origin,point)] = point



    else: 
       print ("Point is the origin")




def find_near(point):
#for now assume point is on first quadrant
  dist = calculateDistance(point,origin) # get  to origin distance sqr
  # =find the point with closses distance this can be more efficient 

  idx=  min(q1_keys, key=lambda x:abs(x-dist))
  return q1[idx]




for  point in B:
    quadrant(point)


q1_keys = list()
q2_keys = list()
q3_keys = list()
q4_keys = list()
for i in q1.keys():
    q1_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q1_keys.sort()

for i in q2.keys():
    q2_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q2_keys.sort()

for i in q3.keys():
    q3_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q3_keys.sort()

for i in q4.keys():
    q4_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q4_keys.sort()



b_1 = [0.01, 0.01]   

q1_keys.sort() #point sorted as closest to the origin
print (q1)   
print(q1_keys)   
print (find_near(b_1))

` 另一种方案是将数组B拆成多个子数组，在那些Bsub_arrays中求A中输入点的距离。请参阅下面 link 中的算法详细信息： https://courses.cs.washington.edu/courses/cse421/11su/slides/05dc.pdf

Answer 3

我们可以使用 Cython-powered kd-tree for quick nearest-neighbor lookup 来获得最近的邻居，从而实现我们想要的输出，就像这样 -

from scipy.spatial import cKDTree

idx = cKDTree(B[:,:2]).query(A[:,:2], k=1)[1]
thickness = [(A[:,0] + B[idx,0]) / 2, (A[:,1] + B[idx,1]) / 2, A[:,2] + B[idx,2]]

如何使用 numpy 配对 (x,y) 对

How to pair (x,y) pairs using numpy

python

arrays

search

numpy

bisection