如何使用 numpy 配对 (x,y) 对
How to pair (x,y) pairs using numpy
我有 2 个 2D NumPy 数组,每个数组由大约 300,000 (x,y) 对组成。给定数组 A 中的 ith (x, y) 对,我需要找到对应的 jth (x,y) 对数组 B 这样 ([xi - xj ]2 + [yi - yj]2)½,两个 (x,y) 对之间的距离最小化。
我目前正在做的是 argmin
像这样:
thickness = []
for i in range(len(A)):
xi = A[i][0]
yi = A[i][1]
idx = (np.sqrt(np.power(B[:, 0] - xi, 2) + np.power(B[:, 1] - yi, 2))).argmin()
thickness.append([(xi + B[idx][0]) / 2, (yi + B[idx][1]) / 2,
A[i][2] + B[idx][2]])
有没有更快的方法来完成这个?
您可以使用 numpy broadcast 来查找 B 中具有最小距离的索引,如下所示。广播距离是在A和B的每一行之间计算的
A = np.random.rand(100,3)
B = np.random.rand(90,3)
diff = A[:, np.newaxis, :2] - B[:,:2]
distance = np.sum(diff**2, axis=2)
indx = np.argmin(distance, axis=1)
result = (A+B[indx])/2
result
最快的解决方案将要求您根据数组 B 到原点的距离对数组 B 进行排序。您需要将数组 B 分成每个象限的 4 个 hastable。
您将需要找到 A[Xi-Nx] [Yi-Ny] 的索引,该索引距离原点到我们的点 B(X,Y) 最近点的每个 quadrant.Most在 A 中到我们在 B 中的点将位于同一象限。对于最近点是不同象限的情况,您需要按照到当前象限的距离顺序遍历点。如果您只需要找到到一个点的距离,您可以根据到 A 中初始点的距离从最小到最大对数组 B 进行排序。这意味着最小距离将始终位于索引 0。此解决方案将 运行 的 N(LOG(N)) 复杂度,这基本上是排序的成本。
import numpy as np
def quadrant(point):
x= point[0]
y = point [1]
if (x > 0 and y > 0):
return 1
elif (x < 0 and y > 0):
return 2
elif (x < 0 and y < 0):
return 3
elif (x > 0 and y < 0):
return 4
else:
return 0
def calculateDistance(point1,point2):
#calculates the distance sqr
x1 = point1[0]
y1 = point1[1]
x2 = point2[0]
y2 = point2[1]
dist = ((x2 - x1)**2 + (y2 - y1)**2)
return dist
A = np.random.rand(10,2)
B = np.random.rand(10,2)
#sort b By distance to the origin and put it in sub_Quadran array -- 1 array per quadran
origin = [0,0]
q1= {}
q2= {}
q3= {}
q4= {}
def quadrant(point):
x= point[0]
y = point [1]
if (x > 0 and y > 0):
q1 [calculateDistance(origin,point)] = point
elif (x < 0 and y > 0):
q2 [calculateDistance(origin,point)] = point
elif (x < 0 and y < 0):
q3 [calculateDistance(origin,point)] = point
elif (x > 0 and y < 0):
q4 [calculateDistance(origin,point)] = point
else:
print ("Point is the origin")
def find_near(point):
#for now assume point is on first quadrant
dist = calculateDistance(point,origin) # get to origin distance sqr
# =find the point with closses distance this can be more efficient
idx= min(q1_keys, key=lambda x:abs(x-dist))
return q1[idx]
for point in B:
quadrant(point)
q1_keys = list()
q2_keys = list()
q3_keys = list()
q4_keys = list()
for i in q1.keys():
q1_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q1_keys.sort()
for i in q2.keys():
q2_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q2_keys.sort()
for i in q3.keys():
q3_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q3_keys.sort()
for i in q4.keys():
q4_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q4_keys.sort()
b_1 = [0.01, 0.01]
q1_keys.sort() #point sorted as closest to the origin
print (q1)
print(q1_keys)
print (find_near(b_1))
`
另一种方案是将数组B拆成多个子数组,在那些Bsub_arrays中求A中输入点的距离。请参阅下面 link 中的算法详细信息:
https://courses.cs.washington.edu/courses/cse421/11su/slides/05dc.pdf
我们可以使用 Cython-powered kd-tree
for quick nearest-neighbor lookup 来获得最近的邻居,从而实现我们想要的输出,就像这样 -
from scipy.spatial import cKDTree
idx = cKDTree(B[:,:2]).query(A[:,:2], k=1)[1]
thickness = [(A[:,0] + B[idx,0]) / 2, (A[:,1] + B[idx,1]) / 2, A[:,2] + B[idx,2]]
我有 2 个 2D NumPy 数组,每个数组由大约 300,000 (x,y) 对组成。给定数组 A 中的 ith (x, y) 对,我需要找到对应的 jth (x,y) 对数组 B 这样 ([xi - xj ]2 + [yi - yj]2)½,两个 (x,y) 对之间的距离最小化。
我目前正在做的是 argmin
像这样:
thickness = []
for i in range(len(A)):
xi = A[i][0]
yi = A[i][1]
idx = (np.sqrt(np.power(B[:, 0] - xi, 2) + np.power(B[:, 1] - yi, 2))).argmin()
thickness.append([(xi + B[idx][0]) / 2, (yi + B[idx][1]) / 2,
A[i][2] + B[idx][2]])
有没有更快的方法来完成这个?
您可以使用 numpy broadcast 来查找 B 中具有最小距离的索引,如下所示。广播距离是在A和B的每一行之间计算的
A = np.random.rand(100,3)
B = np.random.rand(90,3)
diff = A[:, np.newaxis, :2] - B[:,:2]
distance = np.sum(diff**2, axis=2)
indx = np.argmin(distance, axis=1)
result = (A+B[indx])/2
result
最快的解决方案将要求您根据数组 B 到原点的距离对数组 B 进行排序。您需要将数组 B 分成每个象限的 4 个 hastable。 您将需要找到 A[Xi-Nx] [Yi-Ny] 的索引,该索引距离原点到我们的点 B(X,Y) 最近点的每个 quadrant.Most在 A 中到我们在 B 中的点将位于同一象限。对于最近点是不同象限的情况,您需要按照到当前象限的距离顺序遍历点。如果您只需要找到到一个点的距离,您可以根据到 A 中初始点的距离从最小到最大对数组 B 进行排序。这意味着最小距离将始终位于索引 0。此解决方案将 运行 的 N(LOG(N)) 复杂度,这基本上是排序的成本。
import numpy as np
def quadrant(point):
x= point[0]
y = point [1]
if (x > 0 and y > 0):
return 1
elif (x < 0 and y > 0):
return 2
elif (x < 0 and y < 0):
return 3
elif (x > 0 and y < 0):
return 4
else:
return 0
def calculateDistance(point1,point2):
#calculates the distance sqr
x1 = point1[0]
y1 = point1[1]
x2 = point2[0]
y2 = point2[1]
dist = ((x2 - x1)**2 + (y2 - y1)**2)
return dist
A = np.random.rand(10,2)
B = np.random.rand(10,2)
#sort b By distance to the origin and put it in sub_Quadran array -- 1 array per quadran
origin = [0,0]
q1= {}
q2= {}
q3= {}
q4= {}
def quadrant(point):
x= point[0]
y = point [1]
if (x > 0 and y > 0):
q1 [calculateDistance(origin,point)] = point
elif (x < 0 and y > 0):
q2 [calculateDistance(origin,point)] = point
elif (x < 0 and y < 0):
q3 [calculateDistance(origin,point)] = point
elif (x > 0 and y < 0):
q4 [calculateDistance(origin,point)] = point
else:
print ("Point is the origin")
def find_near(point):
#for now assume point is on first quadrant
dist = calculateDistance(point,origin) # get to origin distance sqr
# =find the point with closses distance this can be more efficient
idx= min(q1_keys, key=lambda x:abs(x-dist))
return q1[idx]
for point in B:
quadrant(point)
q1_keys = list()
q2_keys = list()
q3_keys = list()
q4_keys = list()
for i in q1.keys():
q1_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q1_keys.sort()
for i in q2.keys():
q2_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q2_keys.sort()
for i in q3.keys():
q3_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q3_keys.sort()
for i in q4.keys():
q4_keys.append(i)
#Sorting can be done in the step bellow but done here to speedpup implementation
q4_keys.sort()
b_1 = [0.01, 0.01]
q1_keys.sort() #point sorted as closest to the origin
print (q1)
print(q1_keys)
print (find_near(b_1))
` 另一种方案是将数组B拆成多个子数组,在那些Bsub_arrays中求A中输入点的距离。请参阅下面 link 中的算法详细信息: https://courses.cs.washington.edu/courses/cse421/11su/slides/05dc.pdf
我们可以使用 Cython-powered kd-tree
for quick nearest-neighbor lookup 来获得最近的邻居,从而实现我们想要的输出,就像这样 -
from scipy.spatial import cKDTree
idx = cKDTree(B[:,:2]).query(A[:,:2], k=1)[1]
thickness = [(A[:,0] + B[idx,0]) / 2, (A[:,1] + B[idx,1]) / 2, A[:,2] + B[idx,2]]