cKDTree vs dsearchn
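I have two arrays (A, B) containing ID, x, y, z for the same number of points, with slightly different coordinates.
I want an array in which each row holds the ID, x, y, z of the two nearest points from the two arrays.
At the moment I have this: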
import numpy as np
from scipy.spatial import cKDTree
A = np.loadtxt('A.txt')
B = np.loadtxt('B.txt')
tree = cKDTree( B[:,[1,2,3]] )
d, inds = tree.query( A[:,[1,2,3]], k=1, p=np.inf, eps=0.0)
A_new = A[inds]
xyz_near = np.hstack(( B[:,0:4], A_new[:,0:4] ))
But the array xyz_near does not contain the correct pairs (IDB xB yB zB IDA xA yA zA):
12587 18.0445 0.0784 -7.4705 3422 18.0444 0.0403 -7.4639
12588 18.0447 0.0783 -7.2231 3421 18.0446 0.0402 -7.2164
12589 18.0450 0.0781 -6.9756 7764 18.0461 0.0400 -5.9785
12590 18.0452 0.0779 -6.7281 7765 18.0464 0.0399 -5.7310
12591 18.0454 0.0777 -6.4805 7766 18.0467 0.0399 -5.4835
12592 18.0457 0.0775 -6.2329 7767 18.0470 0.0398 -5.2359
12593 18.0459 0.0773 -5.9852 7768 18.0473 0.0398 -4.9884
As you can see, the first two rows are correct, but the rows after them are not.
If I do the same thing in MATLAB with dsearchn (IDB xB yB zB IDA xA yA zA):
12587 18.0445 0.0784 -7.4705 3422 18.0444 0.0403 -7.4639
12588 18.0447 0.0783 -7.2231 3421 18.0446 0.0402 -7.2164
12589 18.0450 0.0781 -6.9756 3420 18.0448 0.0402 -6.9688
12590 18.0452 0.0779 -6.7281 3419 18.0450 0.0401 -6.7212
12591 18.0454 0.0777 -6.4805 3418 18.0453 0.0401 -6.4737
12592 18.0457 0.0775 -6.2329 3417 18.0455 0.0400 -6.2261
12593 18.0459 0.0773 -5.9852 3416 18.0458 0.0400 -5.9785
which is correct.
I tried changing p to 1, 2 and np.inf, but that gave the same result.
Files:
A.txt: http://pasted.co/8c5b6156
B.txt: http://pasted.co/28a228e6
Thanks
Update:
Even with the fix suggested by ergo_, I get:
12587 18.0445 0.0784 -7.4705 7758 18.0448 0.0403 -7.4639
12587 18.0445 0.0784 -7.4705 3422 18.0444 0.0403 -7.4639
12588 18.0447 0.0783 -7.2231 3421 18.0446 0.0402 -7.2164
12588 18.0447 0.0783 -7.2231 7759 18.0450 0.0402 -7.2163
12589 18.0450 0.0781 -6.9756 7760 18.0452 0.0402 -6.9688
12589 18.0450 0.0781 -6.9756 3420 18.0448 0.0402 -6.9688
12590 18.0452 0.0779 -6.7281 3419 18.0450 0.0401 -6.7212
12590 18.0452 0.0779 -6.7281 7761 18.0454 0.0401 -6.7212
12591 18.0454 0.0777 -6.4805 7762 18.0456 0.0401 -6.4736
12591 18.0454 0.0777 -6.4805 3418 18.0453 0.0401 -6.4737
So it considers the same points multiple times.
You can verify that cKDTree gives the correct result.
Here, for the question "for each point in A, which point in B is the closest":
import numpy as np
from scipy.spatial import cKDTree
A = np.loadtxt('A.txt')
B = np.loadtxt('B.txt')
tree = cKDTree( B[:,[1,2,3]] )
d, inds = tree.query( A[:,[1,2,3]], k=1, p=2)
B_new = B[inds]
xyz_near = np.hstack(( B_new[:,0:4], A[:,0:4] ))
for j, a in enumerate(A):
    # compute 2-norms from each point in B to a
    dd = np.sqrt(((a[1:] - B[:,1:])**2).sum(axis=1))
    # find closest point
    jx = np.argmin(dd)
    # check solution
    assert inds[j] == jx
    assert np.allclose(d[j], dd.min())
    # check it is unique
    assert (dd[jx+1:] > d[j]).all()
    assert (dd[:jx] > d[j]).all()
print("All OK")
The solution is also unique, as the checks above show.
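To make the indexing direction concrete, here is a minimal sketch on made-up data (the IDs and coordinates below are illustrative stand-ins, not the contents of A.txt or B.txt). Since the tree is built on B and queried with A, inds[j] is a row number in B, so it is B that gets reordered, not A:

```python
import numpy as np
from scipy.spatial import cKDTree

# Synthetic stand-ins for B.txt and A.txt (columns: ID, x, y, z).
B = np.array([[10, 0.0, 0.0, 0.0],
              [11, 1.0, 0.0, 0.0],
              [12, 2.0, 0.0, 0.0]])
A = np.array([[20, 2.1, 0.0, 0.0],
              [21, 0.1, 0.0, 0.0],
              [22, 1.1, 0.0, 0.0]])

tree = cKDTree(B[:, 1:4])                   # tree over the B coordinates
d, inds = tree.query(A[:, 1:4], k=1, p=2)   # one result per row of A

# inds[j] is the row of B closest to A[j], so pair B[inds] with A.
pairs = np.hstack((B[inds], A))
print(pairs[:, [0, 4]])  # ID of the B point next to the ID of its A point
```

Using A[inds] next to B instead pairs rows that have nothing to do with each other, unless the two files happen to be in the same order.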
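If, on the other hand, you want a one-to-one mapping, that is a different problem, addressed in
finding nearest items across two lists/arrays in Python
However, I don't think dsearchn gives you that answer either.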
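As a rough illustration of how a one-to-one mapping differs from a plain nearest-neighbour query, here is a sketch that uses scipy.optimize.linear_sum_assignment on a dense distance matrix. This is only one possible approach and not necessarily the one from the linked question; the coordinates are made up, and the dense matrix may be too large for big point sets:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

# Synthetic coordinates: both A points are closest to the first B point.
B_xyz = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0]])
A_xyz = np.array([[0.1, 0.0, 0.0],
                  [0.2, 0.0, 0.0]])

cost = cdist(A_xyz, B_xyz)      # cost[i, j] = distance from A[i] to B[j]

# Plain nearest neighbour: the same B row may be picked more than once.
nearest = cost.argmin(axis=1)

# One-to-one assignment that minimises the total distance.
rows, cols = linear_sum_assignment(cost)

print(nearest)  # B row chosen independently for each A point
print(cols)     # B row assigned to each A point, no repeats
```

In this toy case the nearest-neighbour query assigns B row 0 to both A points, while the assignment gives each A point its own B row. That is the kind of repetition seen in the update to the question.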
A version of Octave's or MATLAB's dsearchn without triangulation could look like this numpy / python code:
import numpy as np

def dsearchn(x, y):
    """
    Implement Octave / Matlab dsearchn without triangulation
    :param x: Points to search in
    :param y: Query points
    :return: for each point of y, the index of the point of x closest to it
    """
    IDX = []
    for line in range(y.shape[0]):
        # Euclidean distance from query point y[line] to every point in x
        distances = np.sqrt(np.sum(np.power(x - y[line, :], 2), axis=1))
        # boolean mask of the points at minimal distance; keep the first one
        found_min_dist_ind = (np.min(distances, axis=0) == distances)
        length = found_min_dist_ind.shape[0]
        IDX.append(np.array(range(length))[found_min_dist_ind][0])
    return np.array(IDX)
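The same brute-force search can also be written without the Python loop. The sketch below is a vectorised variant (the name dsearchn_bruteforce and the random test data are made up for illustration) and compares its result with cKDTree on the same points. It builds a len(y) by len(x) distance table, so it only suits moderately sized inputs:

```python
import numpy as np
from scipy.spatial import cKDTree

def dsearchn_bruteforce(x, y):
    """For each row of y, return the index of the closest row of x."""
    # diff[i, j, :] = y[i] - x[j]; shape (len(y), len(x), ndim)
    diff = y[:, None, :] - x[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)   # squared Euclidean distances
    # argmin returns the first minimum in each row
    return dist2.argmin(axis=1)

rng = np.random.default_rng(0)
x = rng.random((200, 3))   # points to search in
y = rng.random((50, 3))    # query points

idx = dsearchn_bruteforce(x, y)
d_tree, idx_tree = cKDTree(x).query(y, k=1, p=2)

# Distances to the brute-force neighbours should match the tree's distances.
print(np.allclose(np.linalg.norm(y - x[idx], axis=1), d_tree))
```

Comparing distances rather than indices keeps the check meaningful even if two points of x were exactly equidistant from a query point, where the two methods would be free to pick different indices.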
Try this code. It produces the same results as the MATLAB method "dsearchn(P,T,PQ)" with triangulation.
import numpy as np
from scipy.spatial import cKDTree

# xy=[[x1,y1]...[xm,ym]]
# XY=[[X1,Y1]...[Xm,Ym]]
tree = cKDTree(xy[:, 1:])
dd, ii = tree.query(XY, k=2, p=2, eps=0.0)
Z = []
for i in range(len(dd)):
    min_dd = min(dd[i])
    min_dd_idx = np.where(dd[i] == min_dd)[0]
    if len(min_dd_idx) > 1:
        # tie between the two nearest neighbours: take the larger index
        sorted_ii = np.sort(ii[i][min_dd_idx])
        Z.append(sorted_ii[len(min_dd_idx) - 1])
    else:
        Z.append(ii[i][0])