如何从 numpy 数组中删除重叠块?
How to remove overlapping blocks from numpy array?
我正在使用 cv2.goodFeaturesToTrack
函数来查找图像中的特征点。最终目标是提取一定大小的方块,特征点是这些块的中心。
然而,很多特征点彼此靠近,所以块重叠,这不是我想要的。
这是所有特征点(中心)的例子:
array([[3536., 1419.],
[2976., 1024.],
[3504., 1400.],
[3574., 1505.],
[3672., 1453.],
[3671., 1442.],
[3489., 1429.],
[3108., 737.]])
假设我想找到具有 blockRadius = 400
且 不 重叠的第一个 n
块。关于如何实现这一点有什么想法吗?
你需要一些迭代来做到这一点,因为像这样的经常性辍学是不可向量化的。我认为这样的东西会起作用
from scipy.spatial.distance import pdist, squareform
c = np.array([[3536., 1419.],
[2976., 1024.],
[3504., 1400.],
[3574., 1505.],
[3672., 1453.],
[3671., 1442.],
[3489., 1429.],
[3108., 737.]])
dists = squareform(pdist(c, metric = 'chebyshev')) # distance matrix, chebyshev here since you seem to want blocks
indices = np.arange(c.shape[0]) # indices that haven't been dropped (all to start)
out = [0] # always want the first index
while True:
try:
indices = indices[dists[indices[0], indices] > 400] #drop indices that are inside threshhold
out.append(indices[0]) # add the next index that hasn't been dropped to the output
except:
break # once you run out of indices, you'll get an IndexError and you're done
print(out)
[0, 1]
让我们尝试一大堆要点:
np.random.seed(42)
c = np.random.rand(10000, 2) * 800
dists = squareform(pdist(c, metric = 'chebyshev')) # distance matrix, checbyshev here since you seem to want squares
indices = np.arange(c.shape[0]) # indices that haven't been dropped (all to start)
out = [0] # always want the first index
while True:
try:
indices = indices[dists[indices[0], indices] > 400] #drop indices that are inside threshhold
out.append(indices[0]) # add the next index that hasn't been dropped to the output
except:
break # once you run out of indices, you'll get an IndexError and you're done
print(out, pdist(c[out], metric = 'chebyshev'))
[0, 2, 6, 17] [635.77582886 590.70015659 472.87353138 541.13920029 647.69071411
476.84658995]
因此,4 个点(有意义,因为 4 个 400x400 块平铺了 4 个平铺的 800x800 space),大部分值较低 (17 << 10000),并且保留点之间的距离始终 > 400
您可以使用 scipy.spatial.KDTree
更接近 - 尽管它不支持查询由块中不同数量的点组成的块。因此,它可以与另一个库 python-igraph
结合使用,该库可以快速找到接近点的连通分量:
from scipy.spatial import KDTree
import igraph as ig
data = np.array([[3536., 1419.],
[2976., 1024.],
[3504., 1400.],
[3574., 1505.],
[3672., 1453.],
[3671., 1442.],
[3489., 1429.],
[3108., 737.]])
edges1 = KDTree(data[:,:1]).query_pairs(r=400)
edges2 = KDTree(data[:,1:]).query_pairs(r=400)
g = ig.Graph(n = len(data), edges=edges1 & edges2)
i = g.clusters()
因此簇对应于某种内部类型的块点索引序列igraph
。有一个快速预览:
>>> print(i)
Clustering with 8 elements and 2 clusters
[0] 0, 2, 3, 4, 5, 6
[1] 1, 7
>>> pal = ig.drawing.colors.ClusterColoringPalette(len(i)) #number of colors used
color = pal.get_many(i.membership) #list of color tags
ig.plot(g, bbox = (200, 100), layout=g.layout('circle'), vertex_label=g.vs.indices,
vertex_color = color, vertex_size = 12, vertex_label_size = 8)
用法示例:
>>> [data[n] for n in i] #or list(i)
[array([[3536., 1419.],
[3504., 1400.],
[3574., 1505.],
[3672., 1453.],
[3671., 1442.],
[3489., 1429.]]),
array([[2976., 1024.],
[3108., 737.]])]
备注: 此方法允许使用成对的接近点而不是 n*n
矩阵,后者更有效在某些情况下内存。
我正在使用 cv2.goodFeaturesToTrack
函数来查找图像中的特征点。最终目标是提取一定大小的方块,特征点是这些块的中心。
然而,很多特征点彼此靠近,所以块重叠,这不是我想要的。
这是所有特征点(中心)的例子:
array([[3536., 1419.],
[2976., 1024.],
[3504., 1400.],
[3574., 1505.],
[3672., 1453.],
[3671., 1442.],
[3489., 1429.],
[3108., 737.]])
假设我想找到具有 blockRadius = 400
且 不 重叠的第一个 n
块。关于如何实现这一点有什么想法吗?
你需要一些迭代来做到这一点,因为像这样的经常性辍学是不可向量化的。我认为这样的东西会起作用
from scipy.spatial.distance import pdist, squareform
c = np.array([[3536., 1419.],
[2976., 1024.],
[3504., 1400.],
[3574., 1505.],
[3672., 1453.],
[3671., 1442.],
[3489., 1429.],
[3108., 737.]])
dists = squareform(pdist(c, metric = 'chebyshev')) # distance matrix, chebyshev here since you seem to want blocks
indices = np.arange(c.shape[0]) # indices that haven't been dropped (all to start)
out = [0] # always want the first index
while True:
try:
indices = indices[dists[indices[0], indices] > 400] #drop indices that are inside threshhold
out.append(indices[0]) # add the next index that hasn't been dropped to the output
except:
break # once you run out of indices, you'll get an IndexError and you're done
print(out)
[0, 1]
让我们尝试一大堆要点:
np.random.seed(42)
c = np.random.rand(10000, 2) * 800
dists = squareform(pdist(c, metric = 'chebyshev')) # distance matrix, checbyshev here since you seem to want squares
indices = np.arange(c.shape[0]) # indices that haven't been dropped (all to start)
out = [0] # always want the first index
while True:
try:
indices = indices[dists[indices[0], indices] > 400] #drop indices that are inside threshhold
out.append(indices[0]) # add the next index that hasn't been dropped to the output
except:
break # once you run out of indices, you'll get an IndexError and you're done
print(out, pdist(c[out], metric = 'chebyshev'))
[0, 2, 6, 17] [635.77582886 590.70015659 472.87353138 541.13920029 647.69071411
476.84658995]
因此,4 个点(有意义,因为 4 个 400x400 块平铺了 4 个平铺的 800x800 space),大部分值较低 (17 << 10000),并且保留点之间的距离始终 > 400
您可以使用 scipy.spatial.KDTree
更接近 - 尽管它不支持查询由块中不同数量的点组成的块。因此,它可以与另一个库 python-igraph
结合使用,该库可以快速找到接近点的连通分量:
from scipy.spatial import KDTree
import igraph as ig
data = np.array([[3536., 1419.],
[2976., 1024.],
[3504., 1400.],
[3574., 1505.],
[3672., 1453.],
[3671., 1442.],
[3489., 1429.],
[3108., 737.]])
edges1 = KDTree(data[:,:1]).query_pairs(r=400)
edges2 = KDTree(data[:,1:]).query_pairs(r=400)
g = ig.Graph(n = len(data), edges=edges1 & edges2)
i = g.clusters()
因此簇对应于某种内部类型的块点索引序列igraph
。有一个快速预览:
>>> print(i)
Clustering with 8 elements and 2 clusters
[0] 0, 2, 3, 4, 5, 6
[1] 1, 7
>>> pal = ig.drawing.colors.ClusterColoringPalette(len(i)) #number of colors used
color = pal.get_many(i.membership) #list of color tags
ig.plot(g, bbox = (200, 100), layout=g.layout('circle'), vertex_label=g.vs.indices,
vertex_color = color, vertex_size = 12, vertex_label_size = 8)
用法示例:
>>> [data[n] for n in i] #or list(i)
[array([[3536., 1419.],
[3504., 1400.],
[3574., 1505.],
[3672., 1453.],
[3671., 1442.],
[3489., 1429.]]),
array([[2976., 1024.],
[3108., 737.]])]
备注: 此方法允许使用成对的接近点而不是 n*n
矩阵,后者更有效在某些情况下内存。