Connected component labeling for arrays / quasi-images with many dimensions
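Question
I am trying to perform connected component labeling on arrays with more than 3 dimensions. By that I mean that my boolean array has a .shape such as (5, 2, 3, 6, 10), which would be 5 dimensions.
For 2D images (as opposed to my >3D problem), connected component labeling labels the connected areas (hyper-volumes in my case). Two (hyper-)pixels are connected if they are next to each other and both are True in the boolean array.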
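To make "next to each other" concrete, here is a small sketch that enumerates the neighbour offsets of one pixel in N dimensions. It assumes two common conventions that the question itself does not choose between: face connectivity (neighbours differ by 1 along exactly one axis, the generalisation of 4-connectivity in 2D) and full connectivity (diagonals included, the generalisation of 8-connectivity). The helper name neighbour_offsets is made up for illustration.

```python
from itertools import product


def neighbour_offsets(ndim, full=False):
    """Return the index offsets of all neighbours of a pixel in ndim dimensions."""
    offsets = []
    for offset in product((-1, 0, 1), repeat=ndim):
        changed = sum(1 for step in offset if step != 0)
        if changed == 0:
            continue  # the pixel itself is not its own neighbour
        if full or changed == 1:
            offsets.append(offset)
    return offsets


print(len(neighbour_offsets(5)))             # 10  (2 * ndim)
print(len(neighbour_offsets(5, full=True)))  # 242 (3**ndim - 1)
```

Which of the two conventions is wanted changes the number of components, so it is worth deciding before picking a labeling routine.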
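What I have tried
For 2 dimensions this can be done with OpenCV, and for up to 3 dimensions it can be done with scikit-image's skimage.measure.label. However, I am not sure how to handle my case.
Further material for the interested reader (but this does not help with my question):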
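If the equivalent of 4-connectivity in 2D suffices, you can use a nearest-neighbour tree to find, in n log n time, the neighbouring pixels that are also foreground.
Then it is a matter of constructing a graph and finding its connected components (also n log n, IIRC).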
#!/usr/bin/env python
"""
Label connected components of an n-dimensional boolean array.
"""
import numpy as np
import networkx as nx
from scipy.spatial import cKDTree


def get_components(boolean_array):
    # find neighbours
    coordinates = list(zip(*np.where(boolean_array)))
    tree = cKDTree(coordinates)
    # p=1 -> Manhattan distance; r=1 -> what would be 4-connectivity in 2D
    neighbours_by_pixel = tree.query_ball_tree(tree, r=1, p=1)

    # create graph and find components
    G = nx.Graph()
    G.add_nodes_from(range(len(coordinates)))  # isolated pixels are components too
    for ii, neighbours in enumerate(neighbours_by_pixel):
        # each pixel is returned as its own neighbour; skip that self-loop
        G.add_edges_from((ii, jj) for jj in neighbours if jj != ii)
    components = nx.connected_components(G)

    # create output image; 0 is background, components are numbered from 1
    output = np.zeros(boolean_array.shape, dtype=int)
    for ii, component in enumerate(components):
        for idx in component:
            output[coordinates[idx]] = ii + 1

    return output


if __name__ == '__main__':
    shape = (5, 2, 3, 6, 10)
    data = np.random.rand(*shape) < 0.1
    output = get_components(data)
For an array of shape (50, 50, 50, 50), I get the following timing on my laptop:
In [48]: %timeit output = get_components(data)
5.85 s ± 279 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
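Most of the Python-level overhead in the approach above sits in the per-pixel loops. As a possible variation (a sketch, not something that has been timed here), the same idea can be written with cKDTree.query_pairs and scipy.sparse.csgraph.connected_components, which keeps the graph step inside SciPy. The function name label_with_csgraph is made up for illustration, and the radius of 1.5 is an assumption: coordinates are integers, so Manhattan distances are integers and any radius in [1, 2) selects exactly the face neighbours.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree


def label_with_csgraph(boolean_array):
    """Label face-connected regions of an n-D boolean array (0 = background)."""
    coordinates = np.argwhere(boolean_array)  # shape (n_foreground, ndim)
    output = np.zeros(boolean_array.shape, dtype=int)
    n = len(coordinates)
    if n == 0:
        return output  # nothing to label

    # All pairs (i, j) with i < j of foreground pixels at Manhattan distance 1.
    tree = cKDTree(coordinates)
    pairs = tree.query_pairs(r=1.5, p=1, output_type='ndarray')
    pairs = pairs.reshape(-1, 2)  # keeps a valid shape when there are no pairs

    # Sparse adjacency matrix; pixels without neighbours are still nodes,
    # so they end up as their own single-pixel component.
    adjacency = coo_matrix(
        (np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])),
        shape=(n, n),
    )
    _, component_of = connected_components(adjacency, directed=False)

    # Shift by one so that 0 stays reserved for the background.
    output[tuple(coordinates.T)] = component_of + 1
    return output
```

The label numbers themselves are arbitrary; only the partition of the foreground into components is meaningful.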
scipy.ndimage.label does what you want directly:
In [1]: import numpy as np
In [2]: arr = np.random.random((5,2,3,6,10)) > 0.5
In [3]: from scipy import ndimage as ndi
In [4]: labeled, n = ndi.label(arr)
In [5]: n
Out[5]: 11
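By default, ndi.label treats only pixels that differ by 1 along a single axis as neighbours. If diagonal neighbours should also count as connected (an assumption about what "adjacent" means here, since the question leaves it open), a structuring element from ndi.generate_binary_structure can be passed in. A minimal sketch with two pixels that touch only at a corner:

```python
import numpy as np
from scipy import ndimage as ndi

# Two True pixels that touch only at a corner of a 4-D array.
arr = np.zeros((3, 3, 3, 3), dtype=bool)
arr[0, 0, 0, 0] = True
arr[1, 1, 1, 1] = True

# Default structure: neighbours differ by 1 along exactly one axis.
_, n_face = ndi.label(arr)

# Full connectivity: every pixel in the surrounding 3x3x3x3 block is a neighbour.
full = ndi.generate_binary_structure(arr.ndim, arr.ndim)
_, n_full = ndi.label(arr, structure=full)

print(n_face, n_full)  # 2 1
```

The second argument of generate_binary_structure can be any value between 1 and arr.ndim to select an intermediate notion of adjacency.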