Flipping continuous chunks of 1's up to a certain size in a binary numpy matrix

Question

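I'm working on an image analysis project. I have converted my picture of interest (an NxM numpy array) to a binary format. A "1" in the matrix marks a region of interest. Some of these regions are real, and some are noise that cannot represent a feature of the image. For example, in a horizontal slice of the image, an isolated 1, or a group of 2, up to, say, 5 consecutive 1's, is of no interest to me. I'd like to find a quick way to flip those (i.e. set them to 0).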

My MWE for flipping isolated 1's:

import numpy as np
img = np.random.choice([0,1],size=(1000,1000), p=[1./2,1./2])

#now we take the second derivative of the matrix in the horizontal axis
#since we have a binary matrix, an isolated 1, that is [...010...] is captured
#by a second derivative entry equal to -2
#because [...010...]->dx->[...1,-1,...]->dx->[...-2...]

ddx_img = np.diff(np.diff(img,1),1)
to_flip = np.where(ddx_img==-2) #returns a tuple of [x,y] matrix entries

# the second derivative eats up one index position on each side horizontally,
# so I need to add +1 to the horizontal axis of the tuple

temp_copy = to_flip[1].copy() #cannot modify tuple directly, for some reason its read only
temp_copy+=1
to_flip = (to_flip[0],temp_copy)

#now we can flip the entries by adding +1 to the entries to flip and taking mod 2
img[to_flip] = np.mod(img[to_flip] + 1, 2)

This takes about 9 ms on my machine. The whole routine may take up to 1 second at most.

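I welcome any criticism of the code (I'm not a good Python programmer), as well as any ideas on how to efficiently extend this procedure to eliminate islands of consecutive 1's up to a generic size S.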

Thanks in advance

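Edit: I realize the mod is unnecessary. When I wrote this, I also wanted to flip islands of 0's that were too small. The =mod.. assignment can be replaced with =0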

Answer 1

Specific problem case

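After the edit, it seems you could use some slicing and thus avoid making intermediate copies, which helps performance. Here are two lines of code that achieve the desired output: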

# Calculate second derivative
ddx_img = np.diff(np.diff(img,1),1)

# Get sliced version of img excluding the first and last columns 
# and use mask with ddx elements as "-2" to zeros
img[:,1:-1][ddx_img==-2] = 0
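
To see why the `img[:,1:-1]` view lines up with the second derivative, here is a small hand-checkable sketch. The sample array and the name `cleaned` are made up for illustration, and the cast to `np.int8` is an assumption on my part: `np.diff` on an unsigned array wraps around instead of producing -2, so a signed type is needed.

```python
import numpy as np

# Small hand-checkable image; isolated 1's sit at (0, 1), (0, 7) and (1, 3).
img = np.array([[0, 1, 0, 0, 1, 1, 0, 1, 0],
                [1, 1, 0, 1, 0, 0, 1, 1, 1]])

# n=2 along axis=1 matches np.diff(np.diff(img, 1), 1) for signed input.
# The int8 cast guards against unsigned wrap-around (assumption: input may be uint8).
ddx_img = np.diff(img.astype(np.int8), n=2, axis=1)

# ddx_img[:, j] describes column j + 1 of img, hence the img[:, 1:-1] view.
cleaned = img.copy()
cleaned[:, 1:-1][ddx_img == -2] = 0
print(cleaned)
```

Only the three isolated 1's are cleared; the runs of two or more 1's are left alone.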

Runtime tests and result verification:

In [42]: A = np.random.choice([0,1],size=(1000,1000), p=[1./2,1./2])

In [43]: def slicing_based(A):
    ...:    img = A.copy()
    ...:    ddx_img = np.diff(np.diff(img,1),1)
    ...:    img[:,1:-1][ddx_img==-2] = 0
    ...:    return img
    ...: 
    ...: 
    ...: def original_approach(A):
    ...: 
    ...:    img = A.copy()
    ...: 
    ...:    ddx_img = np.diff(np.diff(img,1),1)
    ...:    to_flip = np.where(ddx_img==-2)
    ...: 
    ...:    temp_copy = to_flip[1].copy()
    ...:    temp_copy+=1
    ...:    to_flip = (to_flip[0],temp_copy)
    ...: 
    ...:    img[to_flip] = 0
    ...: 
    ...:    return img
    ...: 

In [44]: %timeit slicing_based(A)
100 loops, best of 3: 15.3 ms per loop

In [45]: %timeit original_approach(A)
10 loops, best of 3: 20.1 ms per loop

In [46]: np.allclose(slicing_based(A),original_approach(A))
Out[46]: True
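
Both versions leave the first and last columns untouched, because the second derivative only exists for interior columns. If isolated 1's on the border should be cleared as well, one possible sketch is to zero-pad each row first. The function name `flip_isolated_ones_with_border` is hypothetical, and treating everything outside the image as 0 is an assumption that may not suit every image.

```python
import numpy as np

def flip_isolated_ones_with_border(A):
    """Zero isolated 1's in each row, treating the outside of the image as 0 (assumption)."""
    img = A.copy()
    # Pad one zero column on each side so every original column has two neighbours.
    padded = np.pad(img.astype(np.int8), ((0, 0), (1, 1)),
                    mode='constant', constant_values=0)
    # After padding, the second derivative has exactly the shape of img,
    # and entry [:, j] describes column j of img.
    ddx = np.diff(padded, n=2, axis=1)
    img[ddx == -2] = 0
    return img

A = np.array([[1, 0, 1, 1, 0, 1],
              [0, 1, 0, 0, 1, 1]])
print(flip_isolated_ones_with_border(A))
```

Because the padded second derivative has the same shape as `img`, no index shifting or slicing is needed here.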

Generic case

To make the solution generic, one could use some signal processing, specifically 2D convolution, as shown below:

import numpy as np
from scipy.signal import convolve2d

# Define kernel
K1 = np.array([[0,1,1,0]]) # Edit this for different island lengths
K2 = 1-K1

# Build masks with the same shape as img: convolve img with K1 and the inverted
# image with K2, then compare each convolved sum against the kernel sum. A True
# entry means that position fulfils both the ONES and the ZEROS criteria.
# Boolean inputs are cast to int before being convolved.
mask1 = convolve2d(img, K1, boundary='fill',fillvalue=0, mode='same')==K1.sum()
mask2 = convolve2d((img==0).astype(int), K2, boundary='fill',fillvalue=0, mode='same')==K2.sum()

# Expand the combined mask across the kernel length and use it to set those
# positions in img to zero
K3 = np.ones((1,K1.size))
mask3 = convolve2d((mask1 & mask2).astype(int), K3, boundary='fill',fillvalue=0, mode='same')>0
img_out = img*(~mask3)

Sample input and output:

In [250]: img
Out[250]: 
array([[0, 1, 1, 1, 0, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 0, 0],
       [1, 1, 1, 1, 0, 1, 0, 1],
       [1, 1, 0, 1, 1, 0, 1, 1],
       [1, 0, 1, 1, 1, 1, 1, 1],
       [1, 1, 0, 1, 1, 0, 1, 0],
       [1, 1, 1, 0, 1, 1, 1, 1]])

In [251]: img_out
Out[251]: 
array([[0, 1, 1, 1, 0, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 0, 0],
       [1, 1, 1, 1, 0, 1, 0, 1],
       [1, 1, 0, 0, 0, 0, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 1],
       [1, 1, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 0, 1, 1, 1, 1]])
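
For the "up to size S" variant asked about in the question, a run-length formulation is another option. The sketch below is not the convolution method above: it finds where each horizontal run of 1's starts and ends, then zeroes every run whose length is at most `max_len`. The names `remove_small_islands` and `max_len` are hypothetical, and treating the area outside the image as 0 is an assumption. It can also serve as a cross-check on a convolution result, since `mode='same'` cannot center an even-length kernel symmetrically.

```python
import numpy as np

def remove_small_islands(A, max_len):
    """Zero horizontal runs of 1's whose length is <= max_len (hypothetical helper)."""
    img = A.copy()
    # Pad each row with zeros so runs touching the border get a start and an end.
    padded = np.pad(img.astype(np.int8), ((0, 0), (1, 1)), mode='constant')
    d = np.diff(padded, axis=1)            # shape (N, M + 1)

    # np.nonzero scans row by row, so the k-th start pairs with the k-th end.
    rows, starts = np.nonzero(d == 1)      # run begins at img column `starts`
    _, ends = np.nonzero(d == -1)          # run stops just before img column `ends`
    small = (ends - starts) <= max_len

    # Mark +1 at the start and -1 at the end of each small run; the cumulative
    # sum along the row is then 1 exactly on the columns covered by small runs.
    kill = np.zeros(d.shape, dtype=np.int8)
    kill[rows[small], starts[small]] = 1
    kill[rows[small], ends[small]] = -1
    mask = np.cumsum(kill, axis=1)[:, :-1] > 0

    img[mask] = 0
    return img

A = np.array([[1, 1, 0, 1, 1, 0, 1, 1, 1],
              [0, 1, 0, 0, 1, 1, 1, 1, 0]])
print(remove_small_islands(A, max_len=2))
```

With `max_len=2` the runs of length 1 and 2 are cleared, including the one touching the left border, while the runs of length 3 and 4 survive. Unlike a fixed kernel such as `[0,1,1,0]`, which targets islands of one exact length, this handles every length up to the threshold in one pass.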

