How would I quickly check whether each cell of an array has neighbors of a specified value in numpy?

Say I have an array like this:

    np.array([[0,0,0,1,0],
              [0,0,0,0,0],
              [0,1,0,0,0],
              [0,0,0,1,0],
              [0,0,0,0,0]],dtype=bool)

and I want a boolean array that is true for every cell with a nonzero neighboring cell in that array, for example:

    np.array([[0,0,1,0,1],
              [1,1,1,1,1],
              [1,0,1,1,1],
              [1,1,1,0,1],
              [0,0,1,1,1]],dtype=bool)

How would I do this without iterating over everything in a Python loop (because that is really slow)?

You can use a sliding window and take the maximum within each window.

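import numpy as np

def foo(arr, window):
    # window = (wr, wc): how far the neighborhood extends in rows and columns
    r, c = arr.shape
    wr, wc = window
    ans = np.zeros_like(arr)
    for i in range(r):
        for j in range(c):
            # Nonzero cells stay 0; other cells take the max over the clipped window.
            if not arr[i, j]:
                ans[i, j] = arr[max(i - wr, 0):min(i + wr + 1, r),
                                max(j - wc, 0):min(j + wc + 1, c)].max()
    return ans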

data = np.array([[0,0,0,1,0],
                 [0,0,0,0,0],
                 [0,1,0,0,0],
                 [0,0,0,1,0],
                 [0,0,0,0,0]])
foo(data, [1, 1])
# array([[0, 0, 1, 0, 1],
#        [1, 1, 1, 1, 1],
#        [1, 0, 1, 1, 1],
#        [1, 1, 1, 0, 1],
#        [0, 0, 1, 1, 1]])
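
The same sliding-window idea can also be written without the Python loops. The following is only a sketch, assuming NumPy 1.20 or later (for `sliding_window_view`) and a fixed 3x3 window; the name `neighbor_mask` is made up for illustration and is not part of the answer above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def neighbor_mask(arr):
    """Hypothetical helper: True where a zero cell has a nonzero cell in its 3x3 window."""
    arr = np.asarray(arr, dtype=bool)
    # Pad with False so that border cells also get a full 3x3 window.
    padded = np.pad(arr, 1, mode="constant", constant_values=False)
    # View of shape (rows, cols, 3, 3): one window per original cell, no copy.
    windows = sliding_window_view(padded, (3, 3))
    # "Maximum" of a boolean window is just any().
    has_any = windows.any(axis=(2, 3))
    # Like foo, cells that are nonzero themselves are set to False.
    return has_any & ~arr

data = np.array([[0,0,0,1,0],
                 [0,0,0,0,0],
                 [0,1,0,0,0],
                 [0,0,0,1,0],
                 [0,0,0,0,0]])
result = neighbor_mask(data)
print(result.astype(int))
```

The window view does not copy the data, but the `any` reduction still visits nine values per cell, so this is mainly useful when SciPy is not available.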

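Alternatively, scipy.ndimage.maximum_filter computes the same windowed maximum without a Python loop:

from scipy.ndimage import maximum_filter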

ans = maximum_filter(data, size=(3, 3))
ans[data == 1] = 0
ans
# array([[0, 0, 1, 0, 1],
#        [1, 1, 1, 1, 1],
#        [1, 0, 1, 1, 1],
#        [1, 1, 1, 0, 1],
#        [0, 0, 1, 1, 1]])
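
If "neighbor" should mean only the eight surrounding cells, one possible variation is to pass `maximum_filter` a footprint with the center switched off instead of clearing the nonzero cells afterwards. This is a sketch rather than part of the answer above; the names `footprint` and `has_neighbor` are illustrative, and it assumes zero padding at the border is what you want.

```python
import numpy as np
from scipy.ndimage import maximum_filter

data = np.array([[0,0,0,1,0],
                 [0,0,0,0,0],
                 [0,1,0,0,0],
                 [0,0,0,1,0],
                 [0,0,0,0,0]], dtype=np.uint8)

# 3x3 footprint with the center excluded: only the 8 neighbors are considered.
footprint = np.ones((3, 3), dtype=bool)
footprint[1, 1] = False

# mode="constant" pads with cval=0. The default mode="reflect" mirrors the
# border rows and columns, so a nonzero border cell would see itself as a neighbor.
has_neighbor = maximum_filter(data, footprint=footprint, mode="constant", cval=0) > 0
print(has_neighbor.astype(int))
```

On the example array this gives the same result as above, because none of the nonzero cells touch each other.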

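If you only want to use numpy, my approach is to find the neighbors of every true value in the original array. A cell is a neighbor of a true value when the Chebyshev distance (L-infinity distance) between its position and the position of that true value is 1. The per-value results are then merged with a logical OR: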

>>> import numpy as np
>>> ar = np.array([[0,0,0,1,0],
...                [0,0,0,0,0],
...                [0,1,0,0,0],
...                [0,0,0,1,0],
...                [0,0,0,0,0]], bool)
>>> row, col = ar.nonzero()
>>> rows, cols = np.indices(ar.shape)
>>> np.any([np.maximum(np.abs(rows - i), np.abs(cols - j)) == 1 for i, j in zip(row, col)], 0)
array([[False, False,  True, False,  True],
       [ True,  True,  True,  True,  True],
       [ True, False,  True,  True,  True],
       [ True,  True,  True, False,  True],
       [False, False,  True,  True,  True]])
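
To see why the distance test picks out exactly the neighbors, it may help to look at the distance grid for a single true cell. The snippet below is a small illustration, assuming the true cell at row 2, column 1 of the example; the names `dist` and `ring` are only for this sketch.

```python
import numpy as np

rows, cols = np.indices((5, 5))
i, j = 2, 1  # position of one true cell in the example array

# Chebyshev distance: the larger of the row offset and the column offset.
dist = np.maximum(np.abs(rows - i), np.abs(cols - j))
print(dist)
# [[2 2 2 2 3]
#  [1 1 1 2 3]
#  [1 0 1 2 3]
#  [1 1 1 2 3]
#  [2 2 2 2 3]]

# Distance 1 selects the ring of 8 cells around (i, j), excluding (i, j) itself.
ring = dist == 1
print(ring.sum())  # 8
```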

With broadcasting, you can also avoid the list comprehension:

>>> rows, cols = np.indices(ar.shape, sparse=True)  # sparse=True yields broadcastable (n, 1) and (1, m) index arrays; the result is unchanged
>>> i = np.abs(rows[None] - row[:, None, None])
>>> j = np.abs(cols[None] - col[:, None, None])
>>> (np.maximum(i, j) == 1).any(0)
array([[False, False,  True, False,  True],
       [ True,  True,  True,  True,  True],
       [ True, False,  True,  True,  True],
       [ True,  True,  True, False,  True],
       [False, False,  True,  True,  True]])

To keep the code short I used None instead of np.newaxis; you can use the latter to improve readability.
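
For reference, here is a sketch of the same computation spelled with `np.newaxis`, with the intermediate shapes printed so the broadcasting is easier to follow. `np.newaxis` is simply an alias for `None`, so the behavior is identical; the variable name `result` is only for this example.

```python
import numpy as np

ar = np.array([[0,0,0,1,0],
               [0,0,0,0,0],
               [0,1,0,0,0],
               [0,0,0,1,0],
               [0,0,0,0,0]], dtype=bool)

row, col = ar.nonzero()                          # positions of the 3 true cells
rows, cols = np.indices(ar.shape, sparse=True)
print(rows.shape, cols.shape)                    # (5, 1) (1, 5)

# One leading axis per true cell; the grid axes broadcast against it.
i = np.abs(rows[np.newaxis] - row[:, np.newaxis, np.newaxis])
j = np.abs(cols[np.newaxis] - col[:, np.newaxis, np.newaxis])
print(i.shape, j.shape)                          # (3, 5, 1) (3, 1, 5)

# np.maximum broadcasts to (3, 5, 5); any(axis=0) ORs over the true cells.
result = (np.maximum(i, j) == 1).any(axis=0)
print(np.newaxis is None)                        # True
```

Note that the intermediate array has one 2-D layer per true cell, so memory use grows with the number of true cells.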

In my tests, even with broadcasting this is slower than @d.b's second answer, but not by much:

>>> from timeit import timeit
>>> from scipy.ndimage import maximum_filter
>>> def loop_reduce(ar):
...     row, col = ar.nonzero()
...     rows, cols = np.indices(ar.shape)
...     return np.any([np.maximum(np.abs(rows - i), np.abs(cols - j)) == 1 for i, j in zip(row, col)], 0)
...
>>> def broadcast_reduce(ar):
...     row, col = ar.nonzero()
...     rows, cols = np.indices(ar.shape, sparse=True)
...     i = np.abs(rows[None] - row[:, None, None])
...     j = np.abs(cols[None] - col[:, None, None])
...     return (np.maximum(i, j) == 1).any(0)
...
>>> def max_filter(ar):
...     ans = maximum_filter(ar, size=(3, 3))
...     ans[ar] = False
...     return ans
...
>>> timeit(lambda: loop_reduce(ar), number=10000)
0.3127206000208389
>>> timeit(lambda: broadcast_reduce(ar), number=10000)
0.13910009997198358
>>> timeit(lambda: max_filter(ar), number=10000)
0.12893440001062118
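
When trying these functions on other inputs, keep in mind that they agree on the example only because no two true cells touch. The distance-based versions mark a true cell that sits next to another true cell, while the filter version clears every true cell. The sketch below shows the difference on a made-up array called `touching`; the `uint8` conversion inside `max_filter` is an addition for this sketch so it does not depend on how the filter handles boolean input.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def broadcast_reduce(ar):
    row, col = ar.nonzero()
    rows, cols = np.indices(ar.shape, sparse=True)
    i = np.abs(rows[None] - row[:, None, None])
    j = np.abs(cols[None] - col[:, None, None])
    return (np.maximum(i, j) == 1).any(0)

def max_filter(ar):
    # Filter a uint8 copy, then convert back to bool (assumption of this sketch).
    ans = maximum_filter(ar.astype(np.uint8), size=(3, 3)).astype(bool)
    ans[ar] = False
    return ans

# Two true cells side by side.
touching = np.array([[1, 1, 0, 0],
                     [0, 0, 0, 0]], dtype=bool)

by_distance = broadcast_reduce(touching)
by_filter = max_filter(touching)
print(by_distance.astype(int))
# [[1 1 1 0]
#  [1 1 1 0]]
print(by_filter.astype(int))
# [[0 0 1 0]
#  [1 1 1 0]]
```

Which behavior is right depends on whether a true cell with a true neighbor should itself count.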

At the very least, this can be another way for you to solve similar problems in the future :-)