How do I quickly check whether each cell of an array has neighbors of a specified value in NumPy?
Say I have an array like this:
np.array([[0,0,0,1,0],
[0,0,0,0,0],
[0,1,0,0,0],
[0,0,0,1,0],
[0,0,0,0,0]],dtype=bool)
and I want a boolean array marking every cell that has a neighboring cell that is not 0 in that array, e.g.:
np.array([[0,0,1,0,1],
[1,1,1,1,1],
[1,0,1,1,1],
[1,1,1,0,1],
[0,0,1,1,1]],dtype=bool)
How would I do this without iterating over everything in a Python loop (because that is really slow)?
You can use a sliding window and take the maximum within each window.
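import numpy as np

def foo(arr, window):
    # arr: 2-D array; window: (rows, cols) half-widths of the neighborhood
    r, c = arr.shape
    wr, wc = window
    ans = arr * 0
    for i in range(r):
        for j in range(c):
            if not arr[i, j]:
                # maximum over the window around (i, j), clipped at the array edges
                ans[i, j] = arr[max(i - wr, 0):min(i + wr + 1, r), max(j - wc, 0):min(j + wc + 1, c)].max()
            else:
                # cells that are themselves set are not reported
                ans[i, j] = 0
    return ans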
data = np.array([[0,0,0,1,0],
[0,0,0,0,0],
[0,1,0,0,0],
[0,0,0,1,0],
[0,0,0,0,0]])
foo(data, [1, 1])
# array([[0, 0, 1, 0, 1],
# [1, 1, 1, 1, 1],
# [1, 0, 1, 1, 1],
# [1, 1, 1, 0, 1],
# [0, 0, 1, 1, 1]])
Or:
from scipy.ndimage import maximum_filter
ans = maximum_filter(data, size=(3, 3))
ans[data == 1] = 0
ans
# array([[0, 0, 1, 0, 1],
# [1, 1, 1, 1, 1],
# [1, 0, 1, 1, 1],
# [1, 1, 1, 0, 1],
# [0, 0, 1, 1, 1]])
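Both snippets above take a 3x3 windowed maximum and then clear the cells that are themselves set. As a rough sketch of the same idea with NumPy alone (no SciPy), one could pad the array and OR together its eight shifted views. The name `neighbor_mask` is made up for illustration, and the sketch assumes a 2-D input and the 8-cell neighborhood used above.

```python
import numpy as np

def neighbor_mask(arr):
    """Mark cells that are unset but have a set cell among their 8 neighbors."""
    a = np.asarray(arr, dtype=bool)
    r, c = a.shape
    # Pad with False so that cells on the border see "empty" neighbors.
    padded = np.pad(a, 1, mode="constant", constant_values=False)
    out = np.zeros_like(a)
    # Nine possible offsets; skip the center one, which is the cell itself.
    for di in range(3):
        for dj in range(3):
            if di == 1 and dj == 1:
                continue
            out |= padded[di:di + r, dj:dj + c]
    # Mirror the snippets above: cells that are themselves set are cleared.
    out[a] = False
    return out

data = np.array([[0, 0, 0, 1, 0],
                 [0, 0, 0, 0, 0],
                 [0, 1, 0, 0, 0],
                 [0, 0, 0, 1, 0],
                 [0, 0, 0, 0, 0]])
result = neighbor_mask(data)
print(result.astype(int))
# [[0 0 1 0 1]
#  [1 1 1 1 1]
#  [1 0 1 1 1]
#  [1 1 1 0 1]
#  [0 0 1 1 1]]
```

The loop here runs over the eight offsets, not over the cells, so its cost does not grow with the number of Python-level iterations as the array gets larger.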
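If you want to use only NumPy, my approach is to find the neighbors of every true value in the original array. For each true value, test whether the Chebyshev distance (L-infinity distance) between each position in the array and the position of that true value is 1, then combine the results with a logical OR: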
>>> ar = np.array([[0,0,0,1,0],
... [0,0,0,0,0],
... [0,1,0,0,0],
... [0,0,0,1,0],
... [0,0,0,0,0]], bool)
>>> row, col = ar.nonzero()
>>> rows, cols = np.indices(ar.shape)
>>> np.any([np.maximum(np.abs(rows - i), np.abs(cols - j)) == 1 for i, j in zip(row, col)], 0)
array([[False, False, True, False, True],
[ True, True, True, True, True],
[ True, False, True, True, True],
[ True, True, True, False, True],
[False, False, True, True, True]])
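To see why a distance of exactly 1 picks out the eight surrounding cells, here is a small sketch that prints the Chebyshev distance from a single true position, (2, 1) in the example array. The names `dist` and `ring` are only for illustration.

```python
import numpy as np

rows, cols = np.indices((5, 5))
i, j = 2, 1  # position of one true value in the example array

# Chebyshev distance: the larger of the row offset and the column offset.
dist = np.maximum(np.abs(rows - i), np.abs(cols - j))
print(dist)
# [[2 2 2 2 3]
#  [1 1 1 2 3]
#  [1 0 1 2 3]
#  [1 1 1 2 3]
#  [2 2 2 2 3]]

# The cells at distance exactly 1 form the ring around (2, 1).
ring = dist == 1
print(int(ring.sum()))
# 8
```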
With broadcasting, you can also avoid the list comprehension:
>>> rows, cols = np.indices(ar.shape, sparse=True) # Setting to sparse does not affect the calculation.
>>> i = np.abs(rows[None] - row[:, None, None])
>>> j = np.abs(cols[None] - col[:, None, None])
>>> (np.maximum(i, j) == 1).any(0)
array([[False, False, True, False, True],
[ True, True, True, True, True],
[ True, False, True, True, True],
[ True, True, True, False, True],
[False, False, True, True, True]])
To keep the code short I used None instead of np.newaxis; you can use the latter if you find it more readable.
In my tests, even with broadcasting it is still slower than @d.b's second answer, though not by much:
>>> from timeit import timeit
>>> from scipy.ndimage import maximum_filter
>>> def loop_reduce(ar):
...     row, col = ar.nonzero()
...     rows, cols = np.indices(ar.shape)
...     return np.any([np.maximum(np.abs(rows - i), np.abs(cols - j)) == 1 for i, j in zip(row, col)], 0)
...
>>> def broadcast_reduce(ar):
...     row, col = ar.nonzero()
...     rows, cols = np.indices(ar.shape, sparse=True)
...     i = np.abs(rows[None] - row[:, None, None])
...     j = np.abs(cols[None] - col[:, None, None])
...     return (np.maximum(i, j) == 1).any(0)
...
>>> def max_filter(ar):
...     ans = maximum_filter(ar, size=(3, 3))
...     ans[ar] = False
...     return ans
...
>>> timeit(lambda: loop_reduce(ar), number=10000)
0.3127206000208389
>>> timeit(lambda: broadcast_reduce(ar), number=10000)
0.13910009997198358
>>> timeit(lambda: max_filter(ar), number=10000)
0.12893440001062118
At the very least, this can be one way for you to approach similar problems in the future :-)
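One caveat worth checking on your own data: the distance-based version and the window-maximum versions only agree when no two true cells touch, as in the example array. The window-maximum snippets always clear cells that are themselves set, while the distance test reports a true cell whenever another true cell is next to it. A small sketch with a made-up array `touching`, plus one possible way (an assumption on my part, not something from the answers above) to make the two behave the same by masking with `~ar`:

```python
import numpy as np

def broadcast_reduce(ar):
    row, col = ar.nonzero()
    rows, cols = np.indices(ar.shape, sparse=True)
    i = np.abs(rows[None] - row[:, None, None])
    j = np.abs(cols[None] - col[:, None, None])
    return (np.maximum(i, j) == 1).any(0)

# Hypothetical input with two adjacent true cells.
touching = np.array([[1, 1, 0],
                     [0, 0, 0],
                     [0, 0, 0]], dtype=bool)

by_distance = broadcast_reduce(touching)
print(by_distance.astype(int))
# [[1 1 1]
#  [1 1 1]
#  [0 0 0]]

# Clearing the true cells reproduces the "ans[ar] = False" step.
masked = by_distance & ~touching
print(masked.astype(int))
# [[0 0 1]
#  [1 1 1]
#  [0 0 0]]
```

Which of the two results is wanted depends on whether a set cell with a set neighbor should count, which the example in the question does not pin down.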