Tricks to run sliding window approach on images fast in Python

Question

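Haar cascade classifiers use a sliding window approach with an image pyramid to detect objects. For me, detecting objects in an image takes about 0.01 seconds. My question is: how can it be this fast while using a sliding window? (I implemented a CNN for object detection that slides a window over the image without a pyramid, yet it takes 2 seconds to detect objects.) What are the tricks for running a sliding window faster? I used two loops to slide over the whole image with a stride and parallelized them, but it is still much slower than the OpenCV implementation.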

Answer 1

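The fastest approach (in my experience) is to use the numpy.lib.stride_tricks.as_strided function. In effect, we first use NumPy to generate all the patches (sliding window positions) and collect them in one large array. Then we can map that array to our function.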

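First define the shape. To cover every window position it is (number of window rows, number of window columns, kernel height, kernel width), where the number of window rows is image height - kernel height + 1, and likewise for the columns. Then set the strides from the strides of the image, which NumPy measures in bytes (i.e. in an 8-bit image each pixel is a 1-byte step). In this case the patch strides are simply the image strides repeated twice. You can check the strides using img.strides.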
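To make the shape and strides concrete, here is a minimal sketch on a tiny array. The 4x5 test image and its values are assumptions chosen only so the result is easy to inspect; they are not part of the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Small illustrative image (values 0..19) so the windows are easy to inspect.
img = np.arange(20, dtype=np.uint8).reshape(4, 5)
size = 3  # 3x3 window

# For a C-contiguous uint8 array, strides are (bytes per row, bytes per pixel).
print(img.strides)  # (5, 1)

# One window per valid top-left corner; each window is size x size.
shape = (img.shape[0] - size + 1, img.shape[1] - size + 1, size, size)
# Moving between windows and moving inside a window use the same byte steps.
strides = img.strides * 2  # tuple repetition: (5, 1, 5, 1)

# writeable=False guards against accidental writes through overlapping windows.
patches = as_strided(img, shape=shape, strides=strides, writeable=False)
print(patches.shape)  # (2, 3, 3, 3)

# patches[i, j] is a view of img[i:i+size, j:j+size]; no pixel data is copied.
assert np.array_equal(patches[1, 2], img[1:4, 2:5])
```

Because the result is only a view, building it is cheap regardless of image size; the cost appears later, when the patches are copied or processed.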

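import numpy as np

def some_func(roi):
    '''
    simple function to return the mean of the region
    of interest
    '''
    return np.mean(roi)

# A 30000x30000 image yields roughly 9e8 windows, which is far too many for
# the per-patch Python loop below, so a smaller image is used here.
img = np.zeros((300, 300), dtype=np.uint8)

size = 3  # window size, i.e. a 3x3 window here

# One window per valid top-left corner, each of shape (size, size)
shape = (img.shape[0] - size + 1, img.shape[1] - size + 1, size, size)
strides = 2 * img.strides  # tuple repetition: the image strides, twice
patches = np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)
patches = patches.reshape(-1, size, size)  # note: this reshape copies the data

output_img = np.array([some_func(roi) for roi in patches])
# The output has one value per window, so it is smaller than the input image
output_img = output_img.reshape(shape[0], shape[1])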
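The question mentions sliding with a step larger than one pixel. The same trick can cover that case by scaling the two outer strides. The following is a sketch under that assumption; the helper name `strided_patches` and its `step` parameter are hypothetical and not part of the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def strided_patches(img, size, step):
    """Hypothetical helper: read-only (rows, cols, size, size) view of
    size x size windows taken every `step` pixels of a 2-D image."""
    rows = (img.shape[0] - size) // step + 1
    cols = (img.shape[1] - size) // step + 1
    row_stride, col_stride = img.strides
    return as_strided(
        img,
        shape=(rows, cols, size, size),
        # Jump `step` rows/columns between windows, one pixel inside a window.
        strides=(step * row_stride, step * col_stride, row_stride, col_stride),
        writeable=False,
    )

img = np.arange(64, dtype=np.uint8).reshape(8, 8)
patches = strided_patches(img, size=3, step=2)
print(patches.shape)  # (3, 3, 3, 3)

# Window (1, 2) starts at row 1 * step and column 2 * step.
assert np.array_equal(patches[1, 2], img[2:5, 4:7])
```

A larger step reduces the number of windows quadratically, which is often the simplest way to cut the cost of an expensive per-window function.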

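In some cases you can make further gains, for example by vectorizing your function. Be aware that np.vectorize() is mainly a convenience wrapper around a Python loop, so on its own it is unlikely to give a large speedup. If you want to calculate the mean, you could also call output_img = patches.mean(axis=(-1, -2)) on the 4-D array returned by as_strided and avoid the need to map to a function or reshape. There are also potentially faster ways to map the array to the function; see this post. I have given this solution because any processing can be added to the function, and the question seemed general.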
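As an illustration of the mean shortcut, the sketch below compares the per-patch loop with a single reduction over the window axes. It uses numpy.lib.stride_tricks.sliding_window_view, which is a swapped-in alternative to as_strided and assumes NumPy 1.20 or newer; the random test image is also an assumption.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 48), dtype=np.uint8)
size = 3

# Read-only 4-D view of shape (62, 46, 3, 3), one 3x3 window per position.
patches = sliding_window_view(img, (size, size))

# Reduce over the two window axes; the result is already 2-D, so no reshape.
fast = patches.mean(axis=(-1, -2))

# Reference: the per-patch Python loop from the answer.
slow = np.array([np.mean(roi) for roi in patches.reshape(-1, size, size)])
slow = slow.reshape(fast.shape)

assert np.allclose(fast, slow)
```

The reduction runs in compiled NumPy code rather than calling a Python function once per window, which is where most of the time in the loop version goes.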

python

opencv

sliding-window

python-2.7
