高效的复杂蒙版移动 window 分析

Question

我正在 Python 中开发一个移动 window 算法，它将用于在大型 numpy 数组中滑动（例如，我的测试数组尺寸为 6349x9849）。我需要在 25 x 25 移动 window 中计算许多不同的统计数据，每个像素位置都被 9x9 window 掩盖。

还有一个警告阻止我使用卷积（例如计算移动 window 中的平均值非常快）：如果移动 window 中的中心像素是0，我将统计值设置为-9999作为标志，或者如果25x25移动window包含超过一半的0值，我将统计值设置为-1作为标志。然后我可以稍后处理这些标志。

我已经编写了 Python 代码来执行此操作，但这是我 Python 学习中第一次处理特别大量的数据，因此遇到了效率问题 -我编写的代码需要很长时间（对于这个图像大小，每个统计数据大约需要 6 个小时......！）。

我想征求任何建议，我如何才能更有效地做到这一点。我想先优化代码，然后再投入更多的计算能力（我相信我可以使用多处理模块来做到这一点。

我的代码，例如统计数据（标准差）如下。我为每个统计数据重复这一行（我有 6 个统计数据要计算）：

# Calculate the standard deviation of the masked moving window
stats_std = [-9999 if ds_array[row,column] == 0
else -1 if np.count_nonzero(ds_array[row-border_buff:row+border_buff+1,column-border_buff:column+border_buff+1]) < (outer_box**2)/2
else np.std([i for i in np.ma.compressed(np.ma.masked_array(ds_array[row-border_buff:row+border_buff+1,column-border_buff:column+border_buff+1],mask)) if i!=0])
for row in range(border_buff,m-border_buff)
for column in range(border_buff,n-border_buff)]

# Reshape the list into the image dimensions
stats_std = np.reshape(stats_std,(m-2*border_buff,n-2*border_buff))

如果需要，我可以提供一个示例子集数组和所需的输出以供试用，但不确定提供此内容的最佳方式，如果需要使上述内容更清楚，请告诉我。

P.s。我尝试了一种方法，将 2d 数组转换为数组的所有 25x25 子集的 3d，然后在每个子集的一个步骤中有效地计算 6 个统计数据，希望能节省大量计算。但这导致了 MemoryError。

Answer 1

pyvips lets you calculate complex things on huge images efficiently and using very little memory。它是 LGPL，运行s 在 Linux、macOS 和 Windows 上，适用于 Python 的每个版本。大多数 linux 的包管理器中都有它。

当你在pyvips中做像a + b这样的操作时，它实际上并没有做任何处理，它只是在图像处理操作的图表中添加了另一个节点。当您最终将结果写在某个地方时，整个图形都会进行评估，并且它会将图像分成一组小块并并行地流式传输到您的系统中。

因为中间图像并不存在，所以你只需要少量的内存，而且因为它是并行的，所以很快。

例如，您可以这样计算 sdev：

import sys
import pyvips

# load the input image ... the access hint means we will only make a single
# top-to-bottom pass over image, and it can therefore be streamed 
image = pyvips.Image.new_from_file(sys.argv[1], access='sequential')

# our convolution ... total pixels in an M x M window
# it's a simple box filter, so we can use a seperable convolution 
# (two 1D filters at 90 degrees)
window_size = 25
size = window_size * window_size
sum_mask = pyvips.Image.new_from_array([1] * window_size)

# standard deviation ... sum and sum of squares
s = image.convsep(sum_mask)
s2 = (image * image).convsep(sum_mask)
sdev = ((s2 - (s * s / size)).abs() / (size - 1)) ** 0.5

# find all zero input pixels ... these become -9999 in the output
sdev = (image == 0).ifthenelse(-9999, sdev)

# find all pixels where more than half of the window is zero ... these become
# -1 in the output
# pyvips uses 255 for TRUE and 0 for FALSE
more_than_half_zero = (image == 0).convsep(sum_mask) > 255 * size / 2
sdev = more_than_half_zero.ifthenelse(-1, sdev)

sdev.write_to_file(sys.argv[2])

我可以运行这样：

$ vipsheader x.jpg 
x.jpg: 10000x10000 uchar, 1 band, b-w, jpegload
$ /usr/bin/time -f %M:%e python3 sdev.py x.jpg x.pfm
81432:4.11

从 10,000 x 10,000 像素的 jpg 图像制作 10,000 x 10,000 像素的 PFM（一种可以存储浮点值的简单格式），其中 PFM 中的每个像素都是相应 25 x 25 window，加上你的零规则。在这个桌面上需要4s，最大81mb的内存。

我可以只使用一个线程来减少内存使用，但当然要慢很多：

$ VIPS_CONCURRENCY=1 /usr/bin/time -f %M:%e python3 sdev.py x.jpg x.pfm
54128:16.92

现在只有54mb的内存，但是17s的时间。

您也可以从 numpy 数组读取和写入图像，请参阅 chapter in the docs。

高效的复杂蒙版移动 window 分析

Efficient complex masked moving window analysis

python

arrays

moving-average

sliding-window