Maxpooling 2x2 array only using numpy

Question

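I need help with max pooling using numpy. I am learning Python for data science, and I have to do max pooling and average pooling over 2x2 blocks of a matrix. The input can be 8x8 or larger, but I have to pool over each 2x2 block. I created a matrix using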

k = np.random.randint(1,64,64).reshape(8,8)

So I get a random 8x8 matrix as output. From that I want to do 2x2 max pooling. Thanks in advance.

Answer 1

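You can solve the convolution part using np.lib.stride_tricks, which is actually how numpy generates views from its methods under the hood. Be careful though: this is memory-level access to the numpy array.

Convolve the (8,8) matrix to get a (4,4) grid of (2,2) windows.
Reduce each (2,2) window with a pooling operation such as mean to get a (4,4) output.

This approach extends to larger matrices without any modification and can also be adapted to larger windows.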

import numpy as np

k = np.random.randint(1,64,64).reshape(8,8)

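# Pooling window size along each axis
x,y = 2,2

# A (4,4) grid of (2,2) windows
shape = k.shape[0]//x, k.shape[1]//y, x, y
# Step x rows / y columns between windows, one row / one column inside a window
strides = k.strides[0]*x, k.strides[1]*y, k.strides[0], k.strides[1]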

print('expected shape:',shape)
print('required strides:',strides)

convolve = np.lib.stride_tricks.as_strided(k, shape=shape, strides=strides)
print('convolution output shape:',convolve.shape)

# Mean pooling over each window, despite the variable name; use np.max here for max pooling
maxpool = np.mean(convolve, axis=(-1,-2))
print('maxpooled output shape:',maxpool.shape)


print(' ')
print('Input matrix:')
print(k)
print('--------')
print('Output matrix:')
print(maxpool)

expected shape: (4, 4, 2, 2)
required strides: (128, 16, 64, 8)
convolution output shape: (4, 4, 2, 2)
maxpooled output shape: (4, 4)
 
Input matrix:
[[19 32 28 25 31 49 17 18]
 [ 4 19 50 57 29 42  5  8]
 [44 16 54 13 15  1 58 50]
 [18 36 29 12 39 45 47 44]
 [34 31 17 28 35 62 30 54]
 [38 50 14 50 25 24 36  4]
 [58 27 20 34 55 22 63 59]
 [61 30 37 24 23 34  5 16]]
--------
Output matrix:
[[18.5  40.   37.75 12.  ]
 [28.5  27.   25.   49.75]
 [38.25 27.25 36.5  31.  ]
 [44.   28.75 33.5  35.75]]

To confirm: if you take just the first (2,2) window of the matrix and apply mean pooling to it, you get 18.5, which is the first value of the output matrix, as expected.

first_window = [[19,32],
                 [4,19]]

np.mean(first_window)

# 18.5
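Since the question asks for max pooling while the output above is the mean, here is a minimal sketch of the same strided view reduced with max instead. It reuses the input matrix printed above; the names windows and maxpooled, and the writeable=False flag, are additions for illustration rather than part of the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# The input matrix printed in the example output above
k = np.array([[19, 32, 28, 25, 31, 49, 17, 18],
              [ 4, 19, 50, 57, 29, 42,  5,  8],
              [44, 16, 54, 13, 15,  1, 58, 50],
              [18, 36, 29, 12, 39, 45, 47, 44],
              [34, 31, 17, 28, 35, 62, 30, 54],
              [38, 50, 14, 50, 25, 24, 36,  4],
              [58, 27, 20, 34, 55, 22, 63, 59],
              [61, 30, 37, 24, 23, 34,  5, 16]])

x, y = 2, 2
shape = (k.shape[0] // x, k.shape[1] // y, x, y)
strides = (k.strides[0] * x, k.strides[1] * y, k.strides[0], k.strides[1])

# Read-only view, so an accidental write cannot touch the original buffer
windows = as_strided(k, shape=shape, strides=strides, writeable=False)

# Reduce the two window axes with max instead of mean
maxpooled = windows.max(axis=(-1, -2))
print(maxpooled)
# [[32 57 49 18]
#  [44 54 45 58]
#  [50 50 62 54]
#  [61 37 55 63]]
```

The first entry, 32, is the largest value in the first window [[19, 32], [4, 19]].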

Explanation

NumPy stores its ndarrays as contiguous blocks of memory. Each element is stored sequentially, n bytes after the previous one.

So if your 3D array looks like this:

np.arange(0,16).reshape(2,2,4)

#array([[[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7]],
#
#       [[ 8,  9, 10, 11],
#        [12, 13, 14, 15]]])

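Then in memory it is stored as a single flat sequence, with the elements 0 through 15 one after another.

When retrieving an element (or a block of elements), NumPy calculates how many bytes it has to stride over (each element here takes 8 bytes) to reach the next element in that direction/axis. So, for the example above, for axis=2 it has to traverse 8 bytes (depending on the datatype), for axis=1 it has to traverse 8*4 bytes, and for axis=0 it has to traverse 8*8 bytes.

This is where arr.strides comes in. It shows the number of bytes to step over to reach the next element in each direction.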
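The byte counts above can be checked directly. This is a small sketch that fixes the dtype to int64 so that each element takes 8 bytes regardless of platform; the explicit dtype is an assumption added here.

```python
import numpy as np

# Fix the dtype so every element is exactly 8 bytes
arr = np.arange(0, 16, dtype=np.int64).reshape(2, 2, 4)

print(arr.itemsize)  # 8
print(arr.strides)   # (64, 32, 8): bytes to step along axis 0, 1 and 2

# The flat memory order is simply 0, 1, ..., 15
print(arr.ravel())
```

With a 4-byte dtype such as int32 the strides would be halved, which is why the answer computes them from k.strides instead of writing them out by hand.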

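For the case of the (8,8) matrix:

You want to convolve the 8x8 matrix in steps of (2,2) in each direction, which gives a matrix of shape (4,4,2,2). You then want to reduce the last 2 dimensions in the pooling step, here by averaging, to produce a (4,4) matrix.
shape is the expected shape you define, in this case (4,4,2,2).
The convolution has to access memory while taking steps of 2 in each direction (k.strides[0]*2 = 128 bytes and k.strides[1]*2 = 16 bytes) to reach the first element of each (2,2) window, and then a further (64,8) bytes to move within the window.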
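The stride arithmetic described above can be traced by hand for a single element. The sketch below uses an arange matrix (an assumption, chosen so that each value equals its flat index) and checks that window (i, j), position (a, b), lands on k[2*i + a, 2*j + b].

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Each value equals its own flat index, which makes offsets easy to read
k = np.arange(64, dtype=np.int64).reshape(8, 8)

x, y = 2, 2
shape = (k.shape[0] // x, k.shape[1] // y, x, y)
strides = (k.strides[0] * x, k.strides[1] * y, k.strides[0], k.strides[1])
print(strides)  # (128, 16, 64, 8)

windows = as_strided(k, shape=shape, strides=strides, writeable=False)

# Window (1, 2), element (1, 0) inside that window
i, j, a, b = 1, 2, 1, 0
byte_offset = i * strides[0] + j * strides[1] + a * strides[2] + b * strides[3]
flat_index = byte_offset // k.itemsize

print(byte_offset, flat_index)               # 224 28
print(windows[i, j, a, b], k[2*i + a, 2*j + b])  # 28 28
```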

NOTE: Try to NEVER hardcode the strides/shape in this function, as that can result in memory issues. Always calculate the expected strides and shape from the strides and shape of the original matrix.
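If computing strides by hand feels too risky, one alternative not used in the answer above is np.lib.stride_tricks.sliding_window_view, available in NumPy 1.20 and later. It derives shape and strides itself and returns a read-only view. The sketch below keeps every second window so that the windows do not overlap.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

k = np.arange(64).reshape(8, 8)

# All overlapping 2x2 windows have shape (7, 7, 2, 2);
# taking every second one along both axes leaves the non-overlapping blocks
windows = sliding_window_view(k, (2, 2))[::2, ::2]
print(windows.shape)  # (4, 4, 2, 2)

maxpooled = windows.max(axis=(-1, -2))
meanpooled = windows.mean(axis=(-1, -2))
```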

Hope this helps. Read more about stride_tricks here and here.

Answer 2

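You don't have to compute the necessary strides yourself. You can just inject two auxiliary dimensions to create a 4d array that is a 2d collection of 2x2 block matrices, then take the elementwise maximum over the blocks: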

import numpy as np

# use a non-square 6-by-4 array (3-by-2 blocks) to expose subtle indexing errors
arr = np.random.randint(1, 64, 6*4).reshape(6, 4)

m, n = arr.shape
pooled = arr.reshape(m//2, 2, n//2, 2).max((1, 3))

An example instance of the above:

>>> arr
array([[40, 24, 61, 60],
       [ 8, 11, 27,  5],
       [17, 41,  7, 41],
       [44,  5, 47, 13],
       [31, 53, 40, 36],
       [31, 23, 39, 26]])

>>> pooled
array([[40, 61],
       [44, 47],
       [53, 40]])
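To see why the reshape works, note that in the 4d array the element at index [i, a, j, b] is arr[2*i + a, 2*j + b], so axes 1 and 3 run over the positions inside each block. The sketch below reuses the example array above; the names blocks, block, max_pooled and mean_pooled are illustrative only.

```python
import numpy as np

# The example array shown above
arr = np.array([[40, 24, 61, 60],
                [ 8, 11, 27,  5],
                [17, 41,  7, 41],
                [44,  5, 47, 13],
                [31, 53, 40, 36],
                [31, 23, 39, 26]])

m, n = arr.shape
blocks = arr.reshape(m // 2, 2, n // 2, 2)

# blocks[i, :, j, :] is the 2x2 block at block-row i, block-column j
block = blocks[1, :, 0, :]
print(block)
# [[17 41]
#  [44  5]]

# Reducing over axes 1 and 3 pools each block
max_pooled = blocks.max(axis=(1, 3))
mean_pooled = blocks.mean(axis=(1, 3))
print(max_pooled[1, 0])   # 44
print(mean_pooled[0, 0])  # 20.75
```

The same reshaped array serves both pooling variants from the question; only the reduction changes.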

For fully general block pooling that doesn't assume 2-by-2 blocks:

import numpy as np

# again use coprime dimensions for debugging safety
block_size = (2, 3)
num_blocks = (7, 5)
arr_shape = np.array(block_size) * np.array(num_blocks)
numel = arr_shape.prod()
arr = np.random.randint(1, numel, numel).reshape(arr_shape)

m, n = arr.shape  # pretend we only have this
pooled = arr.reshape(m//block_size[0], block_size[0],
                     n//block_size[1], block_size[1]).max((1, 3))
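The general version can be wrapped in a small helper. This is a sketch under stated assumptions: the name block_pool, its reduce parameter, and the choice to crop trailing rows and columns that do not fill a whole block are all hypothetical additions, not part of the answer above.

```python
import numpy as np

def block_pool(arr, block_size, reduce=np.max):
    """Pool a 2d array over non-overlapping blocks of shape block_size."""
    bh, bw = block_size
    m, n = arr.shape
    # Assumption: drop leftover rows/columns that do not fill a whole block
    cropped = arr[: m - m % bh, : n - n % bw]
    blocks = cropped.reshape(m // bh, bh, n // bw, bw)
    # Axes 1 and 3 index positions inside each block
    return reduce(blocks, axis=(1, 3))

# 5x7 input with 2x3 blocks: the last row and last column are cropped
arr = np.arange(1, 36).reshape(5, 7)

print(block_pool(arr, (2, 3)))
# [[10 13]
#  [24 27]]

print(block_pool(arr, (2, 3), reduce=np.mean))
# [[ 5.5  8.5]
#  [19.5 22.5]]
```

Padding instead of cropping would be another reasonable choice; which one fits depends on how the edges should be treated.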
