Flattening image matrices for Deep Learning

Question

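I have a question about flattening an image array, in this case 64 x 64 pixels x 3 channels, into a vector of shape (12288 x 1).

I understand that the pixels of each image sit in a (64 x 64) matrix and, if I am right, each element of that matrix is a vector of length 3 holding the R, G, B values of that single pixel. So the first row below is the R, G, B values of the top-left pixel: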

train_set[0]
>> array([[[17, 31, 56],
        [22, 33, 59],
        [25, 35, 62],

My question starts here:

When we flatten the image data (a dataset of 100 samples) with the following code:

train_set_flatten = train_set.reshape(train_set.shape[0], -1).T

the first 3 elements of the first column of train_set_flatten are the R, G, B values of the first pixel of the first image:

train_set_flatten[:,0][0:10]
array([17, 31, 56, 22, 33, 59, 25, 35, 62, 25], dtype=uint8)

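However, some textbooks assume that all elements of the "R matrix" are listed first, followed by "G", then "B", and my vector is not in that order. Is my vector correct, or do I need to find another way to flatten the matrix?

See the explanation in Neural Networks and Deep Learning by DeepLearning.AI on coursera.org.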

Answer 1

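I think it depends on your model design. If your model takes three separate arrays as input, one per channel (R, G, B), you can try the approach below. We need to split the channels first and then reshape each one.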

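import numpy as np

# Three pixels, one per row; the columns are R, G, B
a = np.array([[17, 31, 56],
              [22, 33, 59],
              [25, 35, 62]])

# Take one column per channel
R = a[:, 0]
G = a[:, 1]
B = a[:, 2]
# Turn each channel into a row vector of shape (1, 3)
R = R.reshape(R.shape[0], -1).T
G = G.reshape(G.shape[0], -1).T
B = B.reshape(B.shape[0], -1).T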

print(R)
print(G)
print(B)
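
If the goal is the "all R, then all G, then all B" layout for a whole batch at once, one possible approach is to move the channel axis ahead of the spatial axes before flattening. This is only a sketch and not part of the answer above: the array is synthetic, and the (m, H, W, 3) layout of train_set is assumed from the question.

```python
import numpy as np

# Synthetic stand-in for the dataset in the question:
# 2 images, 4 x 4 pixels, 3 channels (the real data would be 100 x 64 x 64 x 3).
rng = np.random.default_rng(0)
train_set = rng.integers(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)
m = train_set.shape[0]

# Move channels in front of height and width: (m, H, W, C) -> (m, C, H, W),
# then flatten each image and transpose so that each column is one image.
planar = train_set.transpose(0, 3, 1, 2).reshape(m, -1).T

# The first H*W entries of column 0 should be the red plane of image 0, row by row.
red_plane = train_set[0, :, :, 0].ravel()
print(planar.shape)
print(np.array_equal(planar[:16, 0], red_plane))
```

With this layout each column holds one image as three monochrome planes back to back, each plane read row by row.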

Answer 2

I found the answer, quoted from https://community.deeplearning.ai/, by Paul Mielke.

In the following line of code:

train_set.reshape(train_set.shape[0], -1).T

we can add order='F' or order='C':

train_set.reshape(train_set.shape[0], -1, order='F').T   # or order='C', which is NumPy's default

The way to think about the difference between “C” and “F” order for an image is to remember that the highest dimension is the RGB color dimension. So what that means is that with “C” order you get all three colors for each pixel together. With “F” order, what you get is all the Red pixel values in order across and down the image, followed by all the Green pixels, followed by all the Blue pixels. So it’s like three separate monochrome images back to back. It’s worth trying the experiment of using “F” order on all your reshapes and then running the training and confirming that you get the same accuracy results. In other words (as I said in my previous post), the algorithm can learn the patterns either way. It just matters that you are consistent in how you do the unrolling. (Paul Mielke)
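
The difference described in the quote can be checked on a tiny array. The sketch below uses a made-up 2 x 2 "image" whose values encode the channel in the tens digit (1x = R, 2x = G, 3x = B); the names img, flat_c and flat_f are illustrative only.

```python
import numpy as np

# One tiny 2 x 2 image; each pixel is [R, G, B].
img = np.array([[[[10, 20, 30], [11, 21, 31]],
                 [[12, 22, 32], [13, 23, 33]]]])  # shape (1, 2, 2, 3)
m = img.shape[0]

flat_c = img.reshape(m, -1, order='C').T  # pixel by pixel: R, G, B, R, G, B, ...
flat_f = img.reshape(m, -1, order='F').T  # channel by channel: all R, all G, all B

print(flat_c[:, 0])  # [10 20 30 11 21 31 12 22 32 13 23 33]
print(flat_f[:, 0])  # [10 12 11 13 20 22 21 23 30 32 31 33]
```

Note that within each channel 'F' order walks down each column of the image before moving across (10, 12, 11, 13), rather than across each row. Both vectors contain the same values; only their positions differ.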

I trained a model with order='F' and with order='C', and the results were the same.
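
One way to see why the results can match: when the first operation on the flattened vector is a linear map, as in logistic regression, switching the order only permutes the features, and weights permuted the same way give an identical output. The sketch below uses random stand-in data and hypothetical weights (w_c, w_f); it is not the course's model.

```python
import numpy as np

rng = np.random.default_rng(1)
images = rng.integers(0, 256, size=(5, 4, 4, 3)).astype(np.float64)
m = images.shape[0]

x_c = images.reshape(m, -1, order='C').T  # shape (48, 5), interleaved RGB
x_f = images.reshape(m, -1, order='F').T  # shape (48, 5), channel planes

# Recover the permutation between the two layouts by flattening an array
# of C-order positions in F order: perm[n] is the C-order position of the
# element that ends up at F-order position n.
idx = np.arange(4 * 4 * 3).reshape(1, 4, 4, 3)
perm = idx.reshape(1, -1, order='F').ravel()

w_c = rng.normal(size=(48, 1))  # hypothetical weights for the C-order layout
w_f = w_c[perm]                 # the same weights, reordered for the F-order layout

print(np.array_equal(x_f, x_c[perm]))
print(np.allclose(w_c.T @ x_c, w_f.T @ x_f))
```

Training from scratch on either layout can therefore reach an equivalent solution, as long as the same order is used for the training and test data.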

