Keras Sequential Dense input layer - and MNIST: why do images need to be reshaped?

Question

I'm asking this because I feel I'm missing something fundamental.

Most people know by now that MNIST images are 28x28 pixels. The Keras documentation tells me this about Dense:

Input shape nD tensor with shape: (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).

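So a newbie like me would assume that the images could be fed to the model as 28*28 matrices. Yet every tutorial I found goes through various gymnastics to convert each image into a single 784-long feature vector.

Sometimes by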

num_pixels = X_train.shape[1] * X_train.shape[2]
model.add(Dense(num_pixels, input_dim=num_pixels, activation='...'))

or

num_pixels = np.prod(X_train.shape[1:])
model.add(Dense(512, activation='...', input_shape=(num_pixels,)))

or

model.add(Dense(units=10, input_dim=28*28, activation='...'))
history = model.fit(X_train.reshape((-1,28*28)), ...)

or even:

model = Sequential([Dense(32, input_shape=(784,)), ...])

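So my question is simple: why? Can't Dense just accept an image as-is or, if necessary, just process it "behind the scenes", as it were? And if, as I suspect, this processing has to be done, is any of these methods (or some other one) inherently preferable?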

Answer 1

As requested by the OP (i.e. the original poster), I will repeat the answer I gave in my comment and elaborate on it.

Can't Dense just accept an image as-is or, if necessary, just process it "behind the scenes", as it were?

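Not at all! That's because the Dense layer is applied on the last axis of its input. So if you feed it an image of shape (height, width) or (height, width, channels), the Dense layer is applied only on the last axis (i.e. width or channels). When the image is flattened, however, every unit in the Dense layer is applied to the whole image, and each unit is connected to all the pixels with its own set of weights. To clarify this further, consider this model: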

from keras import layers, models

model = models.Sequential()
model.add(layers.Dense(10, input_shape=(28*28,)))  # flattened 784-pixel input
model.summary()

Model summary:

Layer (type)                 Output Shape              Param #   
=================================================================
dense_2 (Dense)              (None, 10)                7850      
=================================================================
Total params: 7,850
Trainable params: 7,850
Non-trainable params: 0
_________________________________________________________________

As you can see, there are 7850 parameters in the Dense layer: each unit is connected to all the pixels (28*28*10 weights + 10 bias params = 7850). Now consider this model:

model = models.Sequential()
model.add(layers.Dense(10, input_shape=(28,28)))
model.summary()

Model summary:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_3 (Dense)              (None, 28, 10)            290       
=================================================================
Total params: 290
Trainable params: 290
Non-trainable params: 0
_________________________________________________________________

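In this case there are only 290 parameters in the Dense layer. Each unit is still applied to every pixel, but the same 28 weights are reused for each of the 28 rows, i.e. the weights are shared across the first axis (28*10 weights + 10 bias params = 290). It is as though features are extracted from each row of the image separately, whereas the previous model extracts features from the image as a whole. Therefore this (i.e. weight sharing) may or may not be useful for your application.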
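The two parameter counts and output shapes in the summaries can be reproduced with plain arithmetic. What follows is only a minimal sketch, assuming a Dense layer whose kernel has shape (last_input_dim, units) plus one bias per unit; the helper names `dense_param_count` and `dense_output_shape` are made up for illustration and are not part of the Keras API. Shapes are written without the batch dimension.

```python
from math import prod


def dense_param_count(input_shape, units, use_bias=True):
    """Kernel is (last_dim, units); the bias adds one parameter per unit."""
    last_dim = input_shape[-1]
    return last_dim * units + (units if use_bias else 0)


def dense_output_shape(input_shape, units):
    """Only the last axis is replaced; leading axes pass through unchanged."""
    return tuple(input_shape[:-1]) + (units,)


flat_shape = (prod((28, 28)),)  # flattened image: (784,)
image_shape = (28, 28)          # unflattened image: 28 rows of 28 pixels

# Flattened input: every unit gets its own weight for each of the 784 pixels.
print(dense_param_count(flat_shape, 10), dense_output_shape(flat_shape, 10))

# 2D input: the same 28 weights per unit are reused for each row.
print(dense_param_count(image_shape, 10), dense_output_shape(image_shape, 10))
```

Under these assumptions the first call matches the 7850-parameter model and the second matches the 290-parameter model, which is why the unflattened input yields an output of shape (28, 10) per image rather than a single vector of 10 values.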


python

machine-learning

mnist

keras

keras-layer