MNIST architecture: output size of the first fully connected layer

There is one part of this explanation that I don't understand (from Quora: "How does the last layer of a ConvNet connects to the first fully connected layer"):

Make an one hot representation of feature maps. So we would have 64 * 7 * 7 = 3136 input features which is again processed by a 3136 neurons reducing it to 1024 features. The matrix multiplication this layer would be (1x3136) * (3136x1024) => 1x1024

What I mean is: what is the process by which 3136 inputs are reduced to 1024 features using 3136 neurons?

I'll explain my understanding in plain terms.

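A one-hot representation of the feature maps is a way of representing categorical values as a matrix of 1s and 0s. This is how the machine reads and processes the data (in your example, an image or photo). It then uses matrix algebra to do the computation.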

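The computation itself multiplies a matrix of binary values (1 or 0) with 1 row and 3136 columns by another matrix with 3136 rows and 1024 columns. When you multiply these two matrices, the result has 1 row and 1024 columns. That is the matrix of 1s and 0s that represents your image or picture.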
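A minimal NumPy sketch of this step may make the shapes easier to follow. It assumes one image whose last convolution/pooling stage produced 64 feature maps of size 7x7. The names `feature_maps`, `W` and `b` and the random values are placeholders for illustration, not taken from the Quora answer. Note that the sketch uses real-valued activations rather than strict 0/1 values, and that the reduction to 1024 comes from the weight matrix having 1024 columns: each output feature is a weighted sum of all 3136 inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical output of the last conv/pool stage for one image:
# 64 feature maps, each 7x7 (placeholder real-valued activations).
feature_maps = rng.standard_normal((64, 7, 7))

# Flatten the feature maps into one row vector of 64 * 7 * 7 = 3136 inputs.
x = feature_maps.reshape(1, -1)

# Fully connected layer parameters (random placeholders, not trained values).
# Each of the 1024 columns holds the 3136 weights of one output unit.
W = rng.standard_normal((3136, 1024)) * 0.01
b = np.zeros(1024)

# (1 x 3136) @ (3136 x 1024) -> (1 x 1024)
y = x @ W + b

print(x.shape)  # (1, 3136)
print(y.shape)  # (1, 1024)
```

In a real network an activation function such as ReLU would typically be applied to `y` afterwards; that is left out here to keep the focus on the shapes.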

I hope that answers your question.

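You need to understand matrix multiplication. (1x3136) * (3136x1024) is an example of a matrix product: the number of columns of the first factor (1x3136) must equal the number of rows of the second factor (3136x1024). The result is (1x1024), because the rows of the first factor become the rows of the result and the columns of the second factor become the columns of the result.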
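The shape rule described above can be written down as a small check. This is only an illustrative sketch in plain Python; `matmul_shape` is a hypothetical helper name, not a library function.

```python
def matmul_shape(a_shape, b_shape):
    """Return the shape of A * B, or raise if the inner dimensions differ."""
    a_rows, a_cols = a_shape
    b_rows, b_cols = b_shape
    # Columns of the first factor must equal rows of the second factor.
    if a_cols != b_rows:
        raise ValueError(
            f"cannot multiply {a_shape} by {b_shape}: {a_cols} != {b_rows}"
        )
    # Rows come from the first factor, columns from the second.
    return (a_rows, b_cols)


print(matmul_shape((1, 3136), (3136, 1024)))  # (1, 1024)
```

Swapping the two factors, for example `matmul_shape((3136, 1024), (1, 3136))`, raises a `ValueError` because 1024 does not equal 1.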

Also, check this out:

https://www.khanacademy.org/math/precalculus/precalc-matrices/multiplying-matrices-by-matrices/v/multiplying-a-matrix-by-a-matrix

machine-learning

deep-learning

conv-neural-network

convolutional-neural-network