Keras Dense layer's input is not flattened
This is my test code:
from keras import layers
input1 = layers.Input((2,3))
output = layers.Dense(4)(input1)
print(output)
The output is:
<tf.Tensor 'dense_2/add:0' shape=(?, 2, 4) dtype=float32>
But what is happening here? The documentation says:
Note: if the input to the layer has a rank greater than 2, then it is
flattened prior to the initial dot product with kernel.
So why isn't the output flattened?
Currently, contrary to what the documentation states, the Dense layer is applied on the last axis of the input tensor:
Contrary to the documentation, we don't actually flatten it. It's
applied on the last axis independently.
In other words, if a Dense layer with m units is applied to an input tensor of shape (n_dim1, n_dim2, ..., n_dimk), its output will have shape (n_dim1, n_dim2, ..., m).
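The last-axis behavior can be sketched in plain NumPy (this mimics what the layer effectively computes; it is not the Keras implementation, and the shapes are taken from the example above):

```python
import numpy as np

# Shapes from the example: input (2, 3) plus a batch dim, Dense with 4 units.
batch, n_dim1, n_dim2, units = 1, 2, 3, 4

x = np.random.rand(batch, n_dim1, n_dim2)  # input tensor
kernel = np.random.rand(n_dim2, units)     # Dense kernel: (last input dim, units)
bias = np.zeros(units)

# Dense on the last axis: every (n_dim2,)-row of x is multiplied
# by the same kernel; nothing is flattened first.
y = x @ kernel + bias

print(y.shape)  # (1, 2, 4) -- only the last axis changes
```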
Side note: this makes TimeDistributed(Dense(...)) and Dense(...) equivalent to each other.
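The equivalence can be checked with a NumPy sketch (again a stand-in for Keras, using illustrative shapes): applying one kernel to the whole tensor via a last-axis matmul gives the same result as applying it to each timestep separately, which is what TimeDistributed(Dense(...)) does.

```python
import numpy as np

timesteps, features, units = 20, 5, 10
x = np.random.rand(timesteps, features)
kernel = np.random.rand(features, units)
bias = np.random.rand(units)

# Dense applied to the whole tensor at once (last-axis matmul) ...
dense_out = x @ kernel + bias

# ... equals applying the same Dense to each timestep independently,
# i.e. what TimeDistributed(Dense(...)) would do.
td_out = np.stack([row @ kernel + bias for row in x])

print(np.allclose(dense_out, td_out))  # True
```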
Another side note: notice that this has the effect of shared weights. For example, consider this toy network:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(10, input_shape=(20, 5)))
model.summary()
The model summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 20, 10) 60
=================================================================
Total params: 60
Trainable params: 60
Non-trainable params: 0
_________________________________________________________________
As you can see, the Dense layer has only 60 parameters. How? Each unit in the Dense layer is connected to the 5 elements of each row in the input with the same weights, therefore 10 * 5 + 10 (one bias per unit) = 60.
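The parameter count above follows directly from the shapes: the kernel only sees the last input dimension, so the other dimensions (the 20 here) contribute nothing. A quick arithmetic check:

```python
# Parameter count for Dense(units) on an input of shape (..., last_dim):
# the kernel holds last_dim * units weights, plus one bias per unit.
last_dim, units = 5, 10
n_params = last_dim * units + units

print(n_params)  # 60 -- independent of the remaining input dimensions
```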
Update. Here is a visual illustration of the example above: