Keras Dense layer's input is not flattened

This is my test code:

from keras import layers
input1 = layers.Input((2,3))
output = layers.Dense(4)(input1)
print(output)

The output is:

<tf.Tensor 'dense_2/add:0' shape=(?, 2, 4) dtype=float32>

But what is happening here?

The documentation says:

Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with kernel.

Is the output reshaped back afterwards?

Currently, contrary to what the documentation states, the Dense layer is applied on the last axis of the input tensor:

Contrary to the documentation, we don't actually flatten it. It's applied on the last axis independently.

In other words, if a Dense layer with m units is applied to an input tensor of shape (n_dim1, n_dim2, ..., n_dimk), the output tensor has shape (n_dim1, n_dim2, ..., n_dim(k-1), m).
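This behavior can be sketched in pure NumPy (not Keras itself, just an illustration): the same kernel of shape (last_dim, m) is multiplied against the last axis, and all leading axes pass through unchanged, which reproduces the `(?, 2, 4)` shape from the question.

```python
import numpy as np

# Illustrative stand-in for Dense(4) applied to an input of shape (2, 3):
# the kernel only covers the last axis, and matmul broadcasts over the rest.
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 2, 3))    # batch of 5, per-sample shape (2, 3)
kernel = rng.standard_normal((3, 4))  # maps last dim 3 -> 4 units
bias = rng.standard_normal(4)

y = x @ kernel + bias                 # applied independently along the last axis
print(y.shape)                        # (5, 2, 4) -- matches shape=(?, 2, 4)
```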


Side note: this makes TimeDistributed(Dense(...)) and Dense(...) equivalent to each other.
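A quick NumPy check of that equivalence (illustrative only, not the Keras implementation): applying the same kernel to each timestep in a loop, which is what TimeDistributed conceptually does, gives exactly the same result as one broadcasted matmul over the last axis.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((2, 20, 5))   # (batch, timesteps, features)
kernel = rng.standard_normal((5, 10))
bias = rng.standard_normal(10)

# Dense(...): one matmul over the last axis, broadcast over timesteps.
dense_all = x @ kernel + bias

# TimeDistributed(Dense(...)): apply the same kernel to each timestep.
per_step = np.stack(
    [x[:, t] @ kernel + bias for t in range(x.shape[1])], axis=1
)

print(np.allclose(dense_all, per_step))  # True
```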


Another side note: be aware that this has the effect of sharing weights. For example, consider this toy network:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(10, input_shape=(20, 5)))

model.summary()

The model summary:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 20, 10)            60        
=================================================================
Total params: 60
Trainable params: 60
Non-trainable params: 0
_________________________________________________________________

As you can see, the Dense layer has only 60 parameters. How? Each unit in the Dense layer is connected to the 5 elements of each row in the input with the same weights, therefore 10 * 5 + 10 (one bias per unit) = 60.
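That arithmetic can be written as a small helper (a hypothetical function, not part of Keras) that reproduces the count reported by model.summary(): the kernel covers only the last input axis, plus one bias per unit.

```python
# Hypothetical helper mirroring how Keras counts Dense parameters:
# weights span only the last input dimension, plus one bias per unit.
def dense_param_count(last_input_dim, units):
    return units * last_input_dim + units

# Input shape (20, 5) with Dense(10): the 20 rows share the same weights,
# so only the last axis (5) enters the count.
print(dense_param_count(5, 10))  # 10 * 5 + 10 = 60
```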


Update: here is a visual illustration of the example above: