Keras Dense layer's input is not flattened
This is my test code:
from keras import layers
input1 = layers.Input((2,3))
output = layers.Dense(4)(input1)
print(output)
The output is:
<tf.Tensor 'dense_2/add:0' shape=(?, 2, 4) dtype=float32>
But what is happening here? The documentation says:
Note: if the input to the layer has a rank greater than 2, then it is
flattened prior to the initial dot product with kernel.
So why isn't the output flattened?
Currently, contrary to what the documentation states, the Dense layer is applied on the last axis of the input tensor:
Contrary to the documentation, we don't actually flatten it. It's
applied on the last axis independently.
In other words, if a Dense layer with m units is applied to an input tensor of shape (n_dim1, n_dim2, ..., n_dimk), its output will have shape (n_dim1, n_dim2, ..., m).
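The last-axis behavior can be sketched in plain NumPy (this mimics what the layer effectively computes; it is not the Keras implementation, and the shapes are taken from the example above):

```python
import numpy as np

# Shapes from the example: input (2, 3) plus a batch dim, Dense with 4 units.
batch, n_dim1, n_dim2, units = 1, 2, 3, 4

x = np.random.rand(batch, n_dim1, n_dim2)  # input tensor
kernel = np.random.rand(n_dim2, units)     # Dense kernel: (last input dim, units)
bias = np.zeros(units)

# Dense on the last axis: every (n_dim2,)-row of x is multiplied
# by the same kernel; nothing is flattened first.
y = x @ kernel + bias

print(y.shape)  # (1, 2, 4) -- only the last axis changes
```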
Side note: this makes TimeDistributed(Dense(...)) and Dense(...) equivalent to each other.
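The equivalence can be checked with a NumPy sketch (again a stand-in for Keras, using illustrative shapes): applying one kernel to the whole tensor via a last-axis matmul gives the same result as applying it to each timestep separately, which is what TimeDistributed(Dense(...)) does.

```python
import numpy as np

timesteps, features, units = 20, 5, 10
x = np.random.rand(timesteps, features)
kernel = np.random.rand(features, units)
bias = np.random.rand(units)

# Dense applied to the whole tensor at once (last-axis matmul) ...
dense_out = x @ kernel + bias

# ... equals applying the same Dense to each timestep independently,
# i.e. what TimeDistributed(Dense(...)) would do.
td_out = np.stack([row @ kernel + bias for row in x])

print(np.allclose(dense_out, td_out))  # True
```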
Another side note: notice that this has the effect of shared weights. For example, consider this toy network:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(10, input_shape=(20, 5)))
model.summary()
The model summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 20, 10) 60
=================================================================
Total params: 60
Trainable params: 60
Non-trainable params: 0
_________________________________________________________________
As you can see, the Dense layer has only 60 parameters. How? Each unit in the Dense layer is connected to the 5 elements of each row in the input with the same weights, therefore 10 * 5 + 10 (one bias per unit) = 60.
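The parameter count above follows directly from the shapes: the kernel only sees the last input dimension, so the other dimensions (the 20 here) contribute nothing. A quick arithmetic check:

```python
# Parameter count for Dense(units) on an input of shape (..., last_dim):
# the kernel holds last_dim * units weights, plus one bias per unit.
last_dim, units = 5, 10
n_params = last_dim * units + units

print(n_params)  # 60 -- independent of the remaining input dimensions
```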
Update. Here is a visual illustration of the example above: