Flatten operation between Dense layers
In a Keras implementation, I once saw the last two fully connected layers defined as follows:
outX = Dense(300, activation='relu')(outX)
outX = Flatten()(outX)
predictions = Dense(1,activation='linear')(outX)
There is a Flatten layer between the two Dense layers. Why add a Flatten operation between two fully connected layers? Is it always required?
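Short answer: A Flatten layer has no parameters of its own to learn. However, adding a Flatten layer can increase the model's number of learnable parameters, because the Dense layer that follows it receives a larger input vector.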
Example: try to spot the difference between these two models:
1) Without Flatten:
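from keras.layers import Input, Dense, Flatten
from keras.models import Model

inp = Input(shape=(20,10,))
A = Dense(300, activation='relu')(inp)
# Flatten is deliberately left out in this variant
#A = Flatten()(A)
A = Dense(1, activation='relu')(A)
m = Model(inputs=inp,outputs=A)
m.summary()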
Output:
Layer (type)             Output Shape        Param #
input_9 (InputLayer)     (None, 20, 10)      0
dense_20 (Dense)         (None, 20, 300)     3300
dense_21 (Dense)         (None, 20, 1)       301
Total params: 3,601
Trainable params: 3,601
Non-trainable params: 0
2) With Flatten:
inp = Input(shape=(20,10,))
A = Dense(300, activation='relu')(inp)
A = Flatten()(A)
A = Dense(1, activation='relu')(A)
m = Model(inputs=inp,outputs=A)
m.summary()
Output:
Layer (type)             Output Shape        Param #
input_10 (InputLayer)    (None, 20, 10)      0
dense_22 (Dense)         (None, 20, 300)     3300
flatten_9 (Flatten)      (None, 6000)        0
dense_23 (Dense)         (None, 1)           6001
Total params: 9,301
Trainable params: 9,301
Non-trainable params: 0
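The two totals above can be reproduced by hand. A Dense layer acts only on the last axis of its input, so its parameter count is last_dim * units + units, while Flatten merges all non-batch axes into one and has no weights. The following is a minimal sketch of that arithmetic in plain Python, not Keras code; the helper names dense, flatten and count_params are hypothetical, and the batch axis (None) is left out of the shapes.

```python
from math import prod

def dense(input_shape, units):
    """Return (output_shape, param_count) for a Dense layer.

    Dense acts only on the last axis: the kernel has shape
    (last_dim, units) and the bias has shape (units,).
    """
    last_dim = input_shape[-1]
    output_shape = input_shape[:-1] + (units,)
    params = last_dim * units + units
    return output_shape, params

def flatten(input_shape):
    """Collapse all non-batch axes into one; Flatten has no weights."""
    return (prod(input_shape),), 0

def count_params(input_shape, use_flatten):
    """Dense(300) -> optional Flatten -> Dense(1), as in the two models."""
    shape, total = input_shape, 0
    shape, p = dense(shape, 300)
    total += p
    if use_flatten:
        shape, p = flatten(shape)
        total += p
    shape, p = dense(shape, 1)
    total += p
    return shape, total

without_flatten = count_params((20, 10), use_flatten=False)
with_flatten = count_params((20, 10), use_flatten=True)
print(without_flatten)  # ((20, 1), 3601)
print(with_flatten)     # ((1,), 9301)
```

Without Flatten, the last Dense layer reuses the same 301 weights at each of the 20 positions and produces 20 outputs per sample. With Flatten, it sees all 6000 values at once and produces a single output per sample, which is usually what a final prediction layer needs.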
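In the end, whether to add a Flatten layer depends on the data at hand. Having more parameters to learn may give a more accurate model, or it may lead to overfitting. So one answer would be: "apply both, choose best".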