Is a dropout layer still active in a frozen Keras model (i.e. trainable=False)?

Question

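I have two trained models (model_A and model_B), and both of them contain dropout layers. I froze model_A and model_B and merged them with a new dense layer to get model_AB (but I did not remove the dropout layers from model_A and model_B). None of the weights of model_AB are trainable except those of the added dense layer.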

Now my question is: when I train model_AB, are the dropout layers in model_A and model_B active (i.e. do they drop neurons)?

Answer 1

Short answer: the dropout layers will continue to drop neurons during training, even if you set their trainable attribute to False.

Long answer: there are two distinct concepts in Keras:

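Updating the weights and states of a layer: this is controlled by the layer's trainable attribute, i.e. if you set layer.trainable = False, the layer's weights and internal states will not be updated.
The behavior of a layer in the training and test phases: as you know, a layer such as dropout may behave differently in the training and test phases. The learning phase in Keras is set using keras.backend.set_learning_phase(). For example, when you call model.fit(...) the learning phase is automatically set to 1 (i.e. training), whereas when you call model.predict(...) it is automatically set to 0 (i.e. test). Note also that a learning phase of 1 does not necessarily imply updating the layers' weights or states: you can run a model with the learning phase set to 1 without any weights being updated, in which case the layers simply switch to their training behavior. Further, the learning phase can be set for an individual layer by passing the training=True argument when calling that layer on a tensor.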

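Therefore, based on the points above, when you set trainable=False on a dropout layer and use it in training mode (e.g. by calling model.fit(...), or by manually setting the learning phase to training as in the example below), neurons will still be dropped by the dropout layer.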

Here is a reproducible example that illustrates this:

from keras import layers
from keras import models
from keras import backend as K
import numpy as np

inp = layers.Input(shape=(10,))
out = layers.Dropout(0.5)(inp)

model = models.Model(inp, out)
model.layers[-1].trainable = False  # set dropout layer as non-trainable
model.compile(optimizer='adam', loss='mse') # IMPORTANT: we must always compile model after changing `trainable` attribute

# create a custom backend function so that we can control the learning phase
func = K.function(model.inputs + [K.learning_phase()], model.outputs)

x = np.ones((1,10))
# learning phase = 1, i.e. training mode
print(func([x, 1]))
# example output (which positions are zeroed varies from run to run):
# [array([[2., 2., 2., 0., 0., 2., 2., 2., 0., 0.]], dtype=float32)]
# as you can see, some of the neurons have been dropped

# now set learning phase = 0, i.e. test mode
print(func([x, 0]))
# the output will be:
# [array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]], dtype=float32)]
# unsurprisingly, no neurons have been dropped in test phase
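
The example above relies on K.learning_phase(), which may not be available in newer Keras releases. As a hedged alternative, the same check can be sketched by passing the training argument directly when calling the layer. This assumes a Keras version in which Dropout accepts NumPy input and a training keyword; the input width of 1000 is an arbitrary choice so that at least one unit is all but guaranteed to be dropped.

```python
import numpy as np
from keras import layers

# A Dropout layer that is explicitly marked as non-trainable.
drop = layers.Dropout(0.5)
drop.trainable = False

# 1000 units (an arbitrary size chosen for this sketch).
x = np.ones((1, 1000), dtype="float32")

# training=True switches the layer to its training behavior,
# regardless of the `trainable` attribute.
train_out = np.asarray(drop(x, training=True))

# training=False gives the inference behavior: the input passes through.
test_out = np.asarray(drop(x, training=False))

print("zeros in training mode: ", int((train_out == 0).sum()))
print("zeros in inference mode:", int((test_out == 0).sum()))
```

If the layer behaves as described in this answer, the first count is nonzero and the second is zero, even though trainable is False.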

Answer 2

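The Dropout layer randomly sets input units to 0 with a frequency of rate at each step during training, which helps prevent overfitting. Inputs not set to 0 are scaled up by 1/(1 - rate) so that the expected sum over all inputs is unchanged.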
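
To make the 1/(1 - rate) scaling concrete, here is a minimal NumPy sketch of the same idea. It is not Keras's actual implementation; the function name inverted_dropout and the use of np.random.default_rng are assumptions made for illustration only.

```python
import numpy as np

def inverted_dropout(x, rate, rng):
    """Zero each element with probability `rate`; scale survivors by 1/(1 - rate)."""
    keep_mask = rng.random(x.shape) >= rate
    return np.where(keep_mask, x / (1.0 - rate), 0.0)

rng = np.random.default_rng(0)
x = np.ones(100_000)
y = inverted_dropout(x, rate=0.5, rng=rng)

# Surviving units are scaled from 1.0 to 2.0; dropped units become 0.0.
print(np.unique(y))
# The mean stays close to the input mean of 1.0 because of the rescaling.
print(y.mean())
```

With rate=0.5 the surviving units are doubled, which is why the outputs in Answer 1 contain 2s rather than 1s.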

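Note that the Dropout layer only applies when training is set to True, so that no values are dropped during inference. When using model.fit, training is automatically set to True; in other contexts, you can explicitly set the kwarg to True when calling the layer.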

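(This is in contrast to setting trainable=False for a Dropout layer. trainable does not affect the layer's behavior, as Dropout does not have any variables/weights that can be frozen during training.)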
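
A quick way to check the point about weights is to compare a Dropout layer with a Dense layer. The sketch below assumes that build() can be called directly with an input shape and that trainable_weights lists only the variables an optimizer would update; the layer sizes are arbitrary.

```python
from keras import layers

drop = layers.Dropout(0.5)
dense = layers.Dense(4)

# Build both layers for a 10-feature input so any weights get created.
drop.build((None, 10))
dense.build((None, 10))

print(len(drop.trainable_weights))   # 0: nothing for trainable=False to freeze
print(len(dense.trainable_weights))  # 2: kernel and bias

# Freezing the Dense layer moves its weights out of the trainable set.
dense.trainable = False
print(len(dense.trainable_weights))  # 0
```

Freezing changes what the optimizer updates for Dense, while for Dropout there is nothing to freeze in the first place.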

See the official Keras documentation for the Dropout layer.
