Adding Dropout to the testing/inference phase
I've trained the following model in Keras on some time-series data:
from keras.layers import Input, Dense, Dropout
from keras.models import Model

input_layer = Input(batch_shape=(56, 3864))
first_layer = Dense(24, input_dim=28, activation='relu',
                    activity_regularizer=None,
                    kernel_regularizer=None)(input_layer)
first_layer = Dropout(0.3)(first_layer)
second_layer = Dense(12, activation='relu')(first_layer)
second_layer = Dropout(0.3)(second_layer)
out = Dense(56)(second_layer)
model_1 = Model(input_layer, out)
Then I defined a new model from the trained layers of model_1, adding extra dropout layers with a different rate, drp:
input_2 = Input(batch_shape=(56, 3864))
first_dense_layer = model_1.layers[1](input_2)
first_dropout_layer = model_1.layers[2](first_dense_layer)
new_dropout = Dropout(drp)(first_dropout_layer)
snd_dense_layer = model_1.layers[3](new_dropout)
snd_dropout_layer = model_1.layers[4](snd_dense_layer)
new_dropout_2 = Dropout(drp)(snd_dropout_layer)
output = model_1.layers[5](new_dropout_2)
model_2 = Model(input_2, output)
Then I got predictions from both models as follows:
result_1 = model_1.predict(test_data, batch_size=56)
result_2 = model_2.predict(test_data, batch_size=56)
I expected completely different results, since the second model has new dropout layers and the two models are different (IMO), but that's not the case: both produce exactly the same results. Why is this happening?
As I mentioned in the comments, the Dropout layer is turned off in the inference phase (i.e. test mode), so when you use model.predict() the Dropout layers are not active. However, if you want a model that uses Dropout in both the training and inference phases, you can pass the training argument when calling the layer, as suggested by François Chollet:
# ...
new_dropout = Dropout(drp)(first_dropout_layer, training=True)
# ...
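To see why the extra Dropout layers change nothing under model.predict(), recall how inverted dropout works: in training mode each unit is zeroed with probability rate and the survivors are scaled by 1/(1 - rate); in inference mode the layer is the identity. A minimal pure-Python sketch (the dropout function here is illustrative, not the actual Keras implementation):

```python
import random

def dropout(x, rate, training=False):
    # Inverted dropout: in inference mode (training=False) the layer is
    # the identity -- which is exactly why stacking extra Dropout layers
    # leaves model.predict() output unchanged.
    if not training:
        return list(x)
    keep = 1.0 - rate
    # Zero each unit with probability `rate`; scale survivors by 1/keep
    # so the expected activation is unchanged.
    return [0.0 if random.random() < rate else v / keep for v in x]

x = [1.0, 2.0, 3.0, 4.0]
# Inference mode (what model.predict() uses): output equals input.
print(dropout(x, 0.3) == x)  # True
```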
Alternatively, if you have already trained a model and now want to use it in inference mode while keeping the Dropout layers (and possibly other layers that behave differently in the training/inference phases, such as BatchNormalization) active, you can define a backend function that takes the model's inputs plus the Keras learning phase:
from keras import backend as K
func = K.function(model.inputs + [K.learning_phase()], model.outputs)
# to use it pass 1 to set the learning phase to training mode
outputs = func([input_arrays] + [1.])
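One common reason to keep Dropout active at inference is Monte Carlo dropout: call the backend function several times with the learning phase set to 1, then average the stochastic predictions and use their spread as a rough uncertainty estimate. A toy sketch of the averaging step, where stochastic_predict stands in for a call like func([x] + [1.]) (the names and the toy "model" are illustrative, not Keras API):

```python
import random
import statistics

def stochastic_predict(x, rate=0.3):
    # Hypothetical stand-in for one forward pass with dropout active
    # (learning phase = 1): inverted dropout followed by a toy "model"
    # that just sums the surviving activations.
    keep = 1.0 - rate
    masked = [0.0 if random.random() < rate else v / keep for v in x]
    return sum(masked)

def mc_dropout(x, n_samples=100):
    # Average several stochastic passes; the standard deviation is a
    # rough uncertainty estimate for the prediction.
    preds = [stochastic_predict(x) for _ in range(n_samples)]
    return statistics.mean(preds), statistics.stdev(preds)

random.seed(42)
mean, std = mc_dropout([1.0, 2.0, 3.0, 4.0])
# Inverted dropout preserves the expectation, so `mean` is close to
# sum(x) = 10.0, while `std` reflects the dropout-induced variance.
```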