Keras - Proper way to extract weights from a nested model

I have a nested model with an input layer and a few final dense layers before the output. Here is the code:

image_input = Input(shape, name='image_input')
x = DenseNet121(input_shape=shape, include_top=False, weights=None,
                backend=keras.backend,
                layers=keras.layers,
                models=keras.models,
                utils=keras.utils)(image_input)
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dense(1024, activation='relu', name='dense_layer1_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)        
x = Dense(512, activation='relu', name='dense_layer2_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
output = Dense(num_class, activation='softmax', name='image_output')(x)
classificationModel = Model(inputs=[image_input], outputs=[output])

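Now suppose I want to extract the DenseNet weights from this model and use them for transfer learning in another, larger model. That model nests the same DenseNet but has some additional layers after it, for example: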

image_input = Input(shape, name='image_input')
x = DenseNet121(input_shape=shape, include_top=False, weights=None,
                backend=keras.backend,
                layers=keras.layers,
                models=keras.models,
                utils=keras.utils)(image_input)
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dense(1024, activation='relu', name='dense_layer1_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)        
x = Dense(512, activation='relu', name='dense_layer2_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Dense(256, activation='relu', name='dense_layer3_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
output = Dense(num_class, activation='sigmoid', name='image_output')(x)
classificationModel = Model(inputs=[image_input], outputs=[output])

Do I need to call modelB.load_weights(<weights.hdf5>, by_name=True)? Should I also name the inner DenseNet, and if so, how?

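Perhaps the simplest approach is to reuse the model you have already trained rather than trying to load its weights from a file. Suppose you have trained the initial model (copied from the question's code, with minimal edits to variable names):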

image_input = Input(shape, name='image_input')
# ... intermediate layers elided
x = BatchNormalization()(x)
output = Dropout(0.5)(x)
model_output = Dense(num_class, activation='softmax', name='image_output')(output)
smaller_model = Model(inputs=[image_input], outputs=[model_output])

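To use this model's trained weights in the larger model, we can simply declare another model built from the same trained layers, and then use that newly defined model as a component of the larger model. Because it reuses the same layer objects, it carries the trained weights with it.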

new_model = Model(image_input, output) # Model that uses trained weights

main_input = Input(shape, name='main_input')
x = new_model(main_input)
x = Dense(256, activation='relu', name='dense_layer3_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
output = Dense(num_class, activation='sigmoid', name='image_output')(x)
final_model = Model(inputs=[main_input], outputs=[output])
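
As a self-contained illustration of the same idea, here is a minimal sketch. It assumes tensorflow.keras is available, and it swaps DenseNet121 for a single small Conv2D layer so it runs quickly. The layer sizes, the input shape, and the name "trained_trunk" are hypothetical choices made for this example, not part of the original code.

```python
from tensorflow import keras
from tensorflow.keras import layers

shape = (8, 8, 3)   # hypothetical small input shape
num_class = 4       # hypothetical number of classes

# Smaller model; a single Conv2D stands in for the DenseNet121 backbone.
image_input = keras.Input(shape, name="image_input")
x = layers.Conv2D(4, 3, name="stand_in_backbone")(image_input)
x = layers.GlobalAveragePooling2D(name="avg_pool")(x)
features = layers.Dense(16, activation="relu", name="dense_layer2_image")(x)
model_output = layers.Dense(num_class, activation="softmax",
                            name="image_output")(features)
smaller_model = keras.Model(image_input, model_output)
# ... smaller_model would be compiled and trained here ...

# Model that ends at the last layer we want to keep; it reuses the trained layers.
new_model = keras.Model(image_input, features, name="trained_trunk")

# Larger model that uses the trunk as a single component.
main_input = keras.Input(shape, name="main_input")
x = new_model(main_input)
x = layers.Dense(8, activation="relu", name="dense_layer3_image")(x)
output = layers.Dense(num_class, activation="sigmoid", name="image_output")(x)
final_model = keras.Model(main_input, output)

# The Dense layer inside the trunk is the very same object as in smaller_model.
trunk = final_model.get_layer("trained_trunk")
same_layer = (trunk.get_layer("dense_layer2_image")
              is smaller_model.get_layer("dense_layer2_image"))
```

Since the layers are shared rather than copied, no weight file or by_name matching is involved in this variant.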

If anything is unclear, I'm happy to elaborate.

You can put the nested model in a variable before using it. That makes everything much easier:

densenet = DenseNet121(input_shape=shape, include_top=False, 
                       weights=None,backend=keras.backend,
                       layers=keras.layers,
                       models=keras.models,
                       utils=keras.utils)

image_input = Input(shape, name='image_input')
x = densenet(image_input)
x = GlobalAveragePooling2D(name='avg_pool')(x)
......

Now it's super simple:

weights = densenet.get_weights()
another_densenet.set_weights(weights)
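
A runnable sketch of that copy, assuming tensorflow.keras and using a tiny hypothetical build_backbone helper in place of DenseNet121 (the helper, its layer sizes, and the model names are made up for illustration). The one requirement is that both backbones have the same architecture, so that the weight lists line up one to one.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_backbone(name):
    """Hypothetical stand-in for DenseNet121(include_top=False, weights=None)."""
    inp = keras.Input((8, 8, 3))
    x = layers.Conv2D(4, 3, activation="relu")(inp)
    x = layers.BatchNormalization()(x)
    return keras.Model(inp, x, name=name)

densenet = build_backbone("backbone_a")          # imagine this one was trained
another_densenet = build_backbone("backbone_b")  # freshly initialised copy

# Copy every weight array (kernels, biases, batch-norm statistics) in order.
weights = densenet.get_weights()
another_densenet.set_weights(weights)

# Check that each array now matches its counterpart.
copied = all(np.array_equal(a, b)
             for a, b in zip(weights, another_densenet.get_weights()))
```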

From a loaded file

You can also print the model.summary() of your loaded model. The DenseNet will be the first or second layer (you must check this).

You can then retrieve it with densenet = loaded_model.layers[i].

You can then transfer these weights to the new DenseNet, either with the method from the previous answer or with new_model.layers[i].set_weights(densenet.get_weights()).
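
The lookup by index can be sketched as follows, again assuming tensorflow.keras. The helpers build_backbone, build_classifier, and find_nested_model are hypothetical names written for this example. The "loaded" model is built in memory here and only stands in for whatever keras.models.load_model would return. Instead of reading the index off model.summary(), the sketch searches for the first layer that is itself a Model.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

SHAPE = (8, 8, 3)  # hypothetical small input shape

def build_backbone(name):
    """Hypothetical stand-in for the nested DenseNet121."""
    inp = keras.Input(SHAPE)
    x = layers.Conv2D(4, 3, activation="relu")(inp)
    return keras.Model(inp, x, name=name)

def build_classifier(backbone, extra_units):
    """Wrap a backbone with pooling and a few Dense layers."""
    image_input = keras.Input(SHAPE, name="image_input")
    x = backbone(image_input)
    x = layers.GlobalAveragePooling2D()(x)
    for units in extra_units:
        x = layers.Dense(units, activation="relu")(x)
    out = layers.Dense(3, activation="softmax")(x)
    return keras.Model(image_input, out)

def find_nested_model(model):
    """Return (index, layer) of the first layer that is itself a Model."""
    for i, layer in enumerate(model.layers):
        if isinstance(layer, keras.Model):
            return i, layer
    raise ValueError("no nested model found")

# Stand-in for a model restored from disk, plus the new, larger model.
loaded_model = build_classifier(build_backbone("old_backbone"), [16])
new_model = build_classifier(build_backbone("new_backbone"), [16, 8])

i_old, densenet = find_nested_model(loaded_model)
i_new, _ = find_nested_model(new_model)
new_model.layers[i_new].set_weights(densenet.get_weights())

transferred = all(
    np.array_equal(a, b)
    for a, b in zip(densenet.get_weights(),
                    new_model.layers[i_new].get_weights()))
```

In this sketch the nested model sits at index 1 in both models, right after the input layer, but the index is worth checking for your own model.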