Is it important to recompile a deep learning model after unfreezing layers for fine-tuning?

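I am classifying a medical image dataset as normal vs. abnormal, applying transfer learning with ResNet50V2. I changed the final layer and then unfroze the layers for fine-tuning. I searched for similar questions but could not find any. I am using Keras with TensorFlow. My questions are: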

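  1. Is it important to recompile the model after unfreezing layers?
  2. If I save checkpoints and load the best checkpoint, and later save the model with model.save(), does the same approach apply to any future training?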

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K  # provides K.sum, K.round, K.clip, K.epsilon

def f1_m(y_true, y_pred):
    precision = precision_m(y_true, y_pred)
    recall = recall_m(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))


def recall_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

dependencies = {
    'f1_m': f1_m,
    'recall_m': recall_m,
    'precision_m': precision_m,
}

# Assumption: the post never shows how `metric` is defined; this list is a guess
# based on the custom metrics above.
metric = ['accuracy', f1_m, precision_m, recall_m]

# I am using multiple GPUs and custom metrics, so loading needs these custom objects.
strategy = tf.distribute.MirroredStrategy()
print("Number of devices: {}".format(strategy.num_replicas_in_sync))
# Open a strategy scope.
with strategy.scope():
    adam = tf.keras.optimizers.Adam(learning_rate=0.00005, amsgrad=True, name='Adam')
    model = keras.models.load_model('/home/classifier/model/resnet50_-36.hdf5', custom_objects=dependencies)
    model.summary()

    # Unfreeze every layer, then compile inside the scope so the optimizer and
    # metrics are created under the same strategy as the model.
    model.trainable = True
    model.compile(loss='binary_crossentropy', optimizer=adam, metrics=metric)

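I compile the model at this point, and this is where I am confused: is it enough to compile after unfreezing, or do I have to save a checkpoint, load the model, and only then unfreeze and start training?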

Saving the model:

model.save("path")
# Load the saved model as before, then unfreeze and recompile
model.trainable = True
model.compile(loss='binary_crossentropy',optimizer=adam,metrics=metric)

After unfreezing, you need to compile the model before you start training. You do not need to save it to disk first.
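To make this concrete, here is a minimal sketch of the freeze, train, unfreeze, recompile, train sequence. It uses a tiny toy network and random data as a stand-in for ResNet50V2 and the medical images, so the names `base`, `x`, and `y` and both learning rates are illustrative assumptions, not values from the post.

```python
import numpy as np
from tensorflow import keras

# Toy stand-in for a pretrained backbone (hypothetical, not the ResNet50V2 from the post).
base = keras.Sequential(
    [keras.Input(shape=(8,)), keras.layers.Dense(4, activation="tanh")], name="base"
)
base.trainable = False  # phase 1: freeze the feature extractor

inputs = keras.Input(shape=(8,))
features = base(inputs, training=False)
outputs = keras.layers.Dense(1, activation="sigmoid")(features)
model = keras.Model(inputs, outputs)

# Random data, only to make the sketch runnable.
x = np.random.rand(32, 8).astype("float32")
y = np.random.randint(0, 2, size=(32, 1)).astype("float32")

# Phase 1: only the new head is trained.
model.compile(optimizer=keras.optimizers.Adam(1e-3), loss="binary_crossentropy")
frozen_before = base.get_weights()[0].copy()
model.fit(x, y, epochs=1, verbose=0)
frozen_after = base.get_weights()[0].copy()
n_trainable_frozen = len(model.trainable_weights)

# Phase 2: unfreeze, then recompile so the new trainable state is picked up.
base.trainable = True
model.compile(optimizer=keras.optimizers.Adam(1e-4), loss="binary_crossentropy")
model.fit(x, y, epochs=1, verbose=0)
n_trainable_unfrozen = len(model.trainable_weights)
tuned = base.get_weights()[0]

print("trainable weights while frozen:", n_trainable_frozen)
print("trainable weights after unfreezing:", n_trainable_unfrozen)
```

The backbone weights stay untouched during the first fit and only start moving after the unfreeze plus recompile. Nothing is written to disk between the two phases.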

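Thanks @Noltibus for the answer, but at the time I was looking for more technical detail. I have explored this since then, and I am sharing my experience here.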

First, there are two phases of training when we use pre-trained models that were trained on the ImageNet dataset.

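> Classifier training:

Here we replace the final layer and train this initial model while the feature-extraction layers stay frozen and unchanged. We then select the best model from this run and fine-tune it by unfreezing the convolutional layers.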

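> Fine-tuning by unfreezing the convolutional layers:

The best model is selected, the required layers are unfrozen, and then the model is compiled again and fitted.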
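The two phases can be connected through a checkpoint, which also covers the second question. The sketch below is one possible way to do it, assuming a toy model, random data, and temporary file paths (`best_classifier.h5` and `finetuned.h5` are made-up names). It keeps the best checkpoint from classifier training, reloads it with `compile=False` because it is recompiled anyway, unfreezes, recompiles, and saves the result for later training.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras

# Random data, only to make the sketch runnable.
x = np.random.rand(64, 8).astype("float32")
y = np.random.randint(0, 2, size=(64, 1)).astype("float32")

# Toy model: a frozen "features" layer plus a trainable head (hypothetical names).
inputs = keras.Input(shape=(8,))
hidden = keras.layers.Dense(4, activation="tanh", name="features", trainable=False)(inputs)
outputs = keras.layers.Dense(1, activation="sigmoid", name="head")(hidden)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Phase 1: keep only the checkpoint with the lowest validation loss.
workdir = tempfile.mkdtemp()
ckpt_path = os.path.join(workdir, "best_classifier.h5")
checkpoint = keras.callbacks.ModelCheckpoint(ckpt_path, monitor="val_loss", save_best_only=True)
model.fit(x[:48], y[:48], validation_data=(x[48:], y[48:]),
          epochs=3, verbose=0, callbacks=[checkpoint])

# Phase 2: reload the best checkpoint, unfreeze, recompile, and continue training.
restored = keras.models.load_model(ckpt_path, compile=False)
restored.trainable = True
restored.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
                 loss="binary_crossentropy")
restored.fit(x[:48], y[:48], validation_data=(x[48:], y[48:]), epochs=1, verbose=0)

# Save the fine-tuned model; it can be loaded the same way for any later training.
final_path = os.path.join(workdir, "finetuned.h5")
restored.save(final_path)
```

If the saved model was compiled with custom metrics and you load it without `compile=False`, you would pass them through `custom_objects`, as the code in the question does.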

Code for training the classifier:

# `adam`, `metric`, `train_gen`, `val_gen`, `class_weights`, `train_steps`,
# `val_steps`, and `call_backs` are defined elsewhere in the training script.
image_size = 512
input_shape = (image_size, image_size, 3)
pre_trained_model = tf.keras.applications.ResNet50V2(input_shape=input_shape, include_top=False, weights="imagenet")

# Freeze the feature-extraction layers so only the new head is trained.
for layer in pre_trained_model.layers:
    layer.trainable = False

# New classification head: global average pooling followed by a single sigmoid unit.
gap = keras.layers.GlobalAveragePooling2D()(pre_trained_model.output)
output = keras.layers.Dense(1, activation='sigmoid')(gap)
model = keras.Model(inputs=pre_trained_model.input, outputs=output)
model.compile(loss='binary_crossentropy',
              optimizer=adam,
              metrics=metric)

history = model.fit(train_gen,
                    use_multiprocessing=True,
                    workers=16,
                    epochs=50,
                    class_weight=class_weights,
                    steps_per_epoch=train_steps,
                    validation_data=val_gen,
                    validation_steps=val_steps,
                    shuffle=True,
                    callbacks=call_backs)

The code above simply loads a pre-trained model, replaces its final layer, and then compiles and fits the model.

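Next we fine-tune on the same data so the model can learn more features. Once you unfreeze layers, you must compile the model again and then retrain. Keep in mind that which layers to train depends on your problem, so verify which layers are actually trainable.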
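As an illustration of choosing which layers to train, the sketch below unfreezes only the last few layers of the backbone and then lists what is trainable. The cut-off `N_UNFREEZE = 20` is an arbitrary assumption, `weights=None` and the small input size are used only to avoid downloading ImageNet weights, and keeping BatchNormalization layers frozen is a common recommendation rather than something the post does.

```python
from tensorflow import keras

# Randomly initialised backbone (assumption: weights=None and 64x64 inputs keep the sketch light).
base = keras.applications.ResNet50V2(include_top=False, weights=None, input_shape=(64, 64, 3))

N_UNFREEZE = 20  # hypothetical cut-off; tune it for your own problem

base.trainable = True
# Everything before the cut-off stays frozen.
for layer in base.layers[:-N_UNFREEZE]:
    layer.trainable = False
# Inside the unfrozen tail, keep BatchNormalization layers frozen.
for layer in base.layers[-N_UNFREEZE:]:
    if isinstance(layer, keras.layers.BatchNormalization):
        layer.trainable = False

# Verify which layers will actually be updated before compiling and fitting.
trainable_names = [layer.name for layer in base.layers if layer.trainable and layer.weights]
print("Layers in backbone:", len(base.layers))
print("Trainable layers with weights:", trainable_names)
print("Trainable variables:", len(base.trainable_weights), "of", len(base.weights))
```

After changing `trainable` like this, the model still has to be compiled again before calling `fit`.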

Fine-tuning code:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K  # provides K.sum, K.round, K.clip, K.epsilon

def f1_m(y_true, y_pred):
    precision = precision_m(y_true, y_pred)
    recall = recall_m(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))


def recall_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

dependencies = {
    'f1_m': f1_m,
    'recall_m': recall_m,
    'precision_m': precision_m,
}

strategy = tf.distribute.MirroredStrategy()
print("Number of devices: {}".format(strategy.num_replicas_in_sync))
# Open a strategy scope.
with strategy.scope():
    adam = tf.keras.optimizers.Adam(learning_rate=0.00005, amsgrad=True, name='Adam')
    model = keras.models.load_model('/home/xylexa/Desktop/normal_abnormal/final experiments/10000_sample/finetune/best_resnet50_finetune_10000_sample.h5', custom_objects=dependencies)
    model.summary()

    # Unfreeze the layers and compile again, as described above.
    # `metric` is the same metric list used when training the classifier.
    model.trainable = True
    model.compile(loss='binary_crossentropy', optimizer=adam, metrics=metric)

# Check how many layers the model has and how many variables will be trained.
print("Total number of layers in the model: ", len(model.layers))
print("Total number of trainable variables in the model: ", len(model.trainable_variables))

Any updates and suggestions are welcome. :)