Different results across multiple evaluations on the GPU and after model loading

Question

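I have just used the ModelCheckpoint feature for the first time to save the best model (best_model = True) and wanted to test its performance. When the model was saved, it reported a val_acc of 83.3% just before saving. I loaded the model and ran evaluate_generator on validation_generator, but the resulting val_acc was 0.639. Confused, I ran it again and got 0.654, then 0.647, 0.744, and so on. I tested the same configuration on my PC (without a GPU), and it consistently shows the same result (apart from occasional small rounding errors).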

Why do the results differ between evaluate_generator runs only on the GPU?
Why does the model's val_acc differ from the reported one?

I am using TensorFlow's implementation of Keras.

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
checkpointer = ModelCheckpoint(filepath='/tmp/weights.hdf5', monitor = "val_acc", verbose=1, save_best_only=True)
# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale = 1./ 255,
    shear_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size = (img_height, img_width),
    batch_size = batch_size)
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size = (img_height, img_width),
    batch_size = batch_size)
# fine-tune the model
model.fit_generator(
    train_generator,
    steps_per_epoch = math.ceil(train_samples/batch_size),
    epochs=100,
    workers = 120,
    validation_data=validation_generator,
    validation_steps=math.ceil(val_samples/batch_size),
    callbacks=[checkpointer])
model.load_weights(filepath='/tmp/weights.hdf5')
model.predict_generator(validation_generator, steps = math.ceil(val_samples/batch_size) )
temp_model = load_model('/tmp/weights.hdf5')
temp_model.evaluate_generator(validation_generator, steps = math.ceil(val_samples/batch_size), workers = 120)
>>> [2.1996076788221086, 0.17857142857142858]
temp_model.evaluate_generator(validation_generator, steps = math.ceil(val_samples/batch_size), workers = 120)
>>> [2.2661823204585483, 0.25]

Answer 1

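Because you are only saving the model weights. This means you are not saving the optimizer state, which explains the difference in accuracy when you reload the model. Adding save_weights_only=False when you create the ModelCheckpoint will solve the problem:

When you reload the model, use Keras's load_model function. Otherwise you will still only load the weights.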

checkpointer = ModelCheckpoint(filepath='/tmp/full_model.hdf5', monitor = "val_acc", verbose=1, save_best_only=True, save_weights_only=False)

#reload model
from keras.models import load_model
model = load_model('/tmp/full_model.hdf5')

Answer 2

OK, the problem was batch_size! It took me ages to figure this out:

steps = math.ceil(val_samples/batch_size)

Because batch_size was not a divisor of number_of_samples, it caused problems. Setting the workers variable also caused some minor errors; when using a GPU, it makes no sense to set it.
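
The arithmetic behind this can be sketched in plain Python. This is only an illustration: the sample count and batch size are hypothetical (the question does not state them), and the helper names batch_sizes and even_batch_sizes are made up for this example. It shows that with math.ceil the last batch is smaller than the others whenever batch_size does not divide the number of samples, and it lists batch sizes that would divide the samples evenly.

```python
import math


def batch_sizes(n_samples, batch_size):
    """Return the size of each batch when n_samples are split into batches."""
    steps = math.ceil(n_samples / batch_size)
    sizes = []
    for step in range(steps):
        start = step * batch_size
        end = min(start + batch_size, n_samples)
        sizes.append(end - start)
    return sizes


def even_batch_sizes(n_samples, max_batch_size):
    """List batch sizes up to max_batch_size that divide n_samples exactly."""
    return [b for b in range(1, max_batch_size + 1) if n_samples % b == 0]


# Hypothetical numbers, not taken from the question.
val_samples = 100
batch_size = 32

print(batch_sizes(val_samples, batch_size))  # [32, 32, 32, 4]
print(even_batch_sizes(val_samples, 32))     # [1, 2, 4, 5, 10, 20, 25]
```

Choosing one of the evenly dividing batch sizes means every step sees a full batch, which is the situation the answer describes as working.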


Tags: python, neural-network, keras, tensorflow, tensorflow-gpu