
Keras inference loss and training (forward-pass) loss don't match


I am training the Keras pretrained ResNet50 model on my own dataset, which contains only a single image for testing purposes. First I evaluate the model on that image: the loss is 0.5 and the accuracy is 1. Then I fit the model on the same image: the loss is 6 and the accuracy is 0. I don't understand why the inference loss and the training loss don't match; it looks like inference and the training forward pass behave differently in Keras. My code snippet and a screenshot of its output are attached below.

from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.preprocessing import image
import numpy as np

model = ResNet50(weights='imagenet')

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

y = np.zeros((1, 1000))
y[0, 386] = 1

model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['categorical_accuracy'])

model.evaluate(x, y)

1/1 [==============================] - 1s 547ms/step [0.5232877135276794, 1.0]

model.fit(x, y, validation_data=(x, y))

Train on 1 samples, validate on 1 samples Epoch 1/1 1/1 [==============================] - 3s 3s/step - loss: 6.1883 - categorical_accuracy: 0.0000e+00 - val_loss: 9.8371e-04 - val_categorical_accuracy: 1.0000

model.evaluate(x, y)

1/1 [==============================] - 0s 74ms/step [0.0009837078396230936, 1.0]

Sorry for misunderstanding the question at first; it is a tricky one. The mismatch is most likely caused by the BatchNorm layers, as @Natthaphon pointed out in the comments, because I tried the same experiment on VGG16 and there the losses do match.

I then tested on ResNet50, and the eval loss and fit loss still don't match even after I "freeze" all the layers. In fact, I checked the BN weights manually, and they really do not change.
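The "weights did not change" check amounts to snapshotting `layer.get_weights()` before and after `fit` and comparing the arrays element-wise. A minimal sketch of that comparison (the before/after lists here are stand-in arrays, not real BN weights):

```python
import numpy as np

def weights_unchanged(before, after, tol=1e-7):
    """Return True if every weight array is numerically identical."""
    return all(np.allclose(b, a, atol=tol) for b, a in zip(before, after))

# Stand-ins for layer.get_weights() snapshots taken before/after model.fit:
before = [np.ones(4), np.zeros(4)]           # e.g. BN gamma, beta
after_frozen = [np.ones(4), np.zeros(4)]     # unchanged when frozen
after_trained = [np.ones(4) * 1.01, np.zeros(4)]

print(weights_unchanged(before, after_frozen))   # True
print(weights_unchanged(before, after_trained))  # False
```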

from keras.applications import ResNet50, VGG16
from keras.applications.resnet50 import preprocess_input
from keras_preprocessing import image
import keras
from keras import backend as K
import numpy as np

img_path = '/home/zhihao/Downloads/elephant.jpeg'
img = image.load_img(img_path, target_size=(224, 224))

model = ResNet50(weights='imagenet')

for layer in model.layers:
    layer.trainable = False

x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

y = np.zeros((1, 1000))
y[0, 386] = 1

model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['categorical_accuracy'])

model.evaluate(x, y)
# 1/1 [==============================] - 2s 2s/step
# [0.2981376349925995, 1.0]

model.fit(x, y, validation_data=(x, y))
# Train on 1 samples, validate on 1 samples
# Epoch 1/1
# 1/1 [==============================] - 1s 549ms/step - loss: 5.3056 - categorical_accuracy: 0.0000e+00 - val_loss: 0.2981 - val_categorical_accuracy: 1.0000

Note that the eval loss is 0.2981 while the fit loss is 5.3056. My guess is that the BatchNorm layers behave differently between eval mode and train mode. Correct me if I'm wrong.
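That guess can be illustrated numerically: in train mode a BatchNorm layer normalizes with the statistics of the current batch, while in eval mode it uses the stored moving statistics. A toy NumPy sketch (gamma, beta and the moving statistics are made-up values, not real ResNet50 weights):

```python
import numpy as np

# Hypothetical BN parameters:
gamma, beta = 1.0, 0.0
moving_mean, moving_var = 0.0, 1.0   # stored running statistics
eps = 1e-3

x = np.array([5.0, 6.0, 7.0])        # activations for one tiny batch

# Train mode: normalize with the *batch* statistics.
train_out = gamma * (x - x.mean()) / np.sqrt(x.var() + eps) + beta

# Eval mode: normalize with the *moving* statistics.
eval_out = gamma * (x - moving_mean) / np.sqrt(moving_var + eps) + beta

print(train_out)  # roughly [-1.22, 0.0, 1.22]
print(eval_out)   # roughly [5.0, 6.0, 7.0]
```

Because the moving statistics of a pretrained network reflect ImageNet-scale batches rather than a single image, the two modes produce very different activations, and hence very different losses.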

One way I found to truly freeze the model is to use K.set_learning_phase(0), as shown below,

model = ResNet50(weights='imagenet')

K.set_learning_phase(0)  # all new operations will be in test mode from now on

model.fit(x, y, validation_data=(x, y))

# Train on 1 samples, validate on 1 samples
# Epoch 1/1
# 1/1 [==============================] - 4s 4s/step - loss: 0.2981 - categorical_accuracy: 1.0000 - val_loss: 16.1181 - val_categorical_accuracy: 0.0000e+00

Now both sides lose in turn: the training loss finally matches the eval loss, but the validation loss blows up instead.
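As a closing note on why eval behaviour shifted after fitting earlier: during `fit`, Keras's BatchNormalization also updates its moving statistics with an exponential moving average (the `momentum` argument defaults to 0.99), so each training step nudges them toward the current batch's statistics, on top of the SGD weight updates. A one-line sketch of that update rule (the statistic values are hypothetical):

```python
momentum = 0.99          # Keras BatchNormalization default
moving_mean = 0.0        # hypothetical stored statistic
batch_mean = 6.0         # hypothetical statistic of the current batch

# One training step moves the stored statistic slightly toward the batch:
moving_mean = momentum * moving_mean + (1 - momentum) * batch_mean
print(moving_mean)  # roughly 0.06
```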