Validation accuracy is always greater than training accuracy in Keras
I am trying to train a simple neural network on the MNIST dataset. For some reason, in the history returned by model.fit the validation accuracy is higher than the training accuracy, which is really strange; yet when I evaluate the model afterwards, the training accuracy comes out higher than the test accuracy.
This happens every time, regardless of the model's parameters. Also, if I use a custom callback and read the 'acc' and 'val_acc' metrics, I find the same problem (the numbers match the ones returned in the history).
Please help! What am I doing wrong? Why is the validation accuracy higher than the training accuracy? (Note that I see the same problem when looking at the loss.)
Here is my code:
#!/usr/bin/env python3.5
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
import numpy as np
from keras import backend
from keras.utils import np_utils
from keras import losses
from keras import optimizers
from keras.datasets import mnist
from keras.models import Sequential
from matplotlib import pyplot as plt
# get train and test data (mnist) and reduce volume to speed up (for testing)
(x_train, y_train), (x_test, y_test) = mnist.load_data()
data_reduction = 20
x_train = x_train[:x_train.shape[0] // data_reduction]
y_train = y_train[:y_train.shape[0] // data_reduction]
x_test = x_test[:x_test.shape[0] // data_reduction]
y_test = y_test[:y_test.shape[0] // data_reduction]
try:
    IMG_DEPTH = x_train.shape[3]
except IndexError:
    IMG_DEPTH = 1  # B/W
labels = np.unique(y_train)
N_LABELS = len(labels)
# reshape input data
if backend.image_data_format() == 'channels_first':
    X_train = x_train.reshape(x_train.shape[0], IMG_DEPTH, x_train.shape[1], x_train.shape[2])
    X_test = x_test.reshape(x_test.shape[0], IMG_DEPTH, x_test.shape[1], x_test.shape[2])
    input_shape = (IMG_DEPTH, x_train.shape[1], x_train.shape[2])
else:
    X_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], IMG_DEPTH)
    X_test = x_test.reshape(x_test.shape[0], x_test.shape[1], x_test.shape[2], IMG_DEPTH)
    input_shape = (x_train.shape[1], x_train.shape[2], IMG_DEPTH)
# convert data type to float32 and normalize data values to range [0, 1]
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
# reshape input labels
Y_train = np_utils.to_categorical(y_train, N_LABELS)
Y_test = np_utils.to_categorical(y_test, N_LABELS)
# create model
opt = optimizers.Adam()
loss = losses.categorical_crossentropy
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(labels), activation='softmax'))
model.compile(optimizer=opt, loss=loss, metrics=['accuracy'])
# fit model
history = model.fit(X_train, Y_train, batch_size=64, epochs=50, verbose=True,
                    validation_data=(X_test, Y_test))
# evaluate model
train_score = model.evaluate(X_train, Y_train, verbose=True)
test_score = model.evaluate(X_test, Y_test, verbose=True)
print("Validation:", test_score[1])
print("Training: ", train_score[1])
print("--------------------")
print("First 5 samples validation:", history.history["val_acc"][0:5])
print("First 5 samples training:", history.history["acc"][0:5])
print("--------------------")
print("Last 5 samples validation:", history.history["val_acc"][-5:])
print("Last 5 samples training:", history.history["acc"][-5:])
# plot history
plt.ion()
fig = plt.figure()
subfig = fig.add_subplot(122)
subfig.plot(history.history['acc'], label="training")
if 'val_acc' in history.history:
    subfig.plot(history.history['val_acc'], label="validation")
subfig.set_title('Model Accuracy')
subfig.set_xlabel('Epoch')
subfig.legend(loc='upper left')
subfig = fig.add_subplot(121)
subfig.plot(history.history['loss'], label="training")
if 'val_loss' in history.history:
    subfig.plot(history.history['val_loss'], label="validation")
subfig.set_title('Model Loss')
subfig.set_xlabel('Epoch')
subfig.legend(loc='upper left')
plt.ioff()
input("Press ENTER to close the plots...")
The output I get is the following:
Validation accuracy: 0.97599999999999998
Training accuracy: 1.0
--------------------
First 5 samples validation: [0.83400000286102294, 0.89200000095367427, 0.91599999904632567, 0.9279999976158142, 0.9399999990463257]
First 5 samples training: [0.47133333333333333, 0.70566666682561241, 0.76933333285649619, 0.81133333333333335, 0.82366666714350378]
--------------------
Last 5 samples validation: [0.9820000019073486, 0.9860000019073486, 0.97800000190734859, 0.98399999713897701, 0.975999997138977]
Last 5 samples training: [0.9540000001589457, 0.95766666698455816, 0.95600000031789145, 0.95100000031789145, 0.95033333381017049]
Here you can see the plots I get:
[Plot: training and validation accuracy and loss over epochs]
I'm not sure whether this is relevant, but I am using Python 3.5 and Keras 2.0.4.
From the Keras FAQ:
Why is the training loss much higher than the testing loss?
A Keras model has two modes: training and testing. Regularization mechanisms, such as Dropout and L1/L2 weight regularization, are turned off at testing time.
Besides, the training loss is the average of the losses over each batch of training data. Because your model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. On the other hand, the testing loss for an epoch is computed using the model as it is at the end of the epoch, resulting in a lower loss.
So the behavior you are seeing is not as unusual as it might seem after reading ML theory. This also explains why, when you evaluate both sets on the same model afterwards, you suddenly do get the expected behavior (train acc > val acc). My guess is that in your case it is specifically the dropout that prevents the accuracy from reaching 1.0 during training, while it does reach it during evaluation (testing).
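You can observe the two modes directly by running the same inputs through the model with the learning phase set explicitly. This is a minimal sketch using the Keras backend function API (it assumes a TensorFlow backend and reuses the model and X_test variables from your script):

from keras import backend as K

# Build a function that runs the model with an explicit learning phase:
# 1 -> training mode (dropout active), 0 -> test mode (dropout disabled).
run_model = K.function([model.input, K.learning_phase()], [model.output])

preds_train_mode = run_model([X_test[:100], 1])[0]  # stochastic: dropout is on
preds_test_mode = run_model([X_test[:100], 0])[0]   # deterministic: dropout is off

Running the training-mode version several times gives different predictions for the same inputs, which is exactly the noise that depresses the 'acc' metric during fitting.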
You can investigate further by adding a callback that saves your model at every epoch. Then you can evaluate each of the saved models on both sets to recreate your plots.
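A sketch of what that could look like, using ModelCheckpoint (the file name pattern here is arbitrary, and whether {epoch} starts at 0 or 1 depends on the Keras version):

from keras.callbacks import ModelCheckpoint
from keras.models import load_model

# Save a snapshot of the model after every epoch.
checkpoint = ModelCheckpoint('model_epoch_{epoch:02d}.h5')
history = model.fit(X_train, Y_train, batch_size=64, epochs=50,
                    validation_data=(X_test, Y_test),
                    callbacks=[checkpoint])

# Re-evaluate every snapshot on both sets; model.evaluate always runs in
# test mode (dropout disabled), so both curves are now computed the same way.
train_acc, val_acc = [], []
for epoch in range(50):
    m = load_model('model_epoch_{:02d}.h5'.format(epoch))
    train_acc.append(m.evaluate(X_train, Y_train, verbose=0)[1])
    val_acc.append(m.evaluate(X_test, Y_test, verbose=0)[1])

Plotting train_acc against val_acc should then show the usual train > val ordering, since the batch-averaging and dropout effects described in the FAQ are removed.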
Validation accuracy > training accuracy usually appears when your model is underfitting the training data. Even when underfitting, the validation data can still score well if it happens to suit the model better than the training data does.
Try adding more convolutional layers and relaxing the dropout: remove some Dropout layers or lower the fraction of units they drop, as in the sketch below.
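For example, a variant of the original model along those lines (the extra conv block and the lower dropout rates are illustrative, not tuned):

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))  # extra conv layer
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))   # was 0.25
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.25))  # was 0.5
model.add(Dense(N_LABELS, activation='softmax'))
model.compile(optimizer=optimizers.Adam(), loss=losses.categorical_crossentropy, metrics=['accuracy'])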