Evaluate model on Testing Set after each epoch of training

I'm training a TensorFlow model on an image dataset for a classification task. We normally pass the training and validation sets to the model.fit method, and afterwards we can plot the model's convergence curves for training and validation. I'd like to do the same for the test set; in other words, I want to get my model's accuracy and loss on the test set after each epoch (not the validation set — I can't substitute the test set for the validation set because I need plots for both).

I managed to do this by saving a checkpoint of my model after each epoch with a callback, then loading each checkpoint back into the model and computing the accuracy and loss (see the sketch below). But I'm wondering whether there is a simpler way to do this, maybe with some other callback or a workaround with the model.fit method.
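For reference, a minimal sketch of that checkpoint-based workaround, assuming the model, train_ds, val_ds, test_ds and epochs from the setup below; the ckpts/ directory and filename pattern are placeholders:

import os
import tensorflow as tf

os.makedirs('ckpts', exist_ok=True)

# Save the weights after every epoch.
ckpt_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath='ckpts/epoch_{epoch:02d}.h5',
    save_weights_only=True)

model.fit(train_ds, validation_data=val_ds, epochs=epochs, callbacks=[ckpt_cb])

# Reload each checkpoint and evaluate it on the test set.
test_history = []
for epoch in range(1, epochs + 1):
    model.load_weights(f'ckpts/epoch_{epoch:02d}.h5')
    test_history.append(model.evaluate(test_ds, verbose=0))  # [loss, accuracy]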

You can use a custom Callback, pass it your test data, and do whatever you like with it:

import tensorflow as tf
import pathlib
import numpy as np

dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)

batch_size = 5

train_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  seed=123,
  image_size=(64, 64),
  batch_size=batch_size)

test_ds = train_ds.take(30)

model = tf.keras.Sequential([
  tf.keras.layers.Rescaling(1./255, input_shape=(64, 64, 3)),
  tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(5)
])

class TestCallback(tf.keras.callbacks.Callback):
    def __init__(self, test_dataset):
        super().__init__()
        self.test_dataset = test_dataset
        self.test_acc_metric = tf.keras.metrics.SparseCategoricalAccuracy()
        self.loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True) 

    def on_epoch_end(self, epoch, logs=None):
        losses = []
        for x_batch_test, y_batch_test in self.test_dataset:
          test_logits = self.model(x_batch_test, training=False)
          losses.append(self.loss_fn(y_batch_test, test_logits))
          self.test_acc_metric.update_state(y_batch_test, test_logits)
        test_acc = self.test_acc_metric.result()
        self.test_acc_metric.reset_states()
        logs['test_loss'] = tf.reduce_mean(tf.stack(losses))  # not sure if the reduction is correct
        logs['test_sparse_categorical_accuracy'] = test_acc

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True) 
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=tf.keras.metrics.SparseCategoricalAccuracy())
epochs = 5
history = model.fit(train_ds, epochs=epochs, callbacks=[TestCallback(test_ds)])
Found 3670 files belonging to 5 classes.
Epoch 1/5
734/734 [==============================] - 14s 17ms/step - loss: 1.2709 - sparse_categorical_accuracy: 0.4591 - test_loss: 1.0020 - test_sparse_categorical_accuracy: 0.5533
Epoch 2/5
734/734 [==============================] - 13s 18ms/step - loss: 0.9574 - sparse_categorical_accuracy: 0.6275 - test_loss: 0.8348 - test_sparse_categorical_accuracy: 0.6467
Epoch 3/5
734/734 [==============================] - 9s 12ms/step - loss: 0.8136 - sparse_categorical_accuracy: 0.6733 - test_loss: 0.8379 - test_sparse_categorical_accuracy: 0.6467
Epoch 4/5
734/734 [==============================] - 8s 11ms/step - loss: 0.6970 - sparse_categorical_accuracy: 0.7357 - test_loss: 0.5713 - test_sparse_categorical_accuracy: 0.7533
Epoch 5/5
734/734 [==============================] - 8s 11ms/step - loss: 0.5793 - sparse_categorical_accuracy: 0.7834 - test_loss: 0.5656 - test_sparse_categorical_accuracy: 0.7733
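Because the callback writes its results into logs, the test metrics also end up in history.history, so you can plot them next to the training curves, e.g. (assuming matplotlib is available):

import matplotlib.pyplot as plt

plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['test_loss'], label='test loss')
plt.xlabel('epoch')
plt.legend()
plt.show()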

You could also just use model.evaluate inside the callback. See also .
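A minimal sketch of that variant, assuming the same test_ds as above (return_dict requires a reasonably recent TF 2.x):

class EvaluateTestSet(tf.keras.callbacks.Callback):
    def __init__(self, test_dataset):
        super().__init__()
        self.test_dataset = test_dataset

    def on_epoch_end(self, epoch, logs=None):
        # Reuses the compiled loss and metrics; return_dict=True yields
        # {'loss': ..., 'sparse_categorical_accuracy': ...}.
        results = self.model.evaluate(self.test_dataset, verbose=0, return_dict=True)
        for name, value in results.items():
            logs['test_' + name] = value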