How can I get reproducible results in keras for a convolutional neural network using data augmentation for image classification?

If I train the same convolutional neural network model architecture twice (on the same data), clearing the session between runs, I get different results.

I set the random seeds and thread configuration as follows:

import numpy as np
from numpy.random import seed
import pandas as pd
import random as rn
import os

seed_num = 1
os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(seed_num)
rn.seed(seed_num)

import tensorflow as tf
from tensorflow.keras.models import load_model
from tensorflow.compat.v1.keras import backend as K

session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
tf.random.set_seed(seed_num)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)

...and I specify a seed when running flow_from_directory:

train_data_gen_aug_rotate = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, 
                                                                 validation_split=0.1,
                                                                 rotation_range=45)

train_img = train_data_gen_aug_rotate.flow_from_directory(data_path, 
                                               subset='training',
                                               color_mode='rgb', 
                                               target_size=target_size,
                                               batch_size=batch_size, 
                                               class_mode='categorical',
                                               seed=seed_num)

Other information that may help answer the question:

Model architecture:

inputs = tf.keras.layers.Input(shape=num_pixels_and_channels)
conv = tf.keras.layers.Conv2D(filters=64, kernel_size=(3,3), padding='SAME', activation='relu')(inputs)
pool = tf.keras.layers.AveragePooling2D(pool_size=(2,2), strides=(2,2), padding='SAME')(conv)
batnorm = tf.keras.layers.BatchNormalization()(pool)
flattened = tf.keras.layers.Flatten()(batnorm)
dense = tf.keras.layers.Dense(257)(flattened)
outputs = tf.keras.layers.Softmax()(dense)

my_model = tf.keras.Model(inputs, outputs)

I compile the model as follows:

my_model.compile(loss='categorical_crossentropy',  
                optimizer=tf.keras.optimizers.Adam(0.001),
                metrics=['accuracy']
               )
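As an aside, with a softmax output layer, categorical_crossentropy reduces to the negative log probability the model assigns to the true class. A minimal pure-Python sketch of that relationship (the helper name is hypothetical, not part of any library):

```python
import math

def categorical_crossentropy(y_true, y_pred):
    # cross entropy between a one-hot label and a probability vector:
    # only the true class's predicted probability contributes
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

# example: the correct class has predicted probability 0.7
loss = categorical_crossentropy([0.0, 1.0, 0.0], [0.2, 0.7, 0.1])
assert abs(loss - (-math.log(0.7))) < 1e-12
```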

I'm training with train_on_batch and test_on_batch:

# get next batch of images & labels
X_imgs, X_labels = next(train_img) 

#train model, get cross entropy & accuracy for batch
train_CE, train_acc = my_model.train_on_batch(X_imgs, X_labels)

# validation images - just predict
X_imgs_val, X_labels_val = next(val_img)
val_CE, val_acc = my_model.test_on_batch(X_imgs_val, X_labels_val)

I'm clearing the session between each model run with tf.keras.backend.clear_session().

I'm using tensorflow version 2.1.0, developing on a Mac with a single CPU, in a Jupyter Notebook.

The only answer provided so far (at the time of writing) is "yes".

What else do I need to do to get identical results for the same model architecture? Is there a random seed associated with ImageDataGenerator, train_on_batch, test_on_batch, or the Adam optimizer that doesn't use the tensorflow seed I set? Or is there another part of the code that needs a seed specified separately?

I now get reproducible results (the same initial random weights in each experiment, ensuring that any difference in results is due to differences between the experiments rather than different initial weights) by:

1) Clearing the session and setting the random seeds and tf session config before each experiment:

tf.keras.backend.clear_session()

seed_num = 1
os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(seed_num)
rn.seed(seed_num)
tf.random.set_seed(seed_num)
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)

2) Running the ImageDataGenerator and flow_from_directory code again before each experiment (to ensure that each model starts training from the beginning of the seeded random number sequence)
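The reason step 2 matters can be shown with Python's own random module (a stand-in for the seeded augmentation pipeline, not Keras itself): drawing from a seeded generator advances its stream, so a second experiment that reuses the same generator objects sees different numbers, while re-seeding and re-creating them restarts the sequence:

```python
import random

# seed once, then run two "experiments" back to back
random.seed(1)
run_a = [random.random() for _ in range(3)]
run_b = [random.random() for _ in range(3)]  # stream has advanced: differs from run_a

# reset the seed before the second "experiment", as in step 1 above
random.seed(1)
run_c = [random.random() for _ in range(3)]

assert run_a != run_b  # reusing the stream -> different draws
assert run_a == run_c  # re-seeding -> identical draws
```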

So my code from the start of the notebook up to the first experiment is:

import numpy as np
from numpy.random import seed
import pandas as pd
import random as rn
import os

seed_num = 1
os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(seed_num)
rn.seed(seed_num)

import tensorflow as tf
from tensorflow.keras.models import load_model
from tensorflow.compat.v1.keras import backend as K

session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
tf.random.set_seed(seed_num)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)

...and then for each experiment, before defining the model architecture and compiling the model, it is:

tf.keras.backend.clear_session()

seed_num = 1
os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(seed_num)
rn.seed(seed_num)
tf.random.set_seed(seed_num)
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)


batch_size = 80  
target_size=(64,64) 
num_pixels_and_channels = (64,64,3) 

train_data_gen_aug_rotate = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, 
                                                                 validation_split=0.1,
                                                                 rotation_range=45)

train_img = train_data_gen_aug_rotate.flow_from_directory(data_path, 
                                               subset='training',
                                               color_mode='rgb', 
                                               target_size=target_size,
                                               batch_size=batch_size, 
                                               class_mode='categorical',
                                               seed=seed_num)

val_data_gen_aug_rotate = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, 
                                                               validation_split=0.1)


val_img = val_data_gen_aug_rotate.flow_from_directory(data_path, 
                                           subset='validation',
                                           color_mode='rgb',
                                           target_size=target_size,
                                           batch_size=batch_size,
                                           class_mode='categorical',
                                           seed=seed_num)

I don't know whether this is overkill and there is a more efficient way, but it works for me on a single CPU on my laptop. (Running on a GPU introduces additional variability.)
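The effect of the whole reset-then-build procedure can be sanity-checked without TensorFlow; a minimal pure-Python sketch, where build_model_weights is a hypothetical stand-in for "re-seed, then define and initialize a model":

```python
import random

def build_model_weights(seed, n=5):
    # stand-in for "define and initialize a model" after resetting seeds:
    # a freshly seeded generator yields the same "initial weights" every time
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# two "experiments" using the same reset procedure get identical initial weights
weights_run1 = build_model_weights(seed=1)
weights_run2 = build_model_weights(seed=1)
assert weights_run1 == weights_run2
```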