How to avoid excessive memory usage while training multiple models in TensorFlow
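I'm currently writing a piece of code that aims to show how applying different dropout rates affects the performance of a generic CNN model across multiple datasets.
I've set it up so that, for each dataset, I train 10 different models (with 10 different dropout rates) a total of 3 times, and record the accuracy for each run and dropout rate. Hopefully this dataframe explains better what I mean:
The code looks something like this: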
for i, dataset in tqdm(enumerate(datasets)):
    dataset_path = pathlib.Path(args.input_folder) / dataset
    ds_train, ds_test, ds_validation = loader.get_image_data_generators(dataset_path, BATCH_SIZE)
    CLASS_NAMES = list(ds_train.class_indices.keys())
    INPUT_SHAPE = ds_train.next()[0].shape[1:]
    OUTPUT_SIZE = ds_train.next()[1].shape[1]
    ds_train_size, ds_test_size, ds_validation_size = loader.get_split_sizes(dataset_path)
    performance = {'dataset': [], 'dropout_rate': []}
    # Pre-fill dictionary with dataset and dropout labels.
    performance['dataset'] = [dataset for i in range(DROPOUT_STEPS)]
    performance['dropout_rate'] = [i/DROPOUT_STEPS for i in range(DROPOUT_STEPS)]
    for run_i in range(RUNS):
        performance[f'run_{run_i}_acc'] = []
        for i in range(DROPOUT_STEPS):
            # Compute dropout rate.
            dropout_rate = i / DROPOUT_STEPS
            # Initialize model
            model = models.Sequential()
            model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=INPUT_SHAPE))
            model.add(layers.MaxPooling2D((2, 2)))
            model.add(layers.Conv2D(64, (3, 3), activation='relu'))
            model.add(layers.MaxPooling2D((2, 2)))
            model.add(layers.Conv2D(64, (3, 3), activation='relu'))
            model.add(layers.Flatten())
            model.add(layers.Dense(64, activation='relu'))
            model.add(layers.Dropout(dropout_rate))
            model.add(layers.Dense(OUTPUT_SIZE, activation='softmax'))
            model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
            model.fit(
                x=ds_train,
                epochs=EPOCHS,
                steps_per_epoch=math.ceil(ds_train_size / BATCH_SIZE),
                validation_data=ds_validation,
                validation_steps=math.ceil(ds_validation_size / BATCH_SIZE)
            )
            test_loss, test_accuracy = model.evaluate(ds_test, steps=math.ceil(ds_test_size / BATCH_SIZE))
            performance[f'run_{run_i}_acc'].append(test_accuracy)
            print(f'✔️ Dropout rate {dropout_rate} resulted on {test_accuracy}')
    df = pd.DataFrame(performance)
    print(df)
    df.to_pickle(f'output/performance/{dataset}-perf.pkl')
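On some of the (smaller) datasets, this runs smoothly. On the larger ones, my computer's memory usage slowly rises, and in some cases the whole process stops running at the second iteration, complaining that there is not enough memory available.
How would I go about optimizing this code to avoid excessive memory usage? Does TensorFlow keep any temporary data around while iterating between runs, or even between dropout steps? If so, how can I reset the memory at each loop cycle?
Thanks for your help.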
Call tf.keras.backend.clear_session() after each model has been trained to release the memory held by Keras' global state. The Keras documentation states the following:
If you are creating many models in a loop, this global state will consume an increasing amount of memory over time, and you may want to clear it. Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.
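As a rough illustration of where the call could go, here is a minimal sketch of the inner dropout loop with clear_session() at the end of each iteration. The helper names build_model and run_dropout_sweep, the in-memory NumPy arrays, and the default arguments are assumptions made to keep the example self-contained; they are not part of the question's code. The del model and gc.collect() lines are optional extras on top of what the documentation describes and may or may not make a measurable difference on a given setup.

```python
import gc

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models


def build_model(input_shape, output_size, dropout_rate):
    """Build the CNN from the question (hypothetical helper name)."""
    return models.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dropout(dropout_rate),
        layers.Dense(output_size, activation='softmax'),
    ])


def run_dropout_sweep(x, y, dropout_steps=3, epochs=1, batch_size=8):
    """Train one model per dropout rate and return the accuracies.

    Hypothetical helper: x and y are in-memory arrays here, whereas the
    question feeds data generators to fit() and evaluate().
    """
    accuracies = []
    for step in range(dropout_steps):
        dropout_rate = step / dropout_steps
        model = build_model(x.shape[1:], y.shape[1], dropout_rate)
        model.compile(optimizer='adam',
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])
        model.fit(x, y, epochs=epochs, batch_size=batch_size, verbose=0)
        _, accuracy = model.evaluate(x, y, batch_size=batch_size, verbose=0)
        accuracies.append(float(accuracy))

        # Drop our reference to the model, then reset Keras' global state
        # so the old model and its layers do not accumulate across iterations.
        del model
        tf.keras.backend.clear_session()
        # Optional (assumption): ask Python to collect unreachable objects now.
        gc.collect()
    return accuracies


if __name__ == '__main__':
    # Tiny random dataset, only to show the call pattern.
    images = np.random.rand(16, 32, 32, 3).astype('float32')
    labels = tf.keras.utils.to_categorical(
        np.random.randint(0, 2, size=16), num_classes=2)
    print(run_dropout_sweep(images, labels, dropout_steps=2))
```

In the question's code, the equivalent change would be to call tf.keras.backend.clear_session() right after the accuracy has been appended to performance, at the end of the innermost loop.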