How to use keras.utils.Sequence data generator with tf.distribute.MirroredStrategy for multi-gpu model training in tensorflow?
I want to train a model on multiple GPUs using tensorflow 2.0. In the tensorflow tutorial on distributed training (https://www.tensorflow.org/guide/distributed_training), a tf.data dataset is converted into a distributed dataset as follows:

dist_dataset = mirrored_strategy.experimental_distribute_dataset(dataset)

However, I would like to use my own custom data generator instead (for example, a keras.utils.Sequence data generator together with keras.utils.data_utils.OrderedEnqueuer for asynchronous batch generation). But the mirrored_strategy.experimental_distribute_dataset method only accepts tf.data datasets. How can I use a keras data generator instead?

Thanks!
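For reference, the kind of keras.utils.Sequence generator the question has in mind might look roughly like the sketch below. The class name, data arguments and batch size are illustrative assumptions, not part of the original question:

import numpy as np
from tensorflow.keras.utils import Sequence

class MySequence(Sequence):  # hypothetical example of a Sequence-based generator
    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        # number of batches per epoch
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, idx):
        # return one full batch of (features, labels)
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]
        return np.array(batch_x), np.array(batch_y)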
In the same situation I used tf.data.Dataset.from_generator together with keras.utils.Sequence, and it solved my problems!
import tensorflow as tf
from tensorflow.keras.utils import OrderedEnqueuer

train_generator = SegmentationMultiGenerator(datasets, folder)  # My keras.utils.Sequence object

def generator():
    # wrap the Sequence in an OrderedEnqueuer for asynchronous batch generation
    multi_enqueuer = OrderedEnqueuer(train_generator, use_multiprocessing=True)
    multi_enqueuer.start(workers=10, max_queue_size=10)
    output_generator = multi_enqueuer.get()  # get the output generator once, outside the loop
    while True:
        batch_xs, batch_ys, dset_index = next(output_generator)  # I have three outputs
        yield batch_xs, batch_ys, dset_index

dataset = tf.data.Dataset.from_generator(generator,
                                         output_types=(tf.float64, tf.float64, tf.int64),
                                         output_shapes=(tf.TensorShape([None, None, None, None]),
                                                        tf.TensorShape([None, None, None, None]),
                                                        tf.TensorShape([None, None])))

strategy = tf.distribute.MirroredStrategy()

train_dist_dataset = strategy.experimental_distribute_dataset(dataset)
Note that this is my first working solution - for now I find it most convenient to put 'None' in place of the real output shapes, which I have found to work.
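Once you have train_dist_dataset, it is typically consumed in a custom training loop with strategy.run (called strategy.experimental_run_v2 on the earliest TF 2.x releases). A minimal sketch under the assumptions of the answer above; build_model and the global batch size are hypothetical placeholders, and dset_index is not used here:

GLOBAL_BATCH_SIZE = 16  # assumed: the batch size produced by the Sequence

with strategy.scope():
    model = build_model()  # hypothetical function that builds your Keras model
    optimizer = tf.keras.optimizers.Adam()
    loss_fn = tf.keras.losses.MeanSquaredError(
        reduction=tf.keras.losses.Reduction.NONE)  # reduce manually across replicas

@tf.function
def train_step(dist_inputs):
    def step_fn(batch_xs, batch_ys, dset_index):
        with tf.GradientTape() as tape:
            preds = model(batch_xs, training=True)
            # sum the per-example losses and divide by the global batch size
            loss = tf.reduce_sum(loss_fn(batch_ys, preds)) / GLOBAL_BATCH_SIZE
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss
    per_replica_loss = strategy.run(step_fn, args=dist_inputs)
    return strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica_loss, axis=None)

for dist_inputs in train_dist_dataset:
    loss = train_step(dist_inputs)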
Without using the Enqueuer, here is another way, assuming you have a generator dg that yields samples in the form (feature, label) when called:
import tensorflow as tf
import numpy as np

def get_tf_data_Dataset(data_generator_settings_dict):
    length_req = data_generator_settings_dict["length"]
    x_d1 = data_generator_settings_dict["x_d1"]
    x_d2 = data_generator_settings_dict["x_d2"]
    x_d3 = data_generator_settings_dict["x_d3"]
    y_d1 = data_generator_settings_dict["y_d1"]
    y_d2 = data_generator_settings_dict["y_d2"]
    y_d3 = data_generator_settings_dict["y_d3"]
    # pre-allocate one array per sample for features and labels
    list_of_x_arrays = [np.zeros((x_d1, x_d2, x_d3)) for _ in range(length_req)]
    list_of_y_arrays = [np.zeros((y_d1, y_d2, y_d3)) for _ in range(length_req)]
    # exhaust the generator dg (defined elsewhere) and split samples into features and labels
    list_of_tuple_samples = [(x, y) for (x, y) in dg()]
    list_of_x_samples = [x for (x, y) in list_of_tuple_samples]
    list_of_y_samples = [y for (x, y) in list_of_tuple_samples]
    for sample_index in range(length_req):
        list_of_x_arrays[sample_index][:] = list_of_x_samples[sample_index]
        list_of_y_arrays[sample_index][:] = list_of_y_samples[sample_index]
    return tf.data.Dataset.from_tensor_slices((list_of_x_arrays, list_of_y_arrays))
It is convoluted but guaranteed to work. It also means that the __call__ method of dg is a for loop (after __init__, of course):
def __call__(self):
    for _ in range(self.length):
        # generate x (a single feature sample)
        # generate y (the single matching label sample)
        yield x, y
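To tie this back to the original question, the resulting dataset can then be batched and distributed exactly as in the first answer. A minimal sketch, assuming dg is already defined and using arbitrary dimensions and batch size:

settings = {"length": 1000,                       # assumed number of samples
            "x_d1": 128, "x_d2": 128, "x_d3": 3,  # assumed feature dimensions
            "y_d1": 128, "y_d2": 128, "y_d3": 1}  # assumed label dimensions

dataset = get_tf_data_Dataset(settings).batch(16)  # batch size chosen arbitrarily

strategy = tf.distribute.MirroredStrategy()
dist_dataset = strategy.experimental_distribute_dataset(dataset)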