Concatenate 2 TensorFlow datasets for model training

Question

I have 2 TensorFlow datasets containing (384, 384) images that I want to use as the input data and the label data for model.fit()

data = tf.keras.preprocessing.image_dataset_from_directory('path1', labels=None, image_size=(384,384), batch_size=1)
labels = tf.keras.preprocessing.image_dataset_from_directory('path2', labels=None, image_size=(384,384), batch_size=1)

But it does not let me pass both x and y as datasets.

model.fit(data, labels, epochs=5)

ValueError: y argument is not supported when using dataset as input.

What can I do in this case?

Answer 1

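This error indicates that when the first argument (data) is a dataset, model.fit() expects it to supply both the data and the labels (as a tuple), so it does not accept a separate y argument, which is where you passed labels.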

According to the documentation for image_dataset_from_directory, label_mode defaults to int if you do not specify it. The function then returns:

If label_mode is None, it yields float32 tensors of shape (batch_size, image_size[0], image_size[1], num_channels).
Otherwise, it yields a tuple (images, labels), where images has shape (batch_size, image_size[0], image_size[1], num_channels) and, if label_mode is int, labels are int32 tensors of shape (batch_size,).

Conclusion:

Just pass one more argument, label_mode, and set it to None, like this:

data = tf.keras.preprocessing.image_dataset_from_directory('path1', labels=None, label_mode=None, image_size=(384,384), batch_size=1)
labels = tf.keras.preprocessing.image_dataset_from_directory('path2', labels=None, label_mode=None, image_size=(384,384), batch_size=1)
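
With label_mode=None each dataset yields only image tensors, and model.fit() still needs a single dataset that yields (inputs, targets) tuples. One common way to get there is tf.data.Dataset.zip. The following is only a minimal sketch, not a definitive solution: the synthetic tensors stand in for the two image directories, and the one-layer model is a hypothetical placeholder. With real directories you would probably also want shuffle=False on both calls, since image_dataset_from_directory shuffles by default and the two datasets would otherwise not stay aligned.

```python
import tensorflow as tf

# Synthetic stand-ins for the two image datasets (assumption: in practice
# these would come from image_dataset_from_directory with shuffle=False).
inputs = tf.data.Dataset.from_tensor_slices(tf.zeros((2, 384, 384, 3))).batch(1)
targets = tf.data.Dataset.from_tensor_slices(tf.ones((2, 384, 384, 3))).batch(1)

# Pair the datasets element by element so each item is (input, target).
paired = tf.data.Dataset.zip((inputs, targets))

# Hypothetical image-to-image model, used only to show the fit() call.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(384, 384, 3)),
    tf.keras.layers.Conv2D(3, 3, padding="same"),
])
model.compile(optimizer="adam", loss="mse")

# Pass the single zipped dataset as x and leave y unset.
history = model.fit(paired, epochs=1, verbose=0)
```

Because the zipped dataset already carries the targets, no y argument is passed and the ValueError does not occur.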


tensorflow

deep-learning

tensorflow-datasets