tensorflow_datasets image transform with dataset.map

Question

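I am trying to load the cifar100 dataset with the tensorflow_datasets library in Python. After loading the data with tfds.load(), I call .map() to resize the images to a fixed size, but the lambda inside the map raises

TypeError: <lambda>() missing 2 required positional arguments: 'coarse_label' and 'label'

when I run my code.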

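What is the best way to transform these images while keeping the label information in the data? I'm not quite sure how the lambda function interacts with the dataset.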

This was done with tensorflow 2.0.0b1, tensorflow-datasets 1.0.2, and Python 3.7.3.

def transform_images(x_train, size):
    x_train = tf.image.resize(x_train, (size, size))
    x_train = x_train / 255
    return x_train

train_dataset = tfds.load(name="cifar100", split=tfds.Split.TRAIN)
train_dataset = train_dataset.map(lambda image, coarse_label, label: 
        (dataset.transform_images(image, FLAGS.size), coarse_label, label))

Answer 1

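Each element of your train_dataset is a dictionary, not a tuple, so .map() passes it to your function as a single argument. That is why a three-argument lambda such as lambda image, coarse_label, label fails.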

import tensorflow as tf
import tensorflow_datasets as tfds

train_dataset = tfds.load(name="cifar100", split=tfds.Split.TRAIN)
print(train_dataset.output_shapes)

# {'image': TensorShape([32, 32, 3]), 'label': TensorShape([]), 'coarse_label': TensorShape([])}
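The difference between dictionary and tuple elements can be reproduced without downloading cifar100. The following is a minimal sketch, assuming TensorFlow 2.x with eager execution; the random arrays and the names `features`, `dict_ds`, and `tuple_ds` are made up for illustration and only mimic the cifar100 feature keys.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for cifar100 (assumption): 4 random 32x32 RGB images
# plus the two label fields that tfds exposes for this dataset.
features = {
    "image": np.random.randint(0, 256, size=(4, 32, 32, 3), dtype=np.uint8),
    "coarse_label": np.arange(4, dtype=np.int64),
    "label": np.arange(4, dtype=np.int64) * 10,
}

# Dict elements: map() hands the whole dict to the function as ONE argument.
dict_ds = tf.data.Dataset.from_tensor_slices(features)
dict_labels_ds = dict_ds.map(lambda row: row["label"])

# Tuple elements: map() unpacks the tuple into one positional argument
# per component, so a three-argument lambda works here.
tuple_ds = tf.data.Dataset.from_tensor_slices(
    (features["image"], features["coarse_label"], features["label"])
)
tuple_labels_ds = tuple_ds.map(lambda image, coarse_label, label: label)

# Iterate eagerly and collect the labels from both pipelines.
dict_labels = [int(v.numpy()) for v in dict_labels_ds]
tuple_labels = [int(v.numpy()) for v in tuple_labels_ds]
```

Both pipelines yield the same labels; only the signature expected by the mapped function differs.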

Instead, take the dictionary as a single argument and index it by key:

def transform_images(row, size):
    # row is a dict with keys 'image', 'coarse_label', and 'label'
    x_train = tf.image.resize(row['image'], (size, size))
    x_train = x_train / 255  # scale pixel values to [0, 1]
    return x_train, row['coarse_label'], row['label']

train_dataset = train_dataset.map(lambda row: transform_images(row, 16))
print(train_dataset.output_shapes)

# (TensorShape([16, 16, 3]), TensorShape([]), TensorShape([]))
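If you would rather keep the dictionary structure so that later stages can still look fields up by key, the mapped function can return a dict instead of a tuple. This is a sketch rather than part of the original answer: `resize_row` and the synthetic `fake_rows` data are hypothetical names, and it assumes a TensorFlow 2.x release where `element_spec` is available for inspecting the element structure.

```python
import numpy as np
import tensorflow as tf


def resize_row(row, size):
    """Resize and rescale the image; pass the label fields through unchanged."""
    image = tf.image.resize(row["image"], (size, size)) / 255.0
    return {
        "image": image,
        "coarse_label": row["coarse_label"],
        "label": row["label"],
    }


# Synthetic stand-in for cifar100 rows (assumption, not real data).
fake_rows = {
    "image": np.random.randint(0, 256, size=(4, 32, 32, 3), dtype=np.uint8),
    "coarse_label": np.arange(4, dtype=np.int64),
    "label": np.arange(4, dtype=np.int64) * 10,
}

ds = tf.data.Dataset.from_tensor_slices(fake_rows)
ds = ds.map(lambda row: resize_row(row, 16))

# element_spec describes each element without iterating the dataset.
image_spec = ds.element_spec["image"]

# Pull one element to confirm the labels survived the transform.
first = next(iter(ds))
```

Returning a dict keeps the keys stable if more fields are added later, while the tuple form above is convenient when the dataset is passed straight to code that expects positional components.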

tensorflow

tensorflow-datasets