Apply a transformation model (data augmentation) to images in TensorFlow

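I'm new to Python and to some of the Sequential models in TensorFlow. I have a Sequential transformation model, shown below. It applies a few operations with random parameters to a given input image.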

import tensorflow as tf
from tensorflow.keras import layers

data_transformation = tf.keras.Sequential(
    [
        layers.Lambda(lambda x: my_random_brightness(x, 1, 20)),
        layers.GaussianNoise(stddev=tf.random.uniform(shape=(), minval=0, maxval=1)),
        layers.experimental.preprocessing.RandomRotation(
            factor=0.01,
            fill_mode="reflect",
            interpolation="bilinear",
            seed=None,
            name=None,
            fill_value=0.0,
        ),
        layers.experimental.preprocessing.RandomZoom(
            height_factor=(0.1, 0.2),
            width_factor=(0.1, 0.2),
            fill_mode="reflect",
            interpolation="bilinear",
            seed=None,
            name=None,
            fill_value=0.0,
        ),
    ]
)

The Lambda layer in this model calls a function defined as follows:

def my_random_brightness(
    image_to_be_transformed, brightness_factor_min, brightness_factor_max
):

    # build the brightness factor
    selected_brightness_factor = tf.random.uniform(
        (), minval=brightness_factor_min, maxval=brightness_factor_max
    )

    c0 = image_to_be_transformed[:, :, :, 0] + selected_brightness_factor
    c1 = image_to_be_transformed[:, :, :, 1] + selected_brightness_factor
    c2 = image_to_be_transformed[:, :, :, 2] + selected_brightness_factor

    image_to_be_transformed = tf.concat(
        [c0[..., tf.newaxis], image_to_be_transformed[:, :, :, 1:]], axis=-1
    )

    image_to_be_transformed = tf.concat(
        [
            image_to_be_transformed[:, :, :, 0][..., tf.newaxis],
            c1[..., tf.newaxis],
            image_to_be_transformed[:, :, :, 2][..., tf.newaxis],
        ],
        axis=-1,
    )

    image_to_be_transformed = tf.concat(
        [image_to_be_transformed[:, :, :, :2], c2[..., tf.newaxis]], axis=-1
    )

    return image_to_be_transformed

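Now suppose I want to apply this model to a batch containing just one image so that it outputs the result of these random operations, and I want to view and save that result. How can I do this? Is there a function like predict() or flow() that can output such a result?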

Edit: I tried result=data_transformation(image), but I got the following error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Index out of range using input dim 3; input has only 3 dims [Op:StridedSlice] name: sequential/lambda/strided_slice/

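Setting aside the correctness of the brightness layer (above), it is coded to take a batch of images, not a single image: it indexes four axes (batch, height, width, channel), while a single image has only three. That is why you get this error, which is expected. In this case, add a batch axis when passing a single image. That should work.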

result=data_transformation(image[None, ...])
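
To view and save the output, one option is to add the batch axis, call the model, drop the batch axis again, and write the result to disk. The following is only a sketch under stated assumptions: it uses a stand-in pipeline of built-in layers (under the names `layers.RandomRotation` and `layers.RandomZoom`, which assumes TF 2.6 or later), a random placeholder image, and a hypothetical output path. It passes training=True because layers such as GaussianNoise only act in training mode.

```python
import os
import tempfile

import tensorflow as tf
from tensorflow.keras import layers

# Stand-in pipeline (assumption: built-in layers only, TF >= 2.6 layer names).
augment = tf.keras.Sequential(
    [
        layers.GaussianNoise(stddev=0.5),
        layers.RandomRotation(factor=0.01, fill_mode="reflect"),
        layers.RandomZoom(
            height_factor=(0.1, 0.2), width_factor=(0.1, 0.2), fill_mode="reflect"
        ),
    ]
)

# Placeholder image with shape (height, width, channels) and values in [0, 255).
image = tf.random.uniform((64, 64, 3), minval=0.0, maxval=255.0)

# Add the batch axis -> (1, 64, 64, 3), then run the augmentations.
batch = image[tf.newaxis, ...]
augmented = augment(batch, training=True)

# Drop the batch axis again to get a single image back.
result = tf.squeeze(augmented, axis=0)

# Clip to the valid pixel range and cast to uint8 before encoding as PNG.
result_uint8 = tf.cast(tf.clip_by_value(result, 0.0, 255.0), tf.uint8)

# Hypothetical output location; any writable path would do.
out_path = os.path.join(tempfile.gettempdir(), "augmented.png")
tf.io.write_file(out_path, tf.io.encode_png(result_uint8))
```

The saved PNG can then be opened in any image viewer. Each call to the model draws new random parameters, so repeated calls on the same image should give different results.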

Also, when implementing a custom layer, prefer the subclassing approach.

class MyCustomBrightNess(layers.Layer):
    def __init__(self, brightness_factor_min, brightness_factor_max, **kwargs):
        super().__init__(**kwargs)
        self.brightness_factor_min = brightness_factor_min
        self.brightness_factor_max = brightness_factor_max

    def call(self, inputs):
        # Sample one brightness offset, shared by every pixel in the batch.
        selected_brightness_factor = tf.random.uniform(
            (), minval=self.brightness_factor_min, maxval=self.brightness_factor_max
        )

        # Add the offset to each channel of a (batch, height, width, 3) input.
        c0 = inputs[:, :, :, 0] + selected_brightness_factor
        c1 = inputs[:, :, :, 1] + selected_brightness_factor
        c2 = inputs[:, :, :, 2] + selected_brightness_factor

        # Put the shifted channels back one at a time.
        inputs = tf.concat(
            [c0[..., tf.newaxis], inputs[:, :, :, 1:]], axis=-1
        )

        inputs = tf.concat(
            [
                inputs[:, :, :, 0][..., tf.newaxis],
                c1[..., tf.newaxis],
                inputs[:, :, :, 2][..., tf.newaxis],
            ],
            axis=-1,
        )

        inputs = tf.concat(
            [inputs[:, :, :, :2], c2[..., tf.newaxis]], axis=-1
        )

        return inputs

    def get_config(self):
        # Expose the constructor arguments so the layer can be serialized.
        config = {
            "brightness_factor_min": self.brightness_factor_min,
            "brightness_factor_max": self.brightness_factor_max,
        }
        base_config = super().get_config()
        return dict(list(base_config.items()) + list(config.items()))
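
A subclassed layer like this can be called directly on a batch and rebuilt from its config. The sketch below is an illustration, not the layer above: `BrightnessShift` is a hypothetical compact variant that adds the sampled offset by broadcasting. For 3-channel input that should give the same result as shifting each channel and concatenating, since the same scalar is added to every channel.

```python
import tensorflow as tf
from tensorflow.keras import layers


class BrightnessShift(layers.Layer):
    """Hypothetical compact variant: adds one random scalar offset to every pixel."""

    def __init__(self, brightness_factor_min, brightness_factor_max, **kwargs):
        super().__init__(**kwargs)
        self.brightness_factor_min = brightness_factor_min
        self.brightness_factor_max = brightness_factor_max

    def call(self, inputs):
        # One offset per call, drawn from [min, max).
        offset = tf.random.uniform(
            (), minval=self.brightness_factor_min, maxval=self.brightness_factor_max
        )
        # Broadcasting adds the scalar to every channel of every pixel.
        return inputs + offset

    def get_config(self):
        config = super().get_config()
        config.update(
            {
                "brightness_factor_min": self.brightness_factor_min,
                "brightness_factor_max": self.brightness_factor_max,
            }
        )
        return config


layer = BrightnessShift(1.0, 20.0)

# A batch holding one all-zero 8x8 RGB image, so the output equals the offset.
batch = tf.zeros((1, 8, 8, 3))
shifted = layer(batch)

# Rebuild an equivalent layer from the serialized config.
clone = BrightnessShift.from_config(layer.get_config())
```

Because get_config returns the constructor arguments, from_config can recreate the layer, which is what makes a model containing it savable and loadable.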

I haven't carefully checked the correctness of this implementation. I'd suggest using tf.image.random_brightness or tf.image.adjust_brightness from the official implementation. Or, if you're using TensorFlow 2.9, say hello to the new KerasCV, where we can find a RandomBrightness layer.
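
As a rough illustration of the two official functions, here is a minimal sketch. It assumes a float image scaled to [0, 1], and the delta values are arbitrary choices for the example.

```python
import tensorflow as tf

# Placeholder float image in [0, 1] with every pixel set to 0.5.
image = tf.fill((8, 8, 3), 0.5)

# Deterministic shift: adds delta to every pixel.
brighter = tf.image.adjust_brightness(image, delta=0.1)

# Random shift: delta is drawn uniformly from [-max_delta, max_delta).
randomized = tf.image.random_brightness(image, max_delta=0.2)

# Keep the result inside the valid range for float images.
randomized = tf.clip_by_value(randomized, 0.0, 1.0)
```

Both functions also accept a batch of images, so either could be wrapped in a Lambda layer or a subclassed layer inside a Sequential pipeline.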