Image normalization with the tf.image.convert_image_dtype function

Question

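According to the documentation for tf.image.convert_image_dtype, "Images that are represented using floating point values are expected to have values in the range [0,1)."

However, in a Keras tutorial (https://keras.io/examples/vision/cutmix/) I saw the following preprocessing function: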

def preprocess_image(image, label):
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    image = tf.image.convert_image_dtype(image, tf.float32) / 255.0
    return image, label

My question is: why do they divide by 255 when tf.image.convert_image_dtype already does that job?

Answer 1

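In this function, convert_image_dtype(image, tf.float32) does not rescale anything. It scales values into [0, 1] only when converting from an integer dtype such as uint8 to a float dtype. Here tf.image.resize runs first, and with its default bilinear method it returns a float32 tensor whose values are still on the 0-255 scale. Because the image is already float32 at that point, convert_image_dtype returns it unchanged, and dividing by 255.0 is what actually moves the values into the [0, 1] range. We do this because that is the input range typically fed to convolutional layers.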
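The dtype-dependent behavior can be checked on a tiny tensor. The following is a minimal sketch, assuming TensorFlow 2.x and the default bilinear resize method; the sample pixel values and variable names are made up for illustration.

```python
import tensorflow as tf

# A tiny 1x2 RGB "image" stored as uint8 (illustrative values).
uint8_image = tf.constant([[[0, 128, 255], [143, 96, 70]]], dtype=tf.uint8)

# Integer -> float: convert_image_dtype rescales by the uint8 maximum (255).
from_uint8 = tf.image.convert_image_dtype(uint8_image, tf.float32)

# Bilinear resize returns float32, but the values stay on the 0-255 scale.
resized = tf.image.resize(uint8_image, (2, 4))

# float32 -> float32: convert_image_dtype leaves the values unchanged.
from_float = tf.image.convert_image_dtype(resized, tf.float32)

# Alternative ordering: convert first (while still uint8), then resize.
# No division by 255 is needed in this case.
converted_then_resized = tf.image.resize(from_uint8, (2, 4))

max_from_uint8 = float(tf.reduce_max(from_uint8))
max_resized = float(tf.reduce_max(resized))
max_alternative = float(tf.reduce_max(converted_then_resized))

print("uint8 -> float32 max:", max_from_uint8)
print("after resize, dtype:", resized.dtype.name, "max:", max_resized)
print("convert after resize changed values:",
      not bool(tf.reduce_all(tf.equal(from_float, resized))))
print("convert then resize max:", max_alternative)
```

If the conversion is applied before the resize, as in the last variant, the extra division is unnecessary; the tutorial's ordering is what makes it required.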

import tensorflow_datasets as tfds
import tensorflow as tf

dataset = tfds.load('cifar10', as_supervised=True, split='train').batch(1)

for image, label in dataset.take(1):
    print(image[0])
    
IMG_SIZE = 64
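def preprocess_image(image, label):
    # Bilinear resize returns float32, with values still on the 0-255 scale.
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    # The input is already float32 here, so only the division rescales it.
    image = tf.image.convert_image_dtype(image, tf.float32) / 255.0
    # Equivalent alternative:
    # image = tf.cast(image, tf.float32) / 255.0
    return image, label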

dataset = dataset.map(preprocess_image)
for image, label in dataset.take(1):
    print(image[0])

Output:

tf.Tensor(
[[[143  96  70]
  [141  96  72]
  [135  93  72]
  ...
  [212 177 147]
  [219 185 155]
  [221 187 157]]], shape=(32, 32, 3), dtype=uint8)


tf.Tensor(
[[[0.56078434 0.3764706  0.27450982]
  [0.5588235  0.3764706  0.2764706 ]
  [0.55490196 0.3764706  0.28039217]
  ...
  [0.8607843  0.72745097 0.6098039 ]
  [0.86470586 0.73137254 0.6137255 ]
  [0.8666667  0.73333335 0.6156863 ]]], shape=(64, 64, 3), dtype=float32)
