Expand DatasetV1Adapter grayscale image shape to 3 channels to make use of pretrained models

Question

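I want to use the pretrained MobileNetV2 model to classify the Binary Alpha Digits data. However, this data has shape (20, 16, 1) (grayscale, one channel) instead of the required (20, 16, 3) (3 RGB channels). I also have to resize it, because 20x16 is not a valid input size either, but I know how to do that. So my question is: how do I convert a 1-channel grayscale dataset (DatasetV1Adapter) to 3 channels?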

My code so far:

import tensorflow as tf
import os
import PIL
import numpy as np
import matplotlib.pyplot as plt
import tensorflow_datasets as tfds
import tensorflow_hub as hub

from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Dense, Conv2D, Flatten, Dropout, MaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator

import keras_preprocessing
from keras_preprocessing import image

from tensorflow.python.keras.utils.version_utils import training
from tensorflow.keras.optimizers import RMSprop
(raw_train, raw_test, raw_validation), metadata = tfds.load(
    'binary_alpha_digits',
    split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'],
    with_info=True,
    as_supervised=True,
)
IMG_SIZE = 96

def format_example(image, label):
  image = tf.cast(image, tf.float32)
  image = image*1/255.0
  image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
  return image, label

train = raw_train.map(format_example)
validation = raw_validation.map(format_example)
test = raw_test.map(format_example)

BATCH_SIZE = 32
SHUFFLE_BUFFER_SIZE = 1000

train_batches = train.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)
test_batches = test.batch(BATCH_SIZE)
validation_batches = validation.batch(BATCH_SIZE)

When I inspect train_batches, I get this output:

<DatasetV1Adapter shapes: ((None, 96, 96, 1), (None,)), types: (tf.float32, tf.int64)>

Later, when I try to fit the model, I get this error:

ValueError: The input must have 3 channels; got `input_shape=(96, 96, 1)`

So, following another post, I tried:

def load_image_into_numpy_array(image):
    # The function supports only grayscale images
    # assert len(image.shape) == 2, "Not a grayscale input image" 
    last_axis = -1
    dim_to_repeat = 2
    repeats = 3
    grscale_img_3dims = np.expand_dims(image, last_axis)
    training_image = np.repeat(grscale_img_3dims, repeats, dim_to_repeat).astype('uint8')
    assert len(training_image.shape) == 3
    assert training_image.shape[-1] == 3
    return training_image

train_mod=load_image_into_numpy_array(raw_train)

But I get an error:

AxisError: axis 2 is out of bounds for array of dimension 1

How can I get the data into input_shape=(96, 96, 3)?

Answer 1

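Happy to help. You can try the tf.image.grayscale_to_rgb API. You will have to use the map API to convert your dataset. I tried your code on Google Colab, and the code below works.

You can make the following changes to your code.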

def load_image_into_numpy_array(image):
    # Expects a grayscale image tensor whose last dimension is 1,
    # e.g. shape (20, 16, 1).
    # tf.image.grayscale_to_rgb copies that channel into three.
    training_image = tf.image.grayscale_to_rgb(image)
    assert len(training_image.shape) == 3
    assert training_image.shape[-1] == 3
    return training_image

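Then call map on the dataset as shown below. The lambda returns the label together with the converted image, so the dataset can still be used for training.

raw_train = raw_train.map(lambda x, y: (load_image_into_numpy_array(x), y))

Then you can check that the result has three channels (RGB).

for image, label in raw_train.take(1):
  print(image.shape)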
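
As a self-contained check of this approach, the sketch below runs the same kind of map call on a small synthetic dataset instead of downloading binary_alpha_digits. The random images, the label values, and the helper name to_rgb are assumptions made for illustration only.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for the real data: 8 binary images of shape (20, 16, 1).
images = np.random.randint(0, 2, size=(8, 20, 16, 1)).astype("uint8")
labels = np.arange(8, dtype="int64")
dataset = tf.data.Dataset.from_tensor_slices((images, labels))

def to_rgb(image, label):
    # Hypothetical helper: copy the single channel into three and keep the label.
    return tf.image.grayscale_to_rgb(image), label

rgb_dataset = dataset.map(to_rgb)
image_spec, label_spec = rgb_dataset.element_spec
print(image_spec.shape)  # (20, 16, 3)
```

Because the mapped function returns an (image, label) pair, the resulting dataset can be shuffled, batched, and passed to fit in the same way as the original one.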

I hope this answer helps.

Answer 2

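First, note that MobileNetV2 was trained on 8-bit color images (commonly called RGB), whereas the dataset you are trying to fine-tune on consists of 1-bit monochrome images (i.e., binary images). Because of this difference in data format, fine-tuning will not necessarily give good results.

That said, to do the conversion by repeating the channel dimension, you can simply use the tf.repeat function inside the function passed to map (i.e., format_example):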

image = tf.repeat(image, repeats=[3], axis=-1)
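
For context, here is a minimal sketch of where that line could sit inside format_example, applied to a dummy tensor rather than the real dataset. The dummy input is an assumption for illustration, and the scaling step from the question is left out to keep the focus on the channel step.

```python
import tensorflow as tf

IMG_SIZE = 96

def format_example(image, label):
    image = tf.cast(image, tf.float32)
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    # Repeat the single channel three times along the last axis.
    image = tf.repeat(image, repeats=3, axis=-1)
    return image, label

# Dummy (20, 16, 1) input standing in for one dataset element.
dummy_image = tf.ones((20, 16, 1), dtype=tf.uint8)
rgb_image, _ = format_example(dummy_image, tf.constant(0, dtype=tf.int64))
print(rgb_image.shape)  # (96, 96, 3)
```

All three output channels hold identical values, so for a single-channel input this should give the same result as tf.image.grayscale_to_rgb.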

Answer 3

At first glance, shouldn't

def format_example(image, label):
  image = tf.cast(image, tf.float32)
  image = image*1/255.0
  image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
  image = tf.image.grayscale_to_rgb(image)
  return image, label

be the simplest solution?
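
To see that a (96, 96, 3) batch is what the model expects, the sketch below feeds a random batch of that shape through MobileNetV2. The random batch is a stand-in for real data, and weights=None is an assumption chosen only to avoid downloading the ImageNet weights; a real fine-tuning setup would pass weights="imagenet".

```python
import tensorflow as tf

IMG_SIZE = 96

# Random stand-in batch with the shape produced after the channel conversion.
batch = tf.random.uniform((4, IMG_SIZE, IMG_SIZE, 3))

# weights=None builds the architecture without downloading pretrained weights.
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(IMG_SIZE, IMG_SIZE, 3),
    include_top=False,
    weights=None,
)
features = base_model(batch, training=False)
print(features.shape)  # (4, 3, 3, 1280)
```

With input_shape=(96, 96, 1) the same constructor call raises the ValueError quoted in the question, which is why the channel conversion has to happen before the data reaches the model.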

