Read image in tensorflow with indexed color value for semantic segmentation task

Question

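I am building an FCN for semantic segmentation. I am having trouble converting the labeled PNG images of the PascalVOC dataset into indexed color values. I want the values to be between 0 and 20. I can achieve this with PIL using the code below: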

import numpy as np
from PIL import Image

with Image.open(image_path) as img:
    label = np.array(img)

This outputs what I want. For the TensorFlow implementation, I expected the code below to produce the same values:

file = tf.read_file(image_path)
label = tf.image.decode_png(file, channels=0)

However, the TensorFlow implementation produces values between 0 and 255. Is there a way to reproduce the PIL behavior in TensorFlow? Thanks.

Answer 1

The SegmentationClass files contain a colormap component, so when using tf.image.decode_png() you need to specify:

label = tf.image.decode_png(file, channels=3)

Now that you have the RGB values, you can use the create_pascal_label_colormap() function to convert them to class IDs.
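
As a rough illustration of that conversion, here is a minimal pure-Python sketch. It assumes the usual PASCAL VOC bit-shuffling scheme for generating the palette. The names voc_colormap and rgb_to_class_ids are hypothetical helpers for this example and are not the create_pascal_label_colormap() function itself.

```python
def voc_colormap(num_colors=256):
    """Build the PASCAL VOC palette as a list of (r, g, b) tuples."""
    colormap = []
    for index in range(num_colors):
        r = g = b = 0
        c = index
        # Spread the bits of the class index over the high bits of R, G and B.
        for shift in range(7, -1, -1):
            r |= ((c >> 0) & 1) << shift
            g |= ((c >> 1) & 1) << shift
            b |= ((c >> 2) & 1) << shift
            c >>= 3
        colormap.append((r, g, b))
    return colormap


# Reverse lookup: RGB color -> class ID, for the 21 VOC classes (0..20).
COLOR_TO_ID = {color: i for i, color in enumerate(voc_colormap(21))}


def rgb_to_class_ids(rgb_rows, unknown=0):
    """Map a nested list of (r, g, b) pixels to class IDs.

    Colors missing from the palette get `unknown` (an assumption of this sketch).
    """
    return [[COLOR_TO_ID.get(tuple(px), unknown) for px in row] for row in rgb_rows]


pixels = [[(0, 0, 0), (128, 0, 0)],
          [(0, 64, 128), (192, 128, 128)]]
print(rgb_to_class_ids(pixels))  # [[0, 1], [20, 15]]
```

A dictionary lookup like this is fine for checking a handful of pixels. For whole images inside a tf.data pipeline, a vectorized comparison is likely the better fit.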

Answer 2

Assume you have the VOC 2012 file paths in a dataset of pairs (image_path: tf.string, image_label_path: tf.string).

First, define the color of each class:

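import tensorflow as tf

# Row i holds the RGB color of class i (0 = background, ..., 20 = tvmonitor).
_COLORS = tf.constant([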
  [0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0], [0, 0, 128],
  [128, 0, 128], [0, 128, 128], [128, 128, 128], [64, 0, 0], [192, 0, 0],
  [64, 128, 0], [192, 128, 0], [64, 0, 128], [192, 0, 128], [64, 128, 128],
  [192, 128, 128], [0, 64, 0], [128, 64, 0], [0, 192, 0], [128, 192, 0],
  [0, 64, 128]
], dtype=tf.int32)

The loading functions:

def load_images(image_path, label_path):
  image = tf.io.read_file(image_path)
  image = tf.io.decode_jpeg(image, channels=3)

  image_label = tf.io.read_file(label_path)
  image_label = tf.io.decode_png(image_label, channels=3)
  image_label = rgb_to_label(image_label)

  return image, image_label

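def rgb_to_label(segm):
  # (H, W, 3) -> (H, W, 3, 1) so it broadcasts against the (3, 21) color table.
  segm = tf.cast(segm[..., tf.newaxis], _COLORS.dtype)
  # A pixel matches class c only if all three channels agree; argmax picks that c.
  return tf.argmax(tf.reduce_all(segm == tf.transpose(_COLORS), axis=-2), axis=-1)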

Then apply the function:

ds = ds.map(load_images, num_parallel_calls=tf.data.AUTOTUNE)

Explanation:

segm[..., tf.newaxis].shape == (H, W, 3, 1), while
tf.transpose(_COLORS).shape == (3, 21).

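Comparing the two with == yields a tensor of shape (H, W, 3, 21). If pixel (h, w) of the segmentation mask matches the color of some class c, then all three channel intensities along the third axis (axis=-2) match for that class. Therefore tf.reduce_all(..., axis=-2) reduces it to an (H, W, 21) one-hot encoded tensor that contains True at the corresponding label index and False everywhere else.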

Finally, tf.argmax(..., axis=-1) finds the index itself for each pixel, producing an (H, W) image.
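
The shape bookkeeping above can be checked on a tiny example. The sketch below uses NumPy rather than TensorFlow, an assumption made so that it runs without a TF install. Here np.all and np.argmax stand in for tf.reduce_all and tf.argmax, and only the first three palette colors are used to keep it short.

```python
import numpy as np

# First three VOC colors only: background, class 1, class 2.
colors = np.array([[0, 0, 0], [128, 0, 0], [0, 128, 0]], dtype=np.int32)

# A 1x2 RGB mask: one pixel of class 2, one pixel of class 1.
segm = np.array([[[0, 128, 0], [128, 0, 0]]], dtype=np.uint8)

expanded = segm[..., np.newaxis].astype(colors.dtype)  # (1, 2, 3, 1)
matches = expanded == colors.T                          # (1, 2, 3, 3): per-channel match
one_hot = np.all(matches, axis=-2)                      # (1, 2, 3): all channels match
labels = np.argmax(one_hot, axis=-1)                    # (1, 2): class index per pixel

print(matches.shape, one_hot.shape, labels.shape)
print(labels.tolist())  # [[2, 1]]
```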

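It is worth mentioning that argmax defaults to 0 when every entry is False. As a result, pixels with an unknown color (one that is not present in the _COLORS map) are assigned label 0 (background).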
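
If folding unknown colors into background is undesirable, one option is to detect pixels that matched no class and give them a separate ignore label. PASCAL VOC masks, for instance, outline objects with a void color (224, 224, 192), stored at palette index 255, which is not in _COLORS. The following is a NumPy sketch of the idea and not part of the code above. The helper name rgb_to_label_with_ignore and the ignore value 255 are assumptions, and in TensorFlow the same steps would presumably use tf.reduce_any and tf.where.

```python
import numpy as np

IGNORE_LABEL = 255  # assumed convention for "no class"


def rgb_to_label_with_ignore(segm, colors, ignore_label=IGNORE_LABEL):
    """Map an (H, W, 3) RGB mask to (H, W) class IDs, marking unknown colors."""
    expanded = segm[..., np.newaxis].astype(colors.dtype)  # (H, W, 3, 1)
    one_hot = np.all(expanded == colors.T, axis=-2)         # (H, W, C)
    labels = np.argmax(one_hot, axis=-1)                    # 0 where nothing matched
    matched = np.any(one_hot, axis=-1)                      # (H, W) bool
    return np.where(matched, labels, ignore_label)


# Two-color table for brevity: background and class 1.
colors = np.array([[0, 0, 0], [128, 0, 0]], dtype=np.int32)
mask = np.array([[[128, 0, 0], [224, 224, 192], [0, 0, 0]]], dtype=np.uint8)
print(rgb_to_label_with_ignore(mask, colors).tolist())  # [[1, 255, 0]]
```

A loss function can then skip the pixels carrying the ignore value, so that boundary regions do not count as background during training.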


python

tensorflow

semantic-segmentation