How to create a bi-input TPU model for images?

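I want to convert my GPU model to a TPU model. My GPU model takes two input images, and the two images share the same output. I use a custom data generator for this. There are two parallel networks, one for each input.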

Starting from this, I tried to solve the problem, but I failed. Here is what I tried:

dataset_12 = tf.data.Dataset.from_tensor_slices((left_train_paths, right_train_paths))
dataset_label = tf.data.Dataset.from_tensor_slices(train_labels) 
dataset = tf.data.Dataset.zip((dataset_12, dataset_label)).batch(2).repeat()

The problem I am facing is that I cannot decode the two input images. Here is the decoder function:

def decode_image(filename, label=None, image_size=(IMG_SIZE_h, IMG_SIZE_w)):
    bits = tf.io.read_file(filename)
    image = tf.image.decode_jpeg(bits, channels=3)
    image = tf.cast(image, tf.float32) / 255.0
    image = tf.image.resize(image, image_size)
    
    # convert to numpy and do some cv2 stuff, maybe?
    
    if label is None:
        return image
    else:
        return image, label

The issue is that I cannot pass both images to the decoder function at the same time. How can I solve this?

I also tried decoding the images in the following way:

def decode(img, image_size=(IMG_SIZE_h, IMG_SIZE_w)):
    bits = tf.io.read_file(img)
    image = tf.image.decode_jpeg(bits, channels=3)
    image = tf.cast(image, tf.float32) / 255.0
    image = tf.image.resize(image, image_size)
    return image

def decode_image(left, right, labels=None):
    if labels is None:
        return decode(left), decode(right)
    else:
        return decode(left), decode(right), labels
    
image=tf.data.Dataset.from_tensor_slices((left_train_paths,right_train_paths,train_labels ))
dataset=image.map(decode_image, num_parallel_calls=AUTO).repeat().shuffle(512).batch(BATCH_SIZE).prefetch(AUTO)
dataset

The `dataset` variable now prints as <PrefetchDataset shapes: ((None, 760, 760, 3), (None, 760, 760, 3), (None, 8)), types: (tf.float32, tf.float32, tf.int64)>

How do I pass this to the model now?

The model:

def get_model():
    
    left_tensor = Input(shape=(IMG_SIZE_h,IMG_SIZE_w,3))
    right_tensor = Input(shape=(IMG_SIZE_h,IMG_SIZE_w,3))

    left_model =  EfficientNetB3(input_shape =  (img_shape,img_shape,3), include_top = False, weights = 'imagenet',input_tensor=left_tensor)
    right_model = EfficientNetB3(input_shape =  (img_shape,img_shape,3), include_top = False, weights = 'imagenet',input_tensor=right_tensor)
    con = concatenate([left_model.output, right_model.output])
    GAP= GlobalAveragePooling2D()(con)
    out = Dense(8, activation = 'sigmoid')(GAP)
    model = Model(inputs=[left_tensor, right_tensor], outputs=out)

    return model

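I found a fairly elegant solution. I will explain it step by step, because it may be a little different from what you had in mind:

  1. When decoding, stack the two images into a single tensor, so that each input tensor has shape [2, IMAGE_H, IMAGE_W, 3]: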
def decode_single(im_path, image_size):
    bits = tf.io.read_file(im_path)
    image = tf.image.decode_jpeg(bits, channels=3)
    image = tf.cast(image, tf.float32) / 255.0
    image = tf.image.resize(image, image_size)
    return image

# Note that the image paths are packed in a tuple, and we unpack them inside the function
def decode(paths, label=None, image_size=(128, 128)):
    image_path1, image_path2 = paths
    im1 = decode_single(image_path1, image_size)
    im2 = decode_single(image_path2, image_size)
    images = tf.stack([im1, im2])

    if label is not None:
        return images, label

    return images
  2. Declare the data pipeline so that the two paths are packed in a tuple:
label_ds = ...
ds = tf.data.Dataset.from_tensor_slices((left_paths, right_paths))
ds = tf.data.Dataset.zip((ds, label_ds)) # returns as ((im_path1, im_path2), label)) not (im_path1, im_path2, label)
ds = ds.map(decode).batch(4)
print(ds)
# Out: <BatchDataset shapes: ((None, 2, 128, 128, 3), ((None,),)), types: (tf.float32, (tf.int32,))>
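A quick way to check steps 1 and 2 without the real data is to run the same pipeline on a few throwaway JPEG files. The sketch below is only an illustration: the helper `write_dummy_jpeg`, the temporary directory, the random pixels, and the all-zero labels of shape (4, 8) are made up for the test and are not part of the original setup.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

IMAGE_SIZE = (128, 128)  # assumed target size, same default as in decode() above


def decode_single(im_path, image_size=IMAGE_SIZE):
    bits = tf.io.read_file(im_path)
    image = tf.image.decode_jpeg(bits, channels=3)
    image = tf.cast(image, tf.float32) / 255.0
    return tf.image.resize(image, image_size)


def decode(paths, label):
    # paths arrives as a (left_path, right_path) tuple
    left_path, right_path = paths
    images = tf.stack([decode_single(left_path), decode_single(right_path)])
    return images, label


def write_dummy_jpeg(path, height=40, width=60):
    # Hypothetical helper: writes a random RGB image so the pipeline has files to read
    pixels = np.random.randint(0, 256, size=(height, width, 3), dtype=np.uint8)
    tf.io.write_file(path, tf.io.encode_jpeg(pixels))


tmp_dir = tempfile.mkdtemp()
left_paths, right_paths = [], []
for i in range(4):
    for side, paths in (("left", left_paths), ("right", right_paths)):
        path = os.path.join(tmp_dir, f"{side}_{i}.jpg")
        write_dummy_jpeg(path)
        paths.append(path)
labels = np.zeros((4, 8), dtype=np.int64)  # placeholder labels with 8 outputs

path_ds = tf.data.Dataset.from_tensor_slices((left_paths, right_paths))
label_ds = tf.data.Dataset.from_tensor_slices(labels)
# zip yields ((left_path, right_path), label), which matches decode(paths, label)
ds = tf.data.Dataset.zip((path_ds, label_ds)).map(decode).batch(2)

images, batch_labels = next(iter(ds))
print(images.shape, batch_labels.shape)
```

Each batch should carry the pair of images along axis 1, which is the layout the single-input model in the next step expects.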
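  3. Because each batch now carries both images, with shape (None, 2, 128, 128, 3), declare the model with a single input of shape (2, HEIGHT, WIDTH, 3) and then split that input into the two images:
import tensorflow as tf
from tensorflow.keras.layers import (Concatenate, Conv2D, Dense,
                                     GlobalAveragePooling2D, Input, Lambda, Reshape)

def get_model():
    input_layer = Input(shape=(2, IMAGE_H, IMAGE_W, 3))
    # Split into two images; index 0 is the first stacked image (left), index 1 the second (right)
    left_image, right_image = Lambda(lambda x: tf.split(x, 2, axis=1))(input_layer)
    
    # Drop the leftover axis of size 1 produced by the split
    left_image = Reshape([IMAGE_H, IMAGE_W, 3])(left_image)
    right_image = Reshape([IMAGE_H, IMAGE_W, 3])(right_image)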
    # Replace by EfficientNets
    left_model =  Conv2D(64, 3)(left_image)
    right_model = Conv2D(64, 3)(right_image)
    con = Concatenate(-1)([left_model, right_model])
    GAP = GlobalAveragePooling2D()(con)
    out = Dense(8, activation = 'sigmoid')(GAP)
    model = tf.keras.Model(inputs=input_layer, outputs=out)

    return model
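If the two branches end up being different networks, it matters which half of the split is which. Since the decoder stacks the first path's image at index 0, a small experiment with dummy tensors can confirm the order. The names `first`, `second`, `pair`, and `batch` below are invented for this illustration.

```python
import tensorflow as tf

# Two distinguishable "images": the first is all zeros, the second all ones
first = tf.zeros([4, 4, 3])
second = tf.ones([4, 4, 3])
pair = tf.stack([first, second])       # shape (2, 4, 4, 3), same layout as decode()
batch = tf.expand_dims(pair, axis=0)   # shape (1, 2, 4, 4, 3), a batch of one pair

# tf.split keeps the split axis with size 1, so squeeze it away afterwards
part0, part1 = tf.split(batch, 2, axis=1)
part0 = tf.squeeze(part0, axis=1)      # shape (1, 4, 4, 3)
part1 = tf.squeeze(part1, axis=1)      # shape (1, 4, 4, 3)

print(part0.shape, float(tf.reduce_max(part0)))  # values come from `first`
print(part1.shape, float(tf.reduce_min(part1)))  # values come from `second`

# Plain indexing on axis 1 selects the same tensor as split followed by squeeze
same_as_indexing = bool(tf.reduce_all(batch[:, 0] == part0))
print(same_as_indexing)
```

So the first element returned by the split corresponds to the first image passed to `tf.stack`, and the second element to the second image.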
  4. Finally, compile and train the model as usual:
model = get_model()
model.compile(...)
model.fit(ds, epochs=10)
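The steps above do not show the TPU-specific part, so here is a minimal sketch of how the same single-input model could be built under a distribution strategy. It is an assumption-laden example, not the setup from the question: `get_strategy`, the 32x32 image size, the batch size of 8, the small `Conv2D` branches, and the random data are all placeholders. It uses plain indexing instead of `Lambda` plus `Reshape` to split the input, and it falls back to the default strategy when no TPU is found, which is expected to surface as a `ValueError` from the resolver.

```python
import numpy as np
import tensorflow as tf

IMAGE_H, IMAGE_W = 32, 32  # assumed small size so the sketch runs quickly
GLOBAL_BATCH = 8           # assumed global batch size


def get_strategy():
    """Return a TPUStrategy if a TPU is reachable, else the default strategy."""
    try:
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    except ValueError:
        # No TPU address could be resolved; run on CPU/GPU instead
        return tf.distribute.get_strategy()
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    return tf.distribute.TPUStrategy(resolver)


def get_model():
    inputs = tf.keras.Input(shape=(2, IMAGE_H, IMAGE_W, 3))
    # Index 0 is the first stacked image (left), index 1 the second (right)
    left_image, right_image = inputs[:, 0], inputs[:, 1]
    # Placeholder branches; replace with the real backbones
    left = tf.keras.layers.Conv2D(16, 3, activation="relu")(left_image)
    right = tf.keras.layers.Conv2D(16, 3, activation="relu")(right_image)
    merged = tf.keras.layers.Concatenate(axis=-1)([left, right])
    pooled = tf.keras.layers.GlobalAveragePooling2D()(merged)
    outputs = tf.keras.layers.Dense(8, activation="sigmoid")(pooled)
    return tf.keras.Model(inputs, outputs)


strategy = get_strategy()
with strategy.scope():
    # Model creation and compilation both happen inside the strategy scope
    model = get_model()
    model.compile(optimizer="adam", loss="binary_crossentropy")

# Random stand-in data with the stacked layout (N, 2, H, W, 3)
images = np.random.rand(16, 2, IMAGE_H, IMAGE_W, 3).astype("float32")
labels = np.random.randint(0, 2, size=(16, 8)).astype("float32")
ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .shuffle(16)
    .batch(GLOBAL_BATCH, drop_remainder=True)  # keeps batch shapes static
    .prefetch(tf.data.AUTOTUNE)
)
history = model.fit(ds, epochs=1, verbose=0)
```

With the real data, `ds` would be the dataset from step 2, batched with `drop_remainder=True` so that every batch has the same shape, which TPUs generally prefer.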