
Tensorflow: Convolutions with different filter for each sample in the mini-batch

I would like to have a 2D convolution with a filter that depends on the sample in the mini-batch in TensorFlow. Any ideas how to do that, especially if the number of samples per mini-batch is unknown?

Concretely, I have input data inp of the form MB x H x W x Channels, and I have filters F of the form MB x fh x fw x Channels x OutChannels.

Assume

inp = tf.placeholder('float', [None, H, W, channels_img], name='img_input').

I would like to do tf.nn.conv2d(inp, F, strides = [1,1,1,1]), but this is not allowed because F cannot have a mini-batch dimension. Any idea how to solve this?

The way to get around it is to add an extra dimension using

tf.expand_dims(inp, 0)

to create a 'fake' batch size. Then use the

tf.nn.conv3d()

operation, where the filter-depth matches the batch size. This will result in each filter convolving with only one sample in each batch.

Sadly, you will not get around the variable batch size problem this way, only the convolutions.
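For reference, here is a minimal sketch of the trick being described, with made-up sizes (the MB, H, W, channels and out_channels values below are placeholders, not part of the answer). As the next answer points out, the output shape looks right, but the values are not independent per-sample convolutions:

import tensorflow as tf

# Hypothetical sizes, only for illustration.
MB, H, W, channels, fh, fw, out_channels = 4, 16, 16, 3, 3, 3, 8

inp = tf.placeholder(tf.float32, [MB, H, W, channels])
F = tf.placeholder(tf.float32, [MB, fh, fw, channels, out_channels])

# Add a 'fake' batch dimension: the real batch becomes the depth axis.
inp_5d = tf.expand_dims(inp, 0)   # shape (1, MB, H, W, channels)

# conv3d filters have shape (depth, fh, fw, in_channels, out_channels);
# here the filter depth is made to match the batch size MB.
out = tf.nn.conv3d(inp_5d, F, strides=[1, 1, 1, 1, 1], padding='SAME')
# out has shape (1, MB, H, W, out_channels) -- the same count as the batch,
# but each depth slice mixes several batch elements, as explained below.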

I think the proposed trick is actually not right. What happens in a tf.nn.conv3d() layer is that the input gets convolved along the depth (= actual batch) dimension AND then summed over the resulting feature maps. With padding='SAME', the resulting number of outputs happens to coincide with the batch size, so one gets fooled!

EDIT: I think a possible way to do the convolution with different filters for the different mini-batch elements involves 'hacking' a depthwise convolution. Assuming the batch size MB is known:

inp = tf.placeholder(tf.float32, [MB, H, W, channels_img])

# F has shape (MB, fh, fw, channels, out_channels)
# REM: with the notation in the question, we need: channels_img==channels

F = tf.transpose(F, [1, 2, 0, 3, 4])
F = tf.reshape(F, [fh, fw, channels*MB, out_channels])

inp_r = tf.transpose(inp, [1, 2, 0, 3]) # shape (H, W, MB, channels_img)
inp_r = tf.reshape(inp_r, [1, H, W, MB*channels_img])

out = tf.nn.depthwise_conv2d(
          inp_r,
          filter=F,
          strides=[1, 1, 1, 1],
          padding='VALID') # here no requirement about padding being 'VALID', use whatever you want. 
# Now out shape is (1, H, W, MB*channels*out_channels)

out = tf.reshape(out, [H, W, MB, channels, out_channels]) # careful about the order of depthwise conv out_channels!
out = tf.transpose(out, [2, 0, 1, 3, 4])
out = tf.reduce_sum(out, axis=3)

# out shape is now (MB, H, W, out_channels)

If MB is unknown, it should be possible to determine it dynamically using tf.shape() (I think).

You can use tf.map_fn as follows:

inp = tf.placeholder(tf.float32, [None, h, w, c_in]) 
def single_conv(tupl):
    x, kernel = tupl
    return tf.nn.conv2d(x, kernel, strides=(1, 1, 1, 1), padding='VALID')
# Assume kernels shape is [tf.shape(inp)[0], fh, fw, c_in, c_out]
batch_wise_conv = tf.squeeze(tf.map_fn(
    single_conv, (tf.expand_dims(inp, 1), kernels), dtype=tf.float32),
    axis=1
)

It is important to specify dtype in map_fn. Basically, this solution defines batch_dim_size separate 2D convolution operations, one per sample.
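For completeness, a self-contained version of that snippet which can be run to check the output shape; the concrete sizes and the kernels placeholder are assumptions added here, not part of the answer:

import numpy as np
import tensorflow as tf

# Hypothetical sizes, only for a shape check.
h, w, c_in, c_out, fh, fw = 16, 16, 3, 8, 3, 3

inp = tf.placeholder(tf.float32, [None, h, w, c_in])
kernels = tf.placeholder(tf.float32, [None, fh, fw, c_in, c_out])  # one filter per sample

def single_conv(tupl):
    x, kernel = tupl
    return tf.nn.conv2d(x, kernel, strides=(1, 1, 1, 1), padding='VALID')

batch_wise_conv = tf.squeeze(tf.map_fn(
    single_conv, (tf.expand_dims(inp, 1), kernels), dtype=tf.float32),
    axis=1)

with tf.Session() as sess:
    res = sess.run(batch_wise_conv, feed_dict={
        inp: np.random.rand(5, h, w, c_in),
        kernels: np.random.rand(5, fh, fw, c_in, c_out)})
    print(res.shape)  # (5, h - fh + 1, w - fw + 1, c_out) with 'VALID' padding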

The accepted answer is slightly wrong in how it handles the dimensions, since they change with padding = "VALID" (it treats them as if padding = "SAME"). Hence, in the general case, the code will crash because of this mismatch. I attach the corrected code, in which both scenarios are treated properly.

inp = tf.placeholder(tf.float32, [MB, H, W, channels_img])

# F has shape (MB, fh, fw, channels, out_channels)
# REM: with the notation in the question, we need: channels_img==channels

F = tf.transpose(F, [1, 2, 0, 3, 4])
F = tf.reshape(F, [fh, fw, channels*MB, out_channels])

inp_r = tf.transpose(inp, [1, 2, 0, 3]) # shape (H, W, MB, channels_img)
inp_r = tf.reshape(inp_r, [1, H, W, MB*channels_img])

padding = "VALID" #or "SAME"
out = tf.nn.depthwise_conv2d(
          inp_r,
          filter=F,
          strides=[1, 1, 1, 1],
          padding=padding) # here no requirement about padding being 'VALID', use whatever you want. 
# Now out shape is (1, H-fh+1, W-fw+1, MB*channels*out_channels), because we used "VALID"

if padding == "SAME":
    out = tf.reshape(out, [H, W, MB, channels, out_channels])
if padding == "VALID":
    out = tf.reshape(out, [H-fh+1, W-fw+1, MB, channels, out_channels])
out = tf.transpose(out, [2, 0, 1, 3, 4])
out = tf.reduce_sum(out, axis=3)

# out shape is now (MB, H-fh+1, W-fw+1, out_channels)
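As a sanity check, the corrected recipe can be exercised end to end with made-up sizes and F fed as a placeholder (all of the concrete numbers below are assumptions, not part of the answer):

import numpy as np
import tensorflow as tf

# Hypothetical sizes, only to verify the shapes of the recipe above.
MB, H, W, channels, fh, fw, out_channels = 4, 10, 10, 3, 3, 3, 5
channels_img = channels

inp = tf.placeholder(tf.float32, [MB, H, W, channels_img])
F = tf.placeholder(tf.float32, [MB, fh, fw, channels, out_channels])

F_r = tf.reshape(tf.transpose(F, [1, 2, 0, 3, 4]), [fh, fw, channels * MB, out_channels])
inp_r = tf.reshape(tf.transpose(inp, [1, 2, 0, 3]), [1, H, W, MB * channels_img])

out = tf.nn.depthwise_conv2d(inp_r, filter=F_r, strides=[1, 1, 1, 1], padding="VALID")
out = tf.reshape(out, [H - fh + 1, W - fw + 1, MB, channels, out_channels])
out = tf.reduce_sum(tf.transpose(out, [2, 0, 1, 3, 4]), axis=3)

with tf.Session() as sess:
    res = sess.run(out, feed_dict={
        inp: np.random.rand(MB, H, W, channels_img),
        F: np.random.rand(MB, fh, fw, channels, out_channels)})
    print(res.shape)  # (4, 8, 8, 5), i.e. (MB, H - fh + 1, W - fw + 1, out_channels)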