TensorFlow: Unpooling

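Is there a native TensorFlow function that performs unpooling for deconvolutional networks?

I have written this in plain Python, but translating it to TensorFlow gets complicated, because its tensor objects do not even support item assignment at the moment. I think this is a major inconvenience with TF.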

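I don't think there is an official unpooling layer yet, which is frustrating, because you have to fall back on image resizing (bilinear interpolation or nearest neighbor). That behaves like an average unpooling operation, and it is really slow. Look at the 'image' section of the TF API and you will find it.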
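To show why a nearest-neighbor resize behaves like an "average" unpooling, here is a minimal NumPy sketch (not TensorFlow code). The helper names `upsample_nearest` and `avg_pool` are made up for this illustration, and it assumes NHWC layout with an integer scale factor. In TF 1.x the resize itself would be done with something like `tf.image.resize_nearest_neighbor`.

```python
import numpy as np

def upsample_nearest(x, k=2):
    """Nearest-neighbor upsampling of an NHWC array by an integer factor k."""
    return np.repeat(np.repeat(x, k, axis=1), k, axis=2)

def avg_pool(x, k=2):
    """Non-overlapping k x k average pooling of an NHWC array."""
    n, h, w, c = x.shape
    return x.reshape(n, h // k, k, w // k, k, c).mean(axis=(2, 4))

pooled = np.array([[1.0, 2.0],
                   [3.0, 4.0]]).reshape(1, 2, 2, 1)
up = upsample_nearest(pooled)   # shape (1, 4, 4, 1): every value copied into a 2x2 block
roundtrip = avg_pool(up)        # averaging the copies gives the pooled values back

print(up[0, :, :, 0])
print(np.array_equal(roundtrip, pooled))
```

Every value is spread evenly over its window, so the positions of the original maxima are not recovered. That is the information a true max unpooling would keep.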

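TensorFlow has tf.nn.max_pool_with_argmax, which gives you the max-pooled output together with the argmax indices of the maxima. That is nice, because you could use those indices in an unpooling layer to preserve the 'lost' spatial information, but there does not seem to be an unpooling operation that does this. I guess they are planning to add it ... soon.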
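To make the idea concrete, here is a small NumPy sketch of max pooling that records where each maximum came from, plus the matching unpooling step. It is only an illustration on a single 2-D feature map. The names `pool_with_argmax_np` and `unpool_np` are hypothetical, and the flat index convention used here (row * width + col) is an assumption that need not match TensorFlow's.

```python
import numpy as np

def pool_with_argmax_np(x, k=2):
    """k x k max pooling (stride k) on an (H, W) array.

    Returns the pooled values and, per window, the flat index
    (row * W + col) of the maximum inside the input.
    """
    h, w = x.shape
    # Rearrange so the last axis holds the k*k pixels of each window.
    windows = (x.reshape(h // k, k, w // k, k)
                .transpose(0, 2, 1, 3)
                .reshape(h // k, w // k, k * k))
    local = windows.argmax(axis=-1)            # position inside the window
    pooled = windows.max(axis=-1)
    rows = np.arange(h // k)[:, None] * k + local // k
    cols = np.arange(w // k)[None, :] * k + local % k
    return pooled, rows * w + cols

def unpool_np(pooled, argmax, shape):
    """Put every pooled value back at its recorded position, zeros elsewhere."""
    out = np.zeros(shape, dtype=pooled.dtype).ravel()
    out[argmax.ravel()] = pooled.ravel()
    return out.reshape(shape)

x = np.array([[1, 2, 0, 0],
              [3, 4, 0, 5],
              [0, 0, 7, 0],
              [6, 0, 0, 0]], dtype=np.float32)
pooled, argmax = pool_with_argmax_np(x)
restored = unpool_np(pooled, argmax, x.shape)
print(pooled)
print(restored)
```

The restored map has the four maxima at their original positions and zeros everywhere else, which is the behavior a resize-based workaround cannot reproduce.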

Edit: A week ago I found someone via Google who seems to have implemented something like this, but I haven't tried it myself yet. https://github.com/ppwwyyxx/tensorpack/blob/master/tensorpack/models/pool.py#L66

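I was looking for a max unpooling operation and tried implementing it. While struggling with CUDA, I came up with a somewhat hacky implementation for the gradient.

The code is here; you need to build it from source with GPU support. Below is a demo application. No warranties, though!

There is also an open issue for this operation.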

import tensorflow as tf
import numpy as np

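# NOTE: max_pool_with_argmax_and_mask and max_unpool are custom ops from the
# source build mentioned above; they are not part of stock TensorFlow.
def max_pool(inp, k=2):
    return tf.nn.max_pool_with_argmax_and_mask(inp, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding="SAME")

def max_unpool(inp, argmax, argmax_mask, k=2):
    return tf.nn.max_unpool(inp, argmax, argmax_mask, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding="SAME")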

def conv2d(inp, name):
    w = weights[name]
    b = biases[name]
    var = tf.nn.conv2d(inp, w, [1, 1, 1, 1], padding='SAME')
    var = tf.nn.bias_add(var, b)
    var = tf.nn.relu(var)
    return var

def conv2d_transpose(inp, name, dropout_prob):
    w = weights[name]
    b = biases[name]

    dims = inp.get_shape().dims[:3]
    dims.append(w.get_shape()[-2]) # adopt channels from weights (the deconv weight definition has input and output channels switched!)
    out_shape = tf.TensorShape(dims)

    var = tf.nn.conv2d_transpose(inp, w, out_shape, strides=[1, 1, 1, 1], padding="SAME")
    var = tf.nn.bias_add(var, b)
    if dropout_prob is not None:
        var = tf.nn.relu(var)
        var = tf.nn.dropout(var, dropout_prob)
    return var


weights = {
    "conv1":    tf.Variable(tf.random_normal([3, 3,  3, 16])),
    "conv2":    tf.Variable(tf.random_normal([3, 3, 16, 32])),
    "conv3":    tf.Variable(tf.random_normal([3, 3, 32, 32])),
    "deconv2":  tf.Variable(tf.random_normal([3, 3, 16, 32])),
    "deconv1":  tf.Variable(tf.random_normal([3, 3,  1, 16])) }

biases = {
    "conv1":    tf.Variable(tf.random_normal([16])),
    "conv2":    tf.Variable(tf.random_normal([32])),
    "conv3":    tf.Variable(tf.random_normal([32])),
    "deconv2":  tf.Variable(tf.random_normal([16])),
    "deconv1":  tf.Variable(tf.random_normal([ 1])) }


## Build Miniature CEDN
x = tf.placeholder(tf.float32, [12, 20, 20, 3])
y = tf.placeholder(tf.float32, [12, 20, 20, 1])
p = tf.placeholder(tf.float32)

conv1                                   = conv2d(x, "conv1")
maxp1, maxp1_argmax, maxp1_argmax_mask  = max_pool(conv1)

conv2                                   = conv2d(maxp1, "conv2")
maxp2, maxp2_argmax, maxp2_argmax_mask  = max_pool(conv2)

conv3                                   = conv2d(maxp2, "conv3")

maxup2                                  = max_unpool(conv3, maxp2_argmax, maxp2_argmax_mask)
deconv2                                 = conv2d_transpose(maxup2, "deconv2", p)

maxup1                                  = max_unpool(deconv2, maxp1_argmax, maxp1_argmax_mask)
deconv1                                 = conv2d_transpose(maxup1, "deconv1", None)


## Optimizing Stuff
loss        = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(deconv1, y))
optimizer   = tf.train.AdamOptimizer(learning_rate=1).minimize(loss)


## Test Data
np.random.seed(123)
batch_x = np.where(np.random.rand(12, 20, 20, 3) > 0.5, 1.0, -1.0)
batch_y = np.where(np.random.rand(12, 20, 20, 1) > 0.5, 1.0,  0.0)
prob    = 0.5


with tf.Session() as session:
    tf.set_random_seed(123)
    session.run(tf.initialize_all_variables())

    print("\n\n")
    for i in range(10):
        session.run(optimizer, feed_dict={x: batch_x, y: batch_y, p: prob})
        print("step %d" % (i + 1))
        print("loss %s\n\n" % session.run(loss, feed_dict={x: batch_x, y: batch_y, p: 1.0}))

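Edit 29.11.17

Some time ago I reimplemented it in a clean way against TensorFlow 1.0; the forward operations are also available as a CPU version. You can find it in this branch. If you want to use it, I recommend looking at the last few commits.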

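There are a couple of TensorFlow implementations here: pooling.py

Namely:

1) An unpool operation (source) that uses the output of tf.nn.max_pool_with_argmax. Note, however, that as of TensorFlow 1.0 tf.nn.max_pool_with_argmax is GPU-only.

2) An upsample operation that mimics the inverse of max pooling by filling the positions of the unpooled region with either zeros or copies of the max element. Compared to tensorpack, it allows copies of elements instead of zeros and supports strides other than [2, 2].

No recompile needed, and it is back-prop friendly.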
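As a rough picture of what the second operation does, here is a NumPy sketch of both fill modes with an arbitrary stride. It is not the code from pooling.py. The function name `upsample`, its `mode` argument, and the choice to place each value in the top-left corner of its window are assumptions made for this illustration.

```python
import numpy as np

def upsample(x, stride=(2, 2), mode="zeros"):
    """Approximate inverse of max pooling on an NHWC array.

    mode="zeros": each value goes to the top-left corner of its window,
                  the rest of the window is zero.
    mode="copy":  each value is replicated over its whole window.
    """
    sh, sw = stride
    n, h, w, c = x.shape
    if mode == "copy":
        return np.repeat(np.repeat(x, sh, axis=1), sw, axis=2)
    out = np.zeros((n, h * sh, w * sw, c), dtype=x.dtype)
    out[:, ::sh, ::sw, :] = x   # strided assignment hits one cell per window
    return out

x = np.arange(1, 5).reshape(1, 2, 2, 1)        # [[1, 2], [3, 4]]
zeros = upsample(x, stride=(2, 3), mode="zeros")
copies = upsample(x, stride=(2, 3), mode="copy")
print(zeros[0, :, :, 0])
print(copies[0, :, :, 0])
```

Both variants are plain indexing and repetition, so an equivalent graph built from standard TensorFlow ops would be differentiable without any custom kernel.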


I checked the implementation that shagas mentioned, and it works.

# UnPooling2x2ZeroFilled is the function from the tensorpack pool.py linked above
import numpy as np
import tensorflow as tf

x = [[[[1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3]],
      [[1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3]],
      [[1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3],
       [1, 1, 2, 2, 3, 3]]]]

x = np.array(x)  # shape (1, 3, 6, 6)

inp = tf.convert_to_tensor(x)

out = UnPooling2x2ZeroFilled(inp)

out
Out[19]: 
<tf.Tensor: id=36, shape=(1, 6, 12, 6), dtype=int64, numpy=
array([[[[1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0]],

        [[0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0]],

        [[1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0]],

        [[0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0]],

        [[1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0],
         [1, 1, 2, 2, 3, 3],
         [0, 0, 0, 0, 0, 0]],

        [[0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0]]]])>


out1 = tf.keras.layers.MaxPool2D()(out)

out1
Out[37]: 
<tf.Tensor: id=118, shape=(1, 3, 6, 6), dtype=int64, numpy=
array([[[[1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3]],

        [[1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3]],

        [[1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3],
         [1, 1, 2, 2, 3, 3]]]])>

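If you need max unpooling, then you can use this one (though I haven't checked it).

Here is my implementation. You should apply the max pooling with tf.nn.max_pool_with_argmax and then pass in the argmax result of tf.nn.max_pool_with_argmax.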
import tensorflow as tf

def unpooling(inputs, output_shape, argmax):
    """
    Performs unpooling, as explained in:
    https://www.oreilly.com/library/view/hands-on-convolutional-neural/9781789130331/6476c4d5-19f2-455f-8590-c6f99504b7a5.xhtml
    :param inputs: Input Tensor.
    :param output_shape: Desired output shape. For example, on 2D unpooling, this should be 4D (because of number of samples and channels).
    :param argmax: Result argmax from tf.nn.max_pool_with_argmax
        https://www.tensorflow.org/api_docs/python/tf/nn/max_pool_with_argmax
        The indices must address the fully flattened output; for batch sizes
        above 1 this presumably requires include_batch_in_index=True.
    """
    flat_output_shape = tf.cast(tf.reduce_prod(output_shape), tf.int64)

    # Flatten values and indices so each value is scattered to one flat position.
    updates = tf.reshape(inputs, [-1])
    indices = tf.expand_dims(tf.reshape(argmax, [-1]), axis=-1)

    ret = tf.scatter_nd(indices, updates, shape=[flat_output_shape])
    ret = tf.reshape(ret, output_shape)
    return ret

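This has a small bug/feature: if argmax contains a repeated value, the values are added together instead of being written just once. Beware of this if the stride is 1. I don't know, however, whether this behavior is desired or not.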
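The effect of repeated indices can be reproduced outside TensorFlow. The NumPy sketch below is only an analogy: `np.add.at` accumulates duplicates the way `tf.scatter_nd` does, while the second variant keeps one write per index. The index and value arrays are made-up examples of two overlapping windows that picked the same maximum.

```python
import numpy as np

indices = np.array([0, 3, 3])        # flat position 3 was the max of two windows
updates = np.array([1.0, 5.0, 5.0])  # so its value 5.0 shows up twice

# Accumulating scatter: duplicates are summed, position 3 ends up as 10.0.
summed = np.zeros(5)
np.add.at(summed, indices, updates)

# Write-once scatter: keep only the first update for every distinct index.
written_once = np.zeros(5)
uniq, first = np.unique(indices, return_index=True)
written_once[uniq] = updates[first]

print(summed)
print(written_once)
```

With a stride equal to the pool size the windows do not overlap, so no index repeats and both variants give the same result.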

There is now a TensorFlow Addons layer, MaxUnpooling2D:

Unpool the outputs of a maximum pooling operation.

tfa.layers.MaxUnpooling2D(
    pool_size: Union[int, Iterable[int]] = (2, 2),
    strides: Union[int, Iterable[int]] = (2, 2),
    padding: str = 'SAME',
    **kwargs
)

This class can be used, for example, like this:

import tensorflow as tf
import tensorflow_addons as tfa

inputs = tf.random.normal([1, 4, 4, 3])  # example NHWC batch (shape chosen for illustration)
pooling, max_index = tf.nn.max_pool_with_argmax(inputs, 2, 2, padding='SAME')
unpooling = tfa.layers.MaxUnpooling2D()(pooling, max_index)