Theano max_pool_3d

Question

How can I extend Theano's downsample.max_pool_2d_same_size so that it pools not only within feature maps but also across them, in an efficient way?

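Suppose I have 3 feature maps, each of size 10x10, i.e. a 4D tensor of shape (1,3,10,10). First, max pool each (10,10) feature map with a (2,2) window and no overlap. The result is 3 sparse feature maps, still (10,10), but with most values equal to zero: within each (2,2) window at most one value is greater than zero. This is what downsample.max_pool_2d_same_size does.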

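Next, I want to compare each maximum of a given (2,2) window with the maxima of the window at the same position in all other feature maps, and keep only the largest value across all feature maps. The result is again 3 feature maps of size (10,10), with almost all values equal to zero.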
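To make the target behaviour concrete, here is a minimal NumPy reference sketch of the two steps described above, collapsed into a single pass. It is not a Theano implementation and not an efficient GPU solution. The function name max_pool_same_size_across_maps is hypothetical, and the sketch assumes the spatial dimensions are divisible by the window size. On ties, every tied position is kept.

```python
import numpy as np

def max_pool_same_size_across_maps(x, pool=(2, 2)):
    """Zero everything except the maximum of each window taken over
    all feature maps (hypothetical helper, NumPy reference only)."""
    n, c, h, w = x.shape
    ph, pw = pool
    # Assumption: non-overlapping windows that tile the map exactly.
    assert h % ph == 0 and w % pw == 0
    # Split each spatial axis into (window index, offset inside window).
    blocks = x.reshape(n, c, h // ph, ph, w // pw, pw)
    # Maximum over the channel axis and both in-window offsets.
    window_max = blocks.max(axis=(1, 3, 5), keepdims=True)
    # Keep only entries equal to their window maximum (ties all survive).
    mask = blocks == window_max
    return (blocks * mask).reshape(n, c, h, w)

x = np.random.RandomState(0).rand(1, 3, 10, 10)
out = max_pool_same_size_across_maps(x)
print(out.shape, np.count_nonzero(out))
```

Pooling over the channel axis together with the two spatial axes is the same as 3D pooling with a window of (3,2,2) over the input viewed as a 5D tensor of shape (1,1,3,10,10).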

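Is there a fast way to do this? I don't mind using other max-pooling functions, but for pooling/unpooling purposes I need the exact positions of the maxima (though that is another topic).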

Answer 1

I solved this with Lasagne and cuDNN. Below are some minimal examples of how to obtain the indices of a max-pooling operation (2D and 3D). See https://groups.google.com/forum/#!topic/lasagne-users/BhtKsRmFei4

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.type import TensorType
from theano.configparser import config
import lasagne

def tensor5(name=None, dtype=None):
    # Build a symbolic 5D tensor variable with no broadcastable dimensions.
    if dtype is None:
        dtype = config.floatX
    tensor_type = TensorType(dtype, (False, False, False, False, False))
    return tensor_type(name)

def max_pooling_2d():
    input_var = T.tensor4('input')
    input_layer = lasagne.layers.InputLayer(shape=(None, 2, 4, 4), input_var=input_var)
    max_pool_layer = lasagne.layers.MaxPool2DLayer(input_layer, pool_size=(2, 2))

    pool_in, pool_out = lasagne.layers.get_output([input_layer, max_pool_layer])
    # Backpropagating ones through the pooling op gives a mask over the input:
    # nonzero at the position of each window's maximum, zero elsewhere.
    indices = T.grad(None, wrt=pool_in, known_grads={pool_out: T.ones_like(pool_out)})
    get_indices_fn = theano.function([input_var], indices, allow_input_downcast=True)

    data = np.random.randint(low=0, high=9, size=32).reshape((1, 2, 4, 4))
    indices = get_indices_fn(data)
    print(data, "\n\n", indices)

def max_pooling_3d():
    # The cuDNN layers are in a submodule that has to be imported explicitly.
    from lasagne.layers.dnn import MaxPool3DDNNLayer
    input_var = tensor5('input')
    input_layer = lasagne.layers.InputLayer(shape=(1, 1, 2, 4, 4), input_var=input_var)
    # 5 input dimensions: (batchsize, channels, 3 spatial dimensions)
    max_pool_layer = MaxPool3DDNNLayer(input_layer, pool_size=(2, 2, 2))

    pool_in, pool_out = lasagne.layers.get_output([input_layer, max_pool_layer])
    # Same gradient trick as in the 2D case, now over (2, 2, 2) windows.
    indices = T.grad(None, wrt=pool_in, known_grads={pool_out: T.ones_like(pool_out)})
    get_indices_fn = theano.function([input_var], indices, allow_input_downcast=True)

    data = np.random.randint(low=0, high=9, size=32).reshape((1, 1, 2, 4, 4))
    indices = get_indices_fn(data)
    print(data, "\n\n", indices)
