Standalone image patch extraction op in TensorFlow

Question

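In the TensorFlow docs, the tf.nn.conv2d op is described as follows:

1. Flattens the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].
2. Extracts image patches from the input tensor to form a tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].
3. For each patch, right-multiplies the filter matrix and the image patch vector.

Is there an op that applies just step 2? I can't find anything like it in the API docs. Perhaps I'm searching for the wrong keywords.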

Answer 1

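I think one trick to do this would be:

- Use a filter of shape [filter_height, filter_width, in_channels, output_channels] with output_channels = filter_height * filter_width * in_channels.
- Fix the values of this filter so that, once flattened to a 2-D matrix (see your step 1), it is the identity matrix. See the example code below for a simple way to do this with np.eye().reshape().
- Run tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME') as usual.

The output then has shape [batch, out_height, out_width, filter_height * filter_width * in_channels], and each output position holds the flattened patch around the corresponding input pixel.

Here is a simple example for an input image of size 3x3 with 1 channel (batch size 1).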

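import numpy as np
import tensorflow as tf

# 3x3 single-channel image holding the values 1..9, batch size 1 (NHWC layout).
input_value = np.arange(1, 10, dtype=np.float32).reshape((1, 3, 3, 1))
input = tf.constant(input_value)

# 9x9 identity matrix reshaped to [filter_height, filter_width, in_channels, output_channels],
# so that output channel k copies the k-th element of each 3x3 patch.
filter_value = np.eye(9, dtype=np.float32).reshape((3, 3, 1, 9))
filter = tf.constant(filter_value)

# Output shape is [1, 3, 3, 9]; with 'SAME' padding, patches at the border are zero-padded.
output = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')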
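To see why the identity filter works, the three steps from the question can be reproduced in plain NumPy. This is only a rough sketch, not how TensorFlow implements the op: the helper name extract_patches_same is hypothetical, and it assumes stride 1, odd filter sizes and zero padding.

```python
import numpy as np

def extract_patches_same(images, fh, fw):
    """Hypothetical helper: stride-1 patches with zero padding (odd fh and fw only).

    images:  [batch, height, width, channels]
    returns: [batch, height, width, fh * fw * channels]
    """
    b, h, w, c = images.shape
    ph, pw = fh // 2, fw // 2
    # Zero-pad height and width so every pixel gets a full fh x fw neighbourhood.
    padded = np.pad(images, ((0, 0), (ph, ph), (pw, pw), (0, 0)), mode="constant")
    patches = np.zeros((b, h, w, fh * fw * c), dtype=images.dtype)
    for i in range(h):
        for j in range(w):
            # Flatten each patch in (row, column, channel) order.
            patches[:, i, j, :] = padded[:, i:i + fh, j:j + fw, :].reshape(b, -1)
    return patches

images = np.arange(1, 10, dtype=np.float32).reshape((1, 3, 3, 1))

filter_matrix = np.eye(9, dtype=np.float32)    # step 1: flattened identity filter
patches = extract_patches_same(images, 3, 3)   # step 2: patch extraction
conv_output = patches @ filter_matrix          # step 3: multiply each patch vector

print(patches[0, 1, 1])                        # centre patch: the whole image, flattened
print(np.array_equal(conv_output, patches))    # the identity filter leaves patches unchanged
```

Because step 3 multiplies by the identity, the convolution output is exactly the tensor produced in step 2, which is what the trick relies on.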

Answer 2

This has now been added to the TensorFlow API: https://www.tensorflow.org/versions/r0.9/api_docs/python/array_ops.html#extract_image_patches
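A minimal sketch of the built-in op on the same 3x3 input, assuming TensorFlow 2.x, where it is exposed as tf.image.extract_patches. Older releases name it extract_image_patches and may use different argument names, so check the docs for your version.

```python
import numpy as np
import tensorflow as tf

# Same 3x3 single-channel image as above, batch size 1 (NHWC layout).
images = tf.constant(np.arange(1, 10, dtype=np.float32).reshape((1, 3, 3, 1)))

# Assumes the TensorFlow 2.x signature; sizes, strides and rates are all
# given as [1, rows, cols, 1].
patches = tf.image.extract_patches(
    images=images,
    sizes=[1, 3, 3, 1],
    strides=[1, 1, 1, 1],
    rates=[1, 1, 1, 1],
    padding='SAME',
)

print(patches.shape)      # [batch, out_height, out_width, 3 * 3 * 1]
print(patches[0, 1, 1])   # flattened patch centred on the middle pixel
```

With padding='SAME' the output keeps the input's spatial size and border patches are filled with zeros; padding='VALID' would return only the patches that fit entirely inside the image.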
