Why can't I train an OverFeat network with 1x1 convolution layers?

I am trying to modify the TensorFlow slim OverFeat network to classify small images: the image size is 60*60 and there are 3 classes. I am using TensorFlow v0.12 on Ubuntu 14.04 with a TITAN X GPU.

My first network is:



    import tensorflow as tf

    slim = tf.contrib.slim
    trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)


    def overfeat_arg_scope(weight_decay=0.0005):
      with slim.arg_scope(
          [slim.conv2d, slim.fully_connected],
          activation_fn=tf.nn.relu,
          weights_regularizer=slim.l2_regularizer(weight_decay),
          biases_initializer=tf.constant_initializer()):
        with slim.arg_scope([slim.conv2d], padding='SAME'):
          with slim.arg_scope([slim.max_pool2d], padding='VALID') as arg_sc:
            return arg_sc


    def overfeat(inputs,
                 num_classes=1000,
                 is_training=True,
                 dropout_keep_prob=0.5,
                 spatial_squeeze=False,
                 reuse=None,
                 scope='overfeat'):
      with tf.variable_scope(scope, 'overfeat', [inputs], reuse=reuse) as sc:
        end_points_collection = sc.name + '_end_points'
        # Collect outputs for conv2d, fully_connected and max_pool2d
        with slim.arg_scope([slim.conv2d, slim.fully_connected, slim.max_pool2d],
                            outputs_collections=end_points_collection):
          net = slim.conv2d(inputs, 64, 3, padding='VALID',
                            scope='conv11')
          net = slim.conv2d(net, 128, 3, padding='VALID',
                            scope='conv12')
          net = slim.max_pool2d(net, 2, scope='pool1')
          net = slim.conv2d(net, 128, 3, padding='VALID', scope='conv2')
          net = slim.max_pool2d(net, 2, scope='pool2')
          net = slim.conv2d(net, 256, 3, scope='conv3')
          net = slim.conv2d(net, 256, 3, scope='conv4')
          net = slim.conv2d(net, 256, 3, scope='conv5')
          net = slim.max_pool2d(net, 2, scope='pool5')
          with slim.arg_scope([slim.conv2d],
                              weights_initializer=trunc_normal(0.005),
                              biases_initializer=tf.constant_initializer(0.1)):
            # Use conv2d instead of fully_connected layers.
            net = slim.conv2d(net, 512, 3, padding='VALID', scope='fc6')
            net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                               scope='dropout6')
            net = slim.conv2d(net, 1024, 1, scope='fc7')
            with tf.variable_scope('Logits'):
              #pylint: disable=no-member
              if is_training:
                net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID',
                                      scope='AvgPool_1a_8x8')
              net = slim.conv2d(net, num_classes, 1,
                                activation_fn=None,
                                normalizer_fn=None,
                                biases_initializer=tf.constant_initializer(),
                                scope='fc9')
              net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                                 scope='Dropout')
          # Convert end_points_collection into an end_point dict.
          end_points = slim.utils.convert_collection_to_dict(end_points_collection)
          if spatial_squeeze:
            net = tf.squeeze(net, [1, 2], name='fc8/squeezed')
            end_points[sc.name + '/fc8'] = net
          return net, end_points

    def inference(images, num_classes, keep_probability, phase_train=True, weight_decay=0.0, reuse=None):
        batch_norm_params = {
            # Decay for the moving averages.
            'decay': 0.995,
            # epsilon to prevent 0s in variance.
            'epsilon': 0.001,
            # force in-place updates of mean and variance estimates
            'updates_collections': None,
        }
        with slim.arg_scope(overfeat_arg_scope()):
            return overfeat(images, num_classes, is_training=phase_train,
                  dropout_keep_prob=keep_probability, reuse=reuse)

I compute the cross-entropy loss with the tf.nn.sparse_softmax_cross_entropy_with_logits function.
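
For reference, the loss is computed roughly like this (a minimal sketch; logits_4d and labels are illustrative names, and labels is assumed to hold integer class indices of shape [batch_size]):

    # logits come out of the network as [batch, 1, 1, num_classes] after the
    # average pooling, so squeeze them down to [batch, num_classes]
    logits = tf.squeeze(logits_4d, [1, 2])

    # labels: int32/int64 class indices of shape [batch]
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels)
    loss = tf.reduce_mean(cross_entropy)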

The training result: Loss And Accuracy with one 1x1 Conv

This result is acceptable. I then tried adding one more 1x1 conv after fc7, because I thought a 1x1 conv is the same as a fully connected layer and might improve the accuracy.


        ...
        net = slim.conv2d(net, 1024, 1, scope='fc7')
        net = slim.conv2d(net, 1024, 1, scope='fc7_1')
        ...

But I got an unreliable result: Loss And Accuracy with two 1x1 Conv

This network is not being optimized; the loss stays at 1.

Why can't I add more 1x1 conv or fc layers?

How can I improve this network?

A 1x1 convolution is not the same as a fully connected layer. The parameter counts alone are vastly different. A 1x1 convolution is a weighted sum, taken over all previous filter outputs, of the pixels that sit at the same position in the image.

A fully connected layer, on the other hand, takes every pixel of every filter into account for each new pixel of the current layer.
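
To make the parameter-count difference concrete, here is a back-of-the-envelope comparison for a hypothetical 6x6 feature map with 1024 channels feeding 1024 output units (the sizes are illustrative):

    # hypothetical feature map: height=6, width=6, channels=1024
    h, w, c, n_out = 6, 6, 1024, 1024

    # 1x1 conv: one weight per input channel per output channel, plus biases
    params_conv_1x1 = 1 * 1 * c * n_out + n_out   # ~1.05M parameters

    # fully connected over the whole map: every pixel of every channel
    # connects to every output unit, plus biases
    params_fc = h * w * c * n_out + n_out         # ~37.7M parameters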

You should look at http://deeplearning.net/tutorial/contents.html to get a better understanding of convolutions.

For the last layer, a fully connected layer is used to combine the features extracted in the previous layers into the final output.

A (1,1) convolution layer is not a fully connected layer. If you want to implement a fully connected layer as a convolution layer, the kernel size of that layer has to match the spatial size of the previous layer's output.

(If the feature map of the previous layer is 50x50, the kernel of the last layer should be 50x50.) A convolution layer with a (1,1) kernel size is more like an MLP layer. If you want to learn more about how this works, read the paper Network in Network.
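
In slim terms, that equivalence might be sketched like this (assuming net is the incoming feature map; the scope name is made up):

    # a "fully connected" layer written as a convolution: the kernel covers
    # the whole feature map, so the output has spatial size 1x1
    k_h, k_w = net.get_shape().as_list()[1:3]
    net = slim.conv2d(net, 512, [k_h, k_w], padding='VALID', scope='fc_as_conv')
    # output shape: [batch, 1, 1, 512] -- one activation per unit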

If I read your question correctly, you want to get rid of the fully connected layers. To do that, you have to do two things (a short sketch follows the list):

  • Reduce the last layer to the number of classes by using a (1,1) convolution layer whose number of channels equals the number of output classes.
  • Use global average pooling to reduce each feature map to a single value, and then feed the result to a softmax.
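
Put together, a minimal sketch of such a head in slim (a Network in Network style classifier; the names are illustrative, with num_classes=3 in your case):

    # 1x1 conv that maps the channels down to one map per class
    net = slim.conv2d(net, num_classes, 1, activation_fn=None,
                      normalizer_fn=None, scope='class_maps')
    # global average pooling: each class map collapses to a single score
    logits = tf.reduce_mean(net, [1, 2], name='global_avg_pool')
    # logits now has shape [batch, num_classes]; feed it to the softmax loss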