Why does data augmentation not improve texture classification accuracy in CNNs?

Question

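I am currently using AlexNet for a classification task.

Each input sample is 480*680 in size.

With the plain network, fed cropped inputs of size 256*256 (generated in the preprocessing step) with a batch size of 8, I get 92% accuracy.

However, when I try to generate 5 crops (four corners plus center) for each 480*680 sample using the following crop layers: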

# this is the reference blob of the cropping process which determines cropping size
layer {
  name: "reference-blob"
  type: "Input"
  top: "reference"
  input_param { shape: { dim: 8 dim: 3 dim: 227 dim: 227 } }
}
# upper-left crop
layer{
    name: "crop-1"
    type: "Crop"
    bottom: "data"
    bottom: "reference"
    top: "crop-1"
    crop_param {
        axis: 2
        offset: 1
        offset: 1
    }
}
# upper-right crop
layer{
    name: "crop-2"
    type: "Crop"
    bottom: "data"
    bottom: "reference"
    top: "crop-2"
    crop_param {
        axis: 2
        offset: 1
        offset: 412
    }
}
# lower-left crop
layer{
    name: "crop-3"
    type: "Crop"
    bottom: "data"
    bottom: "reference"
    top: "crop-3"
    crop_param {
        axis: 2
        offset: 252
        offset: 1
    }
}
# lower-right crop
layer{
    name: "crop-4"
    type: "Crop"
    bottom: "data"
    bottom: "reference"
    top: "crop-4"
    crop_param {
        axis: 2
        offset: 252
        offset: 412
    }
}
# center crop
layer{
    name: "crop-5"
    type: "Crop"
    bottom: "data"
    bottom: "reference"
    top: "crop-5"
    crop_param {
        axis: 2
        offset: 127
        offset: 207
    }
}
# concat all the crop results to feed the next layer
layer{
    name: "crop_concat"
    type: "Concat"
    bottom: "crop-1"
    bottom: "crop-2"
    bottom: "crop-3"
    bottom: "crop-4"
    bottom: "crop-5"
    top: "all_crops"
    concat_param {
        axis: 0
    }
}
# generating enough labels for all the crop results
layer{
    name: "label_concat"
    type: "Concat"
    bottom: "label"
    bottom: "label"
    bottom: "label"
    bottom: "label"
    bottom: "label"
    top: "all-labels"
    concat_param {
        axis: 0
    }
}
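As a cross-check on the offsets used in the crop layers above, here is a minimal Python sketch of the usual corner-plus-center arithmetic. The function name `five_crop_offsets` is hypothetical, and it assumes the blob layout is N x C x H x W with 480 as the height and 680 as the width, so that with `axis: 2` the first `offset` applies to the height and the second to the width.

```python
def five_crop_offsets(height, width, crop):
    """Return (row, col) offsets of the four corner crops and the center crop.

    Hypothetical helper for illustration; offsets are zero-based.
    """
    if crop > height or crop > width:
        raise ValueError("crop size must not exceed the image size")
    max_row = height - crop  # largest valid offset along the height axis
    max_col = width - crop   # largest valid offset along the width axis
    return {
        "upper-left": (0, 0),
        "upper-right": (0, max_col),
        "lower-left": (max_row, 0),
        "lower-right": (max_row, max_col),
        "center": (max_row // 2, max_col // 2),
    }


# Assumed shape: height 480, width 680, crop size 227 (from the reference blob).
offsets = five_crop_offsets(480, 680, 227)
for name, (row, col) in offsets.items():
    print(f"{name}: offset {row}, offset {col}")
```

Under these assumptions the largest valid width offset is 453, whereas the layers above use 412 (and 207 for the center), which would fit a width of about 640. That may be deliberate, but it could be worth confirming the actual shape of the `data` blob.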

This resulted in an accuracy of 90.6%, which is strange.

Any ideas?

Answer 1

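The typical use of cropped versions is to get key features into a canonical position for the recognition filters. For example, the typical 5-crop approach finds "animal face near the middle of the image" often enough that it emerges as a learned icon 2-4 layers from the end.

Because a texture tends to repeat certain qualities, cropping the photos offers no such advantage: you present 5 smaller instances of the texture, with relatively larger grain, rather than the full image.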
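To put rough numbers on "smaller instances with relatively larger grain", the following sketch computes how much of the image a single crop covers and how much coarser the texture looks inside it. It assumes the sizes from the question (480*680 image, 227*227 crop) and, as a point of comparison only, a full image shrunk so that its shorter side matches the crop size. The helper `crop_coverage` is hypothetical.

```python
def crop_coverage(height, width, crop):
    """Return (area_fraction, grain_ratio) for one square crop.

    area_fraction: share of the image area that a single crop covers.
    grain_ratio: how much larger texture grain appears in a native-scale
    crop than in the whole image shrunk so its shorter side equals `crop`
    (an assumed baseline, not necessarily the asker's preprocessing).
    """
    area_fraction = (crop * crop) / (height * width)
    grain_ratio = min(height, width) / crop
    return area_fraction, grain_ratio


area_fraction, grain_ratio = crop_coverage(480, 680, 227)
print(f"each crop covers {area_fraction:.1%} of the image")
print(f"grain appears {grain_ratio:.2f}x larger than in a resized full image")
```

Each crop therefore shows the network only a small part of the pattern at a coarser relative scale, which is consistent with the answer's point that cropping adds little for repetitive textures.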

neural-network

deep-learning

caffe

conv-neural-network