使用 AlexNet 训练 MNIST

Question

我是 Caffe 初学者。我已经用 LeNet 完成了 MNIST 的训练，用 AlexNet 完成了 ImageNet 的训练，然后是教程，并获得了很好的结果。然后我尝试用 AlexNet 模型训练 MNIST。火车模型几乎与 models/bvlc_alexnet/train_val.prototxt 相同，但在某些地方发生了变化，例如：

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
 }
  transform_param {
    mirror: false` <--------------- set to false,  and delete crop_size and  mean_file 

  }
  data_param {
    source: "./mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}

......

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
 }
  transform_param {
    mirror: false  <-------- set to false,  and delete crop_size and  mean_file         
 }
  data_param {
    source: "./mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}

......

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size:3    <--------------------  changed to 3
    stride: 2            <--------------------  changed to 2
   weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
     value: 0
    }
  }
}

......

layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 10      <--------------------  changed to 10

 `   weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

而solver.prototxt如下

net: "./train_val.prototxt"
test_iter: 1000
test_interval: 100
base_lr: 0.01
lr_policy: "inv"
power: 0.75
gamma: 0.1
stepsize: 1000
display: 100
max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "./caffe_alexnet_train"
solver_mode: GPU

经过10万次迭代训练，准确率达到0.97左右

I0315 19:28:54.827383 26505 solver.cpp:258]     Train net output #0: loss = 0.0331752 (* 1 = 0.0331752 loss)`

`......`

I0315 19:28:56.384718 26505 solver.cpp:351] Iteration 100000, Testing net (#0)
I0315 19:28:58.121800 26505 solver.cpp:418]     Test net output #0: accuracy = 0.974875
I0315 19:28:58.121834 26505 solver.cpp:418]     Test net output #1: loss = 0.0804802 (* 1 = 0.0804802 loss)

然后我用python脚本预测了测试集中的单张图片

import os
import sys
import numpy as np
import matplotlib.pyplot as plt
import caffe

caffe_root = '/home/ubuntu/pkg/local/caffe'

sys.path.insert(0, caffe_root + 'python')
MODEL_FILE = './deploy.prototxt' 

PRETRAINED = './caffe_alexnet_train_iter_100000.caffemodel'

IMAGE_FILE = './4307.png'

input_image = caffe.io.load_image(IMAGE_FILE, color=False)

net = caffe.Classifier(MODEL_FILE, PRETRAINED)

prediction = net.predict([input_image], oversample = False)

caffe.set_mode_cpu()

print( 'predicted class: ', prediction[0].argmax() )
print( 'predicted class all: ', prediction[0] )

但是预测错了。（这个脚本在 MNIST 和 LeNet 上的预测很好） 并且每个class的概率也是奇数

predicted class:  9   <------------- the correct label is 5

predicted class all:  [0.01998338 0.14941786 0.09392905 0.07361069 0.07640345 0.10996494 0.03646726 0.12371133 0.15246753 0.16404454]

** deploy.prototxt 与 models/bvlc_alexnet/deploy.prototxt 几乎相同，但在 train_val.prototxt

中更改了相同的地方

有什么建议吗？

Answer 1

AlexNet 旨在区分 1000 个类，对 1.3M 输入图像进行训练，每个输入图像（规范地）256x256x3 数据值。您基本上使用相同的工具来处理 10 类和 28x28x1 输入。

很简单，您是 over-fitting 设计。

如果您想使用一般的 AlexNet 设计来处理 far-simpler 作业，您需要适当地缩小它的规模。需要进行一些实验才能找到 "appropriately" 的可行定义：通过某种因素缩小 conv 层，添加 drop-out，完全删除一个 conv inception，...

使用 AlexNet 训练 MNIST

Train MNIST with AlexNet

mnist

caffe