Train MNIST with AlexNet
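I am a Caffe beginner. I have trained MNIST with LeNet and ImageNet with AlexNet by following the tutorials, and got good results. Then I tried to train MNIST with the AlexNet model. The training model is almost the same as models/bvlc_alexnet/train_val.prototxt, with changes in a few places, such as: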
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: false  # set to false; crop_size and mean_file deleted
}
data_param {
source: "./mnist_train_lmdb"
batch_size: 64
backend: LMDB
}
}
......
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false  # set to false; crop_size and mean_file deleted
}
data_param {
source: "./mnist_train_lmdb"
batch_size: 64
backend: LMDB
}
}
......
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 3  # changed to 3
stride: 2  # changed to 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
......
layer {
name: "fc8"
type: "InnerProduct"
bottom: "fc7"
top: "fc8"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 10  # changed to 10
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
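To see what the conv1 change does to a 28x28 MNIST image, the feature-map sizes can be traced by hand. The sketch below is only an illustration: conv1 uses the kernel 3 / stride 2 values shown above, while every other layer setting is an assumption taken from the stock bvlc_alexnet definition, since those layers are elided here. The helper names (`conv_out`, `pool_out`, `trace`) are made up for this example.

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    """Caffe convolution output size: floor((size + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride=1):
    """Caffe pooling output size (no padding): ceil((size - kernel) / stride) + 1."""
    return math.ceil((size - kernel) / stride) + 1

def trace(size):
    """Return (layer name, spatial size) pairs for a square input."""
    # conv1 comes from the question; all other settings are ASSUMED to be
    # the unmodified bvlc_alexnet values.
    layers = [
        ("conv1", conv_out, dict(kernel=3, stride=2)),
        ("pool1", pool_out, dict(kernel=3, stride=2)),
        ("conv2", conv_out, dict(kernel=5, pad=2)),
        ("pool2", pool_out, dict(kernel=3, stride=2)),
        ("conv3", conv_out, dict(kernel=3, pad=1)),
        ("conv4", conv_out, dict(kernel=3, pad=1)),
        ("conv5", conv_out, dict(kernel=3, pad=1)),
        ("pool5", pool_out, dict(kernel=3, stride=2)),
    ]
    shapes = []
    for name, fn, kwargs in layers:
        size = fn(size, **kwargs)
        shapes.append((name, size))
    return shapes

shapes = trace(28)
for name, size in shapes:
    print(f"{name}: {size}x{size}")
```

Under these assumptions the spatial map collapses to a single position by pool5, so the fully connected layers would see very little spatial information.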
The solver.prototxt is as follows:
net: "./train_val.prototxt"
test_iter: 1000
test_interval: 100
base_lr: 0.01
lr_policy: "inv"
power: 0.75
gamma: 0.1
stepsize: 1000
display: 100
max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "./caffe_alexnet_train"
solver_mode: GPU
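As a side note on the schedule, Caffe's "inv" policy computes the rate as base_lr * (1 + gamma * iter) ^ (-power). The short sketch below evaluates that formula with the values from this solver; the function name `inv_lr` is made up for the illustration. As far as I know, `stepsize` is only read by the "step" policy, so it likely has no effect here.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.1, power=0.75):
    """Learning rate under Caffe's "inv" policy at a given iteration."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)

# Print the rate at a few points along the 100,000-iteration run.
for it in (0, 100, 1000, 10000, 100000):
    print(f"iter {it:>6}: lr = {inv_lr(it):.3e}")
```

With gamma = 0.1 the rate falls quickly: by the final iteration it is about three orders of magnitude below base_lr.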
After 100,000 training iterations, the accuracy reaches about 0.97:
I0315 19:28:54.827383 26505 solver.cpp:258] Train net output #0: loss = 0.0331752 (* 1 = 0.0331752 loss)
......
I0315 19:28:56.384718 26505 solver.cpp:351] Iteration 100000, Testing net (#0)
I0315 19:28:58.121800 26505 solver.cpp:418] Test net output #0: accuracy = 0.974875
I0315 19:28:58.121834 26505 solver.cpp:418] Test net output #1: loss = 0.0804802 (* 1 = 0.0804802 loss)
Then I used a Python script to predict a single image from the test set:
import os
import sys

# Make pycaffe importable before importing it.
caffe_root = '/home/ubuntu/pkg/local/caffe'
sys.path.insert(0, os.path.join(caffe_root, 'python'))

import numpy as np
import matplotlib.pyplot as plt
import caffe

MODEL_FILE = './deploy.prototxt'
PRETRAINED = './caffe_alexnet_train_iter_100000.caffemodel'
IMAGE_FILE = './4307.png'

# Select CPU mode before the net is created and run.
caffe.set_mode_cpu()

# Load the image as single-channel grayscale (float values in [0, 1]).
input_image = caffe.io.load_image(IMAGE_FILE, color=False)
net = caffe.Classifier(MODEL_FILE, PRETRAINED)
# Classify the whole image without crop oversampling.
prediction = net.predict([input_image], oversample=False)
print('predicted class: ', prediction[0].argmax())
print('predicted class all: ', prediction[0])
But the prediction is wrong. (The same script predicts well for MNIST with LeNet.)
The per-class probabilities also look odd:
predicted class: 9 <------------- the correct label is 5
predicted class all: [0.01998338 0.14941786 0.09392905 0.07361069 0.07640345 0.10996494 0.03646726 0.12371133 0.15246753 0.16404454]
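One way to quantify how "odd" these probabilities are is to compare them with a uniform guess over 10 classes. The sketch below copies the printed vector and computes its sum, top class, and entropy relative to the maximum possible entropy; it is only a diagnostic illustration, not part of the original script.

```python
import math

# Probabilities copied from the output above.
probs = [0.01998338, 0.14941786, 0.09392905, 0.07361069, 0.07640345,
         0.10996494, 0.03646726, 0.12371133, 0.15246753, 0.16404454]

top = max(range(len(probs)), key=probs.__getitem__)   # index of the largest value
entropy = -sum(p * math.log(p) for p in probs)        # Shannon entropy in nats
max_entropy = math.log(len(probs))                    # entropy of a uniform guess

print("sum of probabilities:", round(sum(probs), 6))
print("top class:", top, "with p =", probs[top])
print("entropy / max entropy:", round(entropy / max_entropy, 3))
```

An entropy ratio close to 1 means the net is nearly undecided on this image, which is hard to reconcile with a reported test accuracy of about 0.97.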
Note: deploy.prototxt is almost the same as models/bvlc_alexnet/deploy.prototxt, with the same changes as in train_val.prototxt.
Any suggestions?
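AlexNet was designed to discriminate among 1000 classes, training on 1.3M input images of (canonically) 256x256x3 data values each. You are using essentially the same tool to handle 10 classes with 28x28x1 input.
Very simply, you are over-fit by design.
If you want to use the general AlexNet design for this far simpler job, you will need to scale it down appropriately. It will take some experimentation to find a workable definition of "appropriately": shrink the conv layers by some factor, add a drop-out, remove one conv inception entirely, ...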
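To put rough numbers on the mismatch, the learnable parameters can be counted and compared with the 60,000 images in the MNIST training set. This is a sketch under stated assumptions: channel counts, kernel sizes, and groups are taken from the stock bvlc_alexnet layers, conv1 and fc8 use the values from the question, and the 1x1 pool5 output is assumed for a 28x28 input. The `width` and `fc_dim` knobs and all function names are hypothetical, and the reduced values are arbitrary, not a recommendation.

```python
def conv_params(out_ch, in_ch, kernel, group=1):
    """Weights plus biases of a (possibly grouped) convolution layer."""
    return out_ch * (in_ch // group) * kernel * kernel + out_ch

def fc_params(out_dim, in_dim):
    """Weights plus biases of a fully connected layer."""
    return out_dim * in_dim + out_dim

def count_params(width=1.0, fc_dim=4096, num_classes=10, in_ch=1, pool5_hw=1):
    """Parameter count of the modified AlexNet, with optional down-scaling."""
    def ch(n):
        # Scale a channel count and keep it even so that group=2 still divides it.
        return max(2, int(n * width) // 2 * 2)

    c1, c2, c3, c4, c5 = ch(96), ch(256), ch(384), ch(384), ch(256)
    total = conv_params(c1, in_ch, 3)             # conv1: kernel 3 (from the question)
    total += conv_params(c2, c1, 5, group=2)      # conv2 (assumed stock settings)
    total += conv_params(c3, c2, 3)               # conv3
    total += conv_params(c4, c3, 3, group=2)      # conv4
    total += conv_params(c5, c4, 3, group=2)      # conv5
    total += fc_params(fc_dim, c5 * pool5_hw * pool5_hw)  # fc6
    total += fc_params(fc_dim, fc_dim)                    # fc7
    total += fc_params(num_classes, fc_dim)               # fc8: 10 outputs
    return total

full = count_params()
small = count_params(width=0.25, fc_dim=256)   # arbitrary illustrative reduction
print("full-size parameters:", full)
print("scaled-down parameters:", small)
print("parameters per MNIST training image (full):", full // 60000)
```

Most of the full-size total sits in fc7, so shrinking the fully connected layers is likely the cheapest first experiment; whether the smaller net actually generalizes better still has to be checked empirically.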