Caffe - multi-class and multi-label image classification
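I am trying to create a single network configuration in Caffe that is both multi-class and multi-label.
Say we are classifying dogs:
Is the dog small or large? (class)
What color is it? (class)
Does it have a collar? (label)
Is this possible with Caffe?
What is the proper way to do it?
What is the correct way to build the LMDB file?
All the publications I found about multi-label classification are from around 2015. Has anything changed on this topic since then?
Thanks.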
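The problem with Caffe's LMDB interface is that it only allows a single integer label per image.
If you want multiple labels per image, you will have to use a different input layer.
I suggest using the "HDF5Data" layer:
This allows more flexibility in setting up the input data; you can have as many "top"s as you want for this layer. Each input image may have multiple labels, and your net can have multiple losses to train on.
See the linked answer for how to create HDF5 data for Caffe.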
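As a rough sketch of what "several tops" could look like for the dog example: each dataset in the HDF5 file can be exposed as one top of the "HDF5Data" layer, so every question gets its own label array. The dataset names (`data`, `size`, `color`, `collar`), the file name `train_multi.h5`, the label coding and the random stand-in images are all assumptions made for illustration, not something taken from the thread.

```python
import numpy as np

def build_label_arrays(records):
    """Split (size, color, collar) tuples into one float32 column per task."""
    arr = np.asarray(records, dtype="f4")  # shape (N, 3)
    return {
        "size": arr[:, 0:1],    # class index, assumed coding: 0 = small, 1 = large
        "color": arr[:, 1:2],   # class index into an assumed list of colors
        "collar": arr[:, 2:3],  # binary label, 1 = has a collar
    }

def write_hdf5(path, images, labels):
    """Write the images plus one dataset per label array (requires h5py)."""
    import h5py  # imported lazily so build_label_arrays works without h5py
    with h5py.File(path, "w") as h5:
        h5.create_dataset("data", data=images)
        for name, values in labels.items():
            h5.create_dataset(name, data=values)

records = [(0, 2, 1), (1, 0, 0), (1, 3, 1)]  # hypothetical annotations
labels = build_label_arrays(records)
# Random arrays stand in for the preprocessed N x 3 x 227 x 227 images.
images = np.random.rand(len(records), 3, 227, 227).astype("f4")
# write_hdf5("train_multi.h5", images, labels)  # uncomment to create the file
```

With a file like this, the input layer would list `top: "data"`, `top: "size"`, `top: "color"` and `top: "collar"`, and each label top could feed its own loss layer.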
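Thanks,
I am just trying to understand the practical approach.
After creating 2 .txt files (one for training, one for validation) that contain the labels of all the images, for example: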
/train/img/1.png 0 4 18
/train/img/2.png 1 7 17 33
/train/img/3.png 0 4 17
I run this Python script:
import h5py, os
import caffe
import numpy as np
SIZE = 227 # fixed size to all images
with open( 'train.txt', 'r' ) as T :
    lines = T.readlines()
# If you do not have enough memory split data into
# multiple batches and generate multiple separate h5 files
X = np.zeros( (len(lines), 3, SIZE, SIZE), dtype='f4' )
y = np.zeros( (len(lines),1), dtype='f4' )
for i,l in enumerate(lines):
    sp = l.split(' ')
    img = caffe.io.load_image( sp[0] )
    img = caffe.io.resize( img, (SIZE, SIZE, 3) ) # resize to fixed size
    # you may apply other input transformations here...
    # Note that the transformation should take img from size-by-size-by-3 and transpose it to 3-by-size-by-size
    # for example
    transposed_img = img.transpose((2,0,1))[::-1,:,:] # RGB->BGR
    X[i] = transposed_img
    y[i] = float(sp[1]) # note: only the first label on each line is stored
with h5py.File('train.h5','w') as H:
    H.create_dataset( 'X', data=X ) # note the name X given to the dataset!
    H.create_dataset( 'y', data=y ) # note the name y given to the dataset!
with open('train_h5_list.txt','w') as L:
    L.write( 'train.h5' ) # list all h5 files you are going to use
and create train.h5 and val.h5 (so the X dataset holds the images and the y dataset holds the labels?).
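One thing worth noting about the label lines: they carry a varying number of indices (three on some lines, four on others), while `y` has a single column. If those numbers are tag indices, one common way to keep all of them is a fixed-length multi-hot vector per image. The sketch below assumes exactly that; `NUM_TAGS = 34` and the helper name `parse_line` are hypothetical and only chosen to cover the indices in the sample lines.

```python
def parse_line(line, num_tags):
    """Return (image_path, multi_hot_list) for a line like '/a/b.png 0 4 18'."""
    parts = line.split()  # split() also drops the trailing newline
    path = parts[0]
    vec = [0.0] * num_tags
    for token in parts[1:]:
        vec[int(token)] = 1.0  # mark every tag present on this image
    return path, vec

NUM_TAGS = 34  # assumed: tag ids run from 0 to 33
sample = ["/train/img/1.png 0 4 18\n", "/train/img/2.png 1 7 17 33\n"]
parsed = [parse_line(line, NUM_TAGS) for line in sample]
for path, vec in parsed:
    active = [i for i, v in enumerate(vec) if v]
    print(path, active)
```

In the script, `y` would then be allocated as `np.zeros((len(lines), NUM_TAGS), dtype='f4')` and each row filled with the vector returned for that line.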
Then I replace my network input layers:
layers {
  name: "data"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "/home/gal/digits/digits/jobs/20181010-191058-21ab/train_db"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    crop_size: 227
    mean_file: "/home/gal/digits/digits/jobs/20181010-191058-21ab/mean.binaryproto"
    mirror: true
  }
  include: { phase: TRAIN }
}
layers {
  name: "data"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "/home/gal/digits/digits/jobs/20181010-191058-21ab/val_db"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    crop_size: 227
    mean_file: "/home/gal/digits/digits/jobs/20181010-191058-21ab/mean.binaryproto"
    mirror: true
  }
  include: { phase: TEST }
}
with:
layer {
  type: "HDF5Data"
  top: "X" # same name as given in create_dataset!
  top: "y"
  hdf5_data_param {
    source: "train_h5_list.txt" # do not give the h5 files directly, but the list.
    batch_size: 32
  }
  include { phase:TRAIN }
}
layer {
  type: "HDF5Data"
  top: "X" # same name as given in create_dataset!
  top: "y"
  hdf5_data_param {
    source: "val_h5_list.txt" # do not give the h5 files directly, but the list.
    batch_size: 32
  }
  include { phase:TEST }
}
I guess HDF5 does not need mean.binaryproto?
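On the mean file: as far as I know, the "HDF5Data" layer feeds the arrays exactly as they were stored and does not apply a `transform_param`, so mean subtraction (and any cropping or mirroring) would have to happen in the script that builds the .h5 files. A minimal sketch of that idea follows. It swaps in a per-channel mean computed from the training array rather than the per-pixel mean image stored in mean.binaryproto, and the helper name and the small random arrays are assumptions for illustration.

```python
import numpy as np

def subtract_channel_mean(X, mean=None):
    """Subtract a per-channel mean from an N x C x H x W float array.

    If no mean is given, it is computed from X itself.
    Returns the centered array and the mean that was used.
    """
    if mean is None:
        mean = X.mean(axis=(0, 2, 3))  # one value per channel
    mean = np.asarray(mean, dtype=X.dtype)
    return X - mean[None, :, None, None], mean

# Small random arrays stand in for the real X of train.h5 and val.h5.
X_train = np.random.rand(4, 3, 8, 8).astype("f4")
X_val = np.random.rand(2, 3, 8, 8).astype("f4")

X_train_centered, train_mean = subtract_channel_mean(X_train)
# Reuse the training mean for validation so both sets are shifted identically.
X_val_centered, _ = subtract_channel_mean(X_val, train_mean)
```

The centered arrays are what would be passed to `create_dataset` in place of the raw `X`.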
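Next, how should the output layers change in order to output multiple label probabilities?
I guess I need a cross-entropy layer instead of softmax?
These are the current output layers: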
layers {
  bottom: "prob"
  bottom: "label"
  top: "loss"
  name: "loss"
  type: SOFTMAX_LOSS
  loss_weight: 1
}
layers {
  name: "accuracy"
  type: ACCURACY
  bottom: "prob"
  bottom: "label"
  top: "accuracy"
  include: { phase: TEST }
}
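On the loss question, a softmax loss assumes exactly one correct class per example, which fits the size and color questions, whereas independent yes/no labels such as the collar are usually trained with a per-label sigmoid cross-entropy (Caffe provides a "SigmoidCrossEntropyLoss" layer for this). The sketch below only illustrates the arithmetic of the two losses in plain Python; it is not Caffe code, and the scores and targets are made up.

```python
import math

def softmax_loss(scores, target_index):
    """Cross-entropy of a softmax over mutually exclusive classes."""
    m = max(scores)  # shift by the max to keep the exponentials stable
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[target_index]

def sigmoid_cross_entropy(scores, targets):
    """Sum of independent binary cross-entropies, one per label."""
    total = 0.0
    for s, t in zip(scores, targets):
        # Stable form of -[t*log(sigmoid(s)) + (1 - t)*log(1 - sigmoid(s))]
        total += max(s, 0.0) - s * t + math.log1p(math.exp(-abs(s)))
    return total

# Hypothetical raw scores for one image.
color_scores = [2.0, 0.5, -1.0]  # three mutually exclusive colors
color_loss = softmax_loss(color_scores, target_index=0)

tag_scores = [3.0, -2.0]   # two independent yes/no labels
tag_targets = [1.0, 0.0]   # first present, second absent
tag_loss = sigmoid_cross_entropy(tag_scores, tag_targets)

total_loss = color_loss + tag_loss  # several losses are simply added up
print(color_loss, tag_loss, total_loss)
```

In a network with several loss layers, Caffe combines them in the same way, as a sum weighted by each layer's `loss_weight`.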