Backward pass in Caffe Python Layer is not called/working?
I tried to implement a simple loss layer in Python using Caffe, without success. As reference, I found several layers implemented in Python, including here, here and here.
Starting from the EuclideanLossLayer provided with the Caffe documentation/examples, I was not able to get it working and started debugging. Even with this simple TestLayer:
import caffe


class TestLayer(caffe.Layer):
    """
    Identity layer for debugging: copies the bottom blob to the top blob.
    """

    def setup(self, bottom, top):
        """
        Checks the correct number of bottom inputs.

        :param bottom: bottom inputs
        :type bottom: [numpy.ndarray]
        :param top: top outputs
        :type top: [numpy.ndarray]
        """
        print 'setup'

    def reshape(self, bottom, top):
        """
        Make sure all involved blobs have the right dimension.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """
        print 'reshape'
        top[0].reshape(bottom[0].data.shape[0], bottom[0].data.shape[1],
                       bottom[0].data.shape[2], bottom[0].data.shape[3])

    def forward(self, bottom, top):
        """
        Forward propagation.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """
        print 'forward'
        top[0].data[...] = bottom[0].data

    def backward(self, top, propagate_down, bottom):
        """
        Backward pass.

        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        :param propagate_down: list of booleans, one per bottom blob, indicating
            whether the gradient with respect to that bottom should be computed
        :type propagate_down: [bool]
        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        """
        print 'backward'
        bottom[0].diff[...] = top[0].diff[...]
I was not able to get the Python layer to work. The learning task is rather simple: I merely try to predict whether a real-valued number is positive or negative. The corresponding data is generated as follows and written to LMDBs:
import numpy

N = 10000
N_train = int(0.8*N)

images = []
labels = []
for n in range(N):
    image = (numpy.random.rand(1, 1, 1)*2 - 1).astype(numpy.float)
    label = int(numpy.sign(image))

    images.append(image)
    labels.append(label)
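The LMDB writing code itself is not shown above; a minimal sketch of how such arrays could be serialized (using the lmdb package and caffe.io.array_to_datum; the database path and map size are illustrative and not necessarily the code actually used) looks like this:

import lmdb
import caffe

# Write the first N_train samples to tests/train_lmdb (illustrative path).
env = lmdb.open('tests/train_lmdb', map_size=int(1e9))
with env.begin(write=True) as txn:
    for i, (image, label) in enumerate(zip(images[:N_train], labels[:N_train])):
        # array_to_datum expects a 3-dimensional array (channels, height, width).
        datum = caffe.io.array_to_datum(image, label)
        txn.put('{:08d}'.format(i).encode('ascii'), datum.SerializeToString())
env.close()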
Writing the LMDBs should be correct; tests with the MNIST dataset provided by Caffe showed no problems. The network is defined as follows:
net.data, net.labels = caffe.layers.Data(batch_size = batch_size, backend = caffe.params.Data.LMDB,
                                         source = lmdb_path, ntop = 2)
net.fc1 = caffe.layers.Python(net.data, python_param = dict(module = 'tools.layers', layer = 'TestLayer'))
net.score = caffe.layers.TanH(net.fc1)
net.loss = caffe.layers.EuclideanLoss(net.score, net.labels)
Solving is done manually:
for iteration in range(iterations):
    solver.step(step)
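Here, solver is assumed to have been created beforehand in the usual pycaffe way; a sketch (the solver prototxt path follows the other paths below, the creation code is not part of the question):

import caffe

caffe.set_mode_cpu()
solver = caffe.SGDSolver('tests/solver.prototxt')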
The corresponding prototxt files are as follows:

solver.prototxt:
weight_decay: 0.0005
test_net: "tests/test.prototxt"
snapshot_prefix: "tests/snapshot_"
max_iter: 1000
stepsize: 1000
base_lr: 0.01
snapshot: 0
gamma: 0.01
solver_mode: CPU
train_net: "tests/train.prototxt"
test_iter: 0
test_initialization: false
lr_policy: "step"
momentum: 0.9
display: 100
test_interval: 100000
train.prototxt:
layer {
name: "data"
type: "Data"
top: "data"
top: "labels"
data_param {
source: "tests/train_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "fc1"
type: "Python"
bottom: "data"
top: "fc1"
python_param {
module: "tools.layers"
layer: "TestLayer"
}
}
layer {
name: "score"
type: "TanH"
bottom: "fc1"
top: "score"
}
layer {
name: "loss"
type: "EuclideanLoss"
bottom: "score"
bottom: "labels"
top: "loss"
}
test.prototxt:
layer {
name: "data"
type: "Data"
top: "data"
top: "labels"
data_param {
source: "tests/test_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "fc1"
type: "Python"
bottom: "data"
top: "fc1"
python_param {
module: "tools.layers"
layer: "TestLayer"
}
}
layer {
name: "score"
type: "TanH"
bottom: "fc1"
top: "score"
}
layer {
name: "loss"
type: "EuclideanLoss"
bottom: "score"
bottom: "labels"
top: "loss"
}
I tried tracing it down by adding debug messages to the backward and forward methods of TestLayer: only the forward method is called during solving (note that no testing is performed, so the calls can only be related to solving). Similarly, I added debug messages in python_layer.hpp:
virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  LOG(INFO) << "cpp forward";
  self_.attr("forward")(bottom, top);
}

virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  LOG(INFO) << "cpp backward";
  self_.attr("backward")(top, propagate_down, bottom);
}
Again, only the forward pass is executed. When I remove the backward method from TestLayer, solving still works. When I remove the forward method, an error is thrown because forward is not implemented. I would expect the same for backward, so it seems the backward pass is not executed at all. Switching back to regular layers and adding debug messages there, everything works as expected.

I have the feeling that I am missing something simple or fundamental, but I have not been able to resolve the problem for several days now. Any help or hints are appreciated.

Thanks!
This is the expected behavior, because you do not have any layers "below" your Python layer that actually need the gradients to compute weight updates. Caffe notices this and skips the backward computation for such layers, because it would be a waste of time.

Caffe prints for all layers whether the backward computation is needed in the log at network initialization. In your case, you should see something like:

fc1 does not need backward computation.

If you put an "InnerProduct" or "Convolution" layer below your "Python" layer (e.g. Data->InnerProduct->Python->Loss), the backward computation becomes necessary and your backward method gets called.
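A minimal sketch of such a modified net definition, in the same NetSpec style as the question (the layer name ip1 and its parameters are illustrative):

import caffe

net = caffe.NetSpec()
net.data, net.labels = caffe.layers.Data(batch_size=64, backend=caffe.params.Data.LMDB,
                                         source='tests/train_lmdb', ntop=2)
# The InnerProduct layer has learnable weights, so Caffe needs the gradient
# with respect to its top blob and therefore calls the Python layer's backward().
net.ip1 = caffe.layers.InnerProduct(net.data, num_output=1,
                                    weight_filler=dict(type='xavier'))
net.fc1 = caffe.layers.Python(net.ip1, python_param=dict(module='tools.layers',
                                                         layer='TestLayer'))
net.score = caffe.layers.TanH(net.fc1)
net.loss = caffe.layers.EuclideanLoss(net.score, net.labels)

with open('tests/train.prototxt', 'w') as f:
    f.write(str(net.to_proto()))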
In addition to the answer above, you can force Caffe to backprop through all layers by specifying

force_backward: true

in your network prototxt. See the comments in caffe.proto for details.
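If you would rather set the flag from Python than edit the file by hand, one way is to patch the NetParameter directly (a sketch; the file path is the one from the question):

from caffe.proto import caffe_pb2
from google.protobuf import text_format

# Read the existing net definition, set the NetParameter-level flag,
# and write it back out.
net_param = caffe_pb2.NetParameter()
with open('tests/train.prototxt') as f:
    text_format.Merge(f.read(), net_param)
net_param.force_backward = True
with open('tests/train.prototxt', 'w') as f:
    f.write(text_format.MessageToString(net_param))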
Mine did not work even after setting force_backward: true as David Stutz suggested. I found out here and here that I had forgotten to set the diff of the last layer to 1 at the index of the target class.

As Mohit Jain describes in his caffe-users answer, if you are doing ImageNet classification of a tabby cat, after doing the forward pass you have to do something like:

net.blobs['prob'].diff[0][281] = 1  # 281 is the 'tabby cat' class; diff shape: (1, 1000)

Note that you have to change 'prob' according to the name of your last layer, which is usually a Softmax layer named 'prob'.
Here is an example based on mine:

deploy.prototxt (it is loosely based on VGG16, just to show the structure of the file; I did not test it):
name: "smaller_vgg"
input: "data"
force_backward: true
input_dim: 1
input_dim: 3
input_dim: 224
input_dim: 224
layer {
name: "conv1_1"
type: "Convolution"
bottom: "data"
top: "conv1_1"
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
}
}
layer {
name: "relu1_1"
type: "ReLU"
bottom: "conv1_1"
top: "conv1_1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1_1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "fc1"
type: "InnerProduct"
bottom: "pool1"
top: "fc1"
inner_product_param {
num_output: 4096
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "fc1"
top: "fc1"
}
layer {
name: "drop1"
type: "Dropout"
bottom: "fc1"
top: "fc1"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc2"
type: "InnerProduct"
bottom: "fc1"
top: "fc2"
inner_product_param {
num_output: 1000
}
}
layer {
name: "prob"
type: "Softmax"
bottom: "fc2"
top: "prob"
}
main.py:
import cv2
import numpy as np
import caffe

prototxt = 'deploy.prototxt'
model_file = 'smaller_vgg.caffemodel'

# The net definition (prototxt) comes first, then the weights.
net = caffe.Net(prototxt, model_file, caffe.TRAIN)  # not sure if TEST works as well

# Resize and reorder HxWxC -> CxHxW to match the 1x3x224x224 input.
image = cv2.resize(cv2.imread('tabbycat.jpg', cv2.IMREAD_COLOR).astype(np.float32), (224, 224))
net.blobs['data'].data[...] = image.transpose(2, 0, 1)[np.newaxis, :]
net.forward()
net.blobs['prob'].diff[0, 281] = 1  # 281 is the 'tabby cat' class; diff shape: (1, 1000)
backout = net.backward()
# access the gradient from backout['data'] or net.blobs['data'].diff