Net surgery: How to reshape a convolution layer of a caffemodel file in caffe?

Question

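I'm trying to reshape a convolution layer of a caffemodel (this is a follow-up to an earlier question). Although there is a tutorial on how to do net surgery, it only shows how to copy weight parameters from one caffemodel to another of the same size.
Instead, I need to add a new channel (all zeros) to my convolution filters, so that their size changes from the current (64x3x3x3) to (64x4x3x3).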

Say the convolution layer is called 'conv1'. This is what I have tried so far:

import caffe

# Load the original network together with its trained weights.
net = caffe.Net('../models/train.prototxt',
                '../models/train.caffemodel',
                caffe.TRAIN)

Now I can do this:

# Attempt: reshape the 'conv1' blob, then save the model.
net.blobs['conv1'].reshape(64, 4, 3, 3)
net.save('myNewTrainModel.caffemodel')

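But the saved model does not seem to have changed. I have read that the actual weights of the convolution are stored in net.params['conv1'][0].data rather than in net.blobs, but I can't figure out how to reshape the net.params object. Does anyone have an idea?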

Answer 1

As you noted, net.blobs does not store the learned parameters/weights; it stores the results of applying the filters/activations to the net's input. The learned weights are stored in net.params.
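To make the distinction concrete, here is a small sketch that computes the shapes one would expect for the 'conv1' activation blob and for the 'conv1' weight blob. Only the 64 filters of size 3x3 over 3 (or 4) channels come from the question; the 224x224 input, batch size 1, stride 1 and zero padding are assumptions picked for illustration, and both helper functions are hypothetical names rather than part of pycaffe.

```python
def conv_blob_shape(batch, in_h, in_w, num_output, kernel, stride=1, pad=0):
    """Shape of the activation blob (the kind of array net.blobs[name] holds)."""
    out_h = (in_h + 2 * pad - kernel) // stride + 1
    out_w = (in_w + 2 * pad - kernel) // stride + 1
    return (batch, num_output, out_h, out_w)


def conv_param_shape(in_channels, num_output, kernel):
    """Shape of the weight blob (the kind of array net.params[name][0] holds)."""
    return (num_output, in_channels, kernel, kernel)


# Assumed input: a single 224x224 image (hypothetical size).
activation_shape = conv_blob_shape(1, 224, 224, num_output=64, kernel=3)
weight_shape_3ch = conv_param_shape(3, num_output=64, kernel=3)
weight_shape_4ch = conv_param_shape(4, num_output=64, kernel=3)

print(activation_shape)   # (1, 64, 222, 222)
print(weight_shape_3ch)   # (64, 3, 3, 3)
print(weight_shape_4ch)   # (64, 4, 3, 3)
```

Reshaping net.blobs['conv1'] only touches the first of these shapes, and the activation blob is recomputed on the next forward pass, which would explain why the saved weights stay at (64, 3, 3, 3).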

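AFAIK, you cannot directly reshape net.params and add a channel.
What you can do is have two nets, deploy_trained_net_with_3ch.prototxt and deploy_empty_net_with_4ch.prototxt. The two files can be almost identical, apart from the input shape definition and the name of the first layer.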
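As a rough illustration of how small the difference between the two files can be, the sketch below builds both prototxt texts from one template and prints the lines that differ. Everything in the template is an assumption made up for this example (the 224x224 input, the explicit constant weight_filler, keeping the top name "conv1" so later layers need no change); the real files will contain the rest of the network as well.

```python
# Hypothetical minimal prototxt; the real net has many more layers.
TEMPLATE = """\
input: "data"
input_shape {{ dim: 1 dim: {channels} dim: 224 dim: 224 }}
layer {{
  name: "{conv_name}"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {{
    num_output: 64
    kernel_size: 3
    weight_filler {{ type: "constant" value: 0 }}
  }}
}}
"""

proto_3ch = TEMPLATE.format(channels=3, conv_name="conv1")
proto_4ch = TEMPLATE.format(channels=4, conv_name="conv1_4ch")

# Collect the (old, new) line pairs that differ between the two definitions.
changed = [
    (old.strip(), new.strip())
    for old, new in zip(proto_3ch.splitlines(), proto_4ch.splitlines())
    if old != new
]
for old, new in changed:
    print(old, "->", new)
```

The first layer is renamed because trained weights are matched to layers by name. A 4-channel layer that kept the name conv1 would be paired with the 3-channel weights in train.caffemodel, and the shapes would not agree.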
Then you can load both nets in Python and copy the relevant part:

# Both nets read the same caffemodel; weights are matched to layers by name.
net3ch = caffe.Net('deploy_trained_net_with_3ch.prototxt', 'train.caffemodel', caffe.TEST)
net4ch = caffe.Net('deploy_empty_net_with_4ch.prototxt', 'train.caffemodel', caffe.TEST)

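Since all layer names are identical (apart from conv1), net4ch.params will have the weights of train.caffemodel. As for the first layer, you can now manually copy the relevant part: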

net4ch.params['conv1_4ch'][0].data[:,:3,:,:] = net3ch.params['conv1'][0].data[...]
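The line above copies only the weights. Because conv1_4ch has a new name, its bias (if the layer has one) is not loaded from train.caffemodel either, and the extra channel holds whatever the layer's filler produced. A minimal sketch of a more defensive copy follows; copy_conv_with_zero_channel is a hypothetical helper, and the SimpleNamespace objects are stand-ins for Caffe blobs so that the example runs with NumPy alone.

```python
from types import SimpleNamespace

import numpy as np


def copy_conv_with_zero_channel(src_params, dst_params):
    """Copy conv parameters into a layer that has extra input channels.

    Both arguments mimic net.params[layer]: index 0 is the weight blob,
    index 1 (if present) is the bias blob, and each exposes a .data array.
    """
    src_w = src_params[0].data
    dst_w = dst_params[0].data
    n_src = src_w.shape[1]
    dst_w[...] = 0                   # zero everything, including the new channel(s)
    dst_w[:, :n_src, :, :] = src_w   # copy the trained channels
    if len(src_params) > 1 and len(dst_params) > 1:
        # The bias has one value per filter, so its shape does not change.
        dst_params[1].data[...] = src_params[1].data


# Stand-ins for net3ch.params['conv1'] and net4ch.params['conv1_4ch'].
rng = np.random.default_rng(0)
src = [SimpleNamespace(data=rng.standard_normal((64, 3, 3, 3)).astype(np.float32)),
       SimpleNamespace(data=rng.standard_normal(64).astype(np.float32))]
dst = [SimpleNamespace(data=np.ones((64, 4, 3, 3), dtype=np.float32)),
       SimpleNamespace(data=np.zeros(64, dtype=np.float32))]

copy_conv_with_zero_channel(src, dst)
```

With the real nets, the call would presumably be copy_conv_with_zero_channel(net3ch.params['conv1'], net4ch.params['conv1_4ch']).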

Finally:

net4ch.save('myNewTrainModel.caffemodel')

