Caffe shape mismatch error using pretrained VGG-16 model

Question

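I am using PyCaffe to implement a neural network inspired by the VGG 16-layer network. I want to use the pretrained model available from their GitHub page. Generally this works by matching layer names.

For my "fc6" layer I have the following definition in my train.prototxt file: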

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}

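Here is the prototxt file for the VGG-16 deploy architecture. Note that the "fc6" in their prototxt is identical to mine (except for the learning rate, but that is irrelevant). It is also worth noting that the inputs to my model are all the same size as well: 3-channel 224x224px images.

I have been following this tutorial pretty closely, and the block of code that is giving me trouble is the following: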

solver = caffe.SGDSolver(osp.join(model_root, 'solver.prototxt'))
solver.net.copy_from(model_root + 'VGG_ILSVRC_16_layers.caffemodel')
solver.test_nets[0].share_with(solver.net)
solver.step(1)

The first line loads my solver prototxt, and the second line copies the weights from the pretrained model (VGG_ILSVRC_16_layers.caffemodel). When the solver runs, I get this error:

Cannot copy param 0 weights from layer 'fc6'; shape mismatch.  Source param 
shape is 1 1 4096 25088 (102760448); target param shape is 4096 32768 (134217728). 
To learn this layer's parameters from scratch rather than copying from a saved 
net, rename the layer.

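The gist is that their model expects the layer to be of size 1x1x4096 while mine is just 4096. But I don't understand how to change this.

I found this answer in the Users Google group instructing me to do net surgery to reshape the pretrained model before copying, but to do that I need the lmdb files from the original architecture's data layers, which I don't have (the net surgery script throws an error when I try to run it).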

Answer 1

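The problem is not the 4096 but the 25088. You need to calculate the output feature map size of each layer in your network from its input feature map size. Note that an fc layer takes an input of fixed size, so the output of the preceding conv layer must match the input size the fc layer requires. Work out the input feature map size of your fc6 (which is the output feature map of the preceding conv layer) from the input feature map size of that preceding layer. The formula is: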

H_out = (H_in + 2 x Padding_Height - Kernel_Height) / Stride_Height + 1
W_out = (W_in + 2 x Padding_Width - Kernel_Width) / Stride_Width + 1
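
As a rough sketch of applying this formula, the snippet below walks an input size through a VGG-16-style stack and reports how many inputs fc6 would see. Several things here are assumptions not stated in the answer: the layer layout (five blocks of 3x3 convolutions with padding 1 and stride 1, each followed by 2x2 pooling with stride 2, and 512 channels at pool5), and the rounding rules (convolution rounds down, pooling rounds up, which is how Caffe behaves to my knowledge). The helper names are hypothetical.

```python
import math

# Assumed VGG-16 layout: number of 3x3 conv layers in each of the five blocks.
VGG16_CONVS_PER_BLOCK = [2, 2, 3, 3, 3]


def conv_out(size, kernel=3, pad=1, stride=1):
    """Output size of a convolution along one dimension (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel=2, pad=0, stride=2):
    """Output size of a pooling layer along one dimension (assumed to round up)."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


def fc6_input_dim(input_size, channels=512):
    """Number of values fc6 receives for a square input of the given size."""
    size = input_size
    for n_convs in VGG16_CONVS_PER_BLOCK:
        for _ in range(n_convs):
            size = conv_out(size)
        size = pool_out(size)
    return channels * size * size


if __name__ == "__main__":
    for side in (224, 256):
        print(side, fc6_input_dim(side))
```

Under these assumptions a 224x224 input gives 512 x 7 x 7 = 25088, which matches the pretrained weights, while the 32768 in the error message corresponds to 512 x 8 x 8, which a 256x256 input would produce. That suggests the images actually reaching the network are larger than 224x224.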

Answer 2

This error occurs if you crop the images to 224 instead of 227 as was done with the original dataset. Adjust that and you should be good to go.
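
One way to check which spatial size actually reaches fc6 is to read the blob shapes off the instantiated net before copying any weights. The sketch below is only illustrative: it assumes the input blob is named "data" (the question shows only "pool5" and "fc6"), and report_shapes is a hypothetical helper, not part of PyCaffe. It relies only on the net.blobs and net.params mappings that PyCaffe nets expose.

```python
def report_shapes(net, blob_names=("data", "pool5"), param_layer="fc6"):
    """Collect the shapes of selected blobs and of one layer's weight matrix."""
    shapes = {}
    for name in blob_names:
        if name in net.blobs:  # skip blob names this net does not define
            shapes[name] = tuple(net.blobs[name].data.shape)
    if param_layer in net.params:
        # params[layer][0] holds the weights, params[layer][1] the biases
        weights = net.params[param_layer][0]
        shapes[param_layer + " weights"] = tuple(weights.data.shape)
    return shapes


# Usage with PyCaffe (needs a Caffe install, so it is left commented out):
# import caffe
# solver = caffe.SGDSolver('solver.prototxt')
# print(report_shapes(solver.net))
```

If the "data" blob does not show the crop size you expect, the crop setting in the data layer is the place to look.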

Tags: python, deep-learning, caffe, pycaffe, vgg-net