How to create CaffeDB training data for siamese networks out of an image directory

Question

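I need some help creating a CaffeDB for a siamese CNN out of a plain directory containing images and a label text file. A Python approach would be best.
The problem is not walking through the directory and making pairs of images. My problem is more about making a CaffeDB out of those pairs.
So far I have only created a CaffeDB directly from an image directory.
Thanks for your help!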

Answer 1

Why not simply make two datasets using good old convert_imageset?

layer {
  name: "data_a"
  top: "data_a"
  top: "label_a"
  type: "Data"
  data_param { source: "/path/to/first/data_lmdb" }
  ...
}
layer {
  name: "data_b"
  top: "data_b"
  top: "label_b"
  type: "Data"
  data_param { source: "/path/to/second/data_lmdb" }
  ...
}

As for the loss, since every example has a class label, you need to convert label_a and label_b into a same_not_same_label. I suggest you do this "on-the-fly" using a Python layer. In the prototxt, add the call to the Python layer:

layer {
  name: "a_b_to_same_not_same_label"
  type: "Python"
  bottom: "label_a"
  bottom: "label_b"
  top: "same_not_same_label"
  python_param { 
    # the module name -- usually the filename -- that needs to be in $PYTHONPATH
    module: "siamese"
    # the layer name -- the class name in the module
    layer: "SiameseLabels"
  }
  propagate_down: false
}

Create siamese.py (make sure it is in your $PYTHONPATH). In siamese.py you should have the layer class:

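import os
import sys

# Make pycaffe importable; $CAFFE_ROOT must be set in the environment.
sys.path.insert(0, os.path.join(os.environ['CAFFE_ROOT'], 'python'))
import caffe


class SiameseLabels(caffe.Layer):
    """Turns two class-label blobs into one same/not-same label blob."""

    def setup(self, bottom, top):
        if len(bottom) != 2:
            raise Exception('must have exactly two inputs')
        if len(top) != 1:
            raise Exception('must have exactly one output')

    def reshape(self, bottom, top):
        # The output has the same shape as the label inputs.
        top[0].reshape(*bottom[0].shape)

    def forward(self, bottom, top):
        # 1.0 where the two labels agree, 0.0 otherwise.
        top[0].data[...] = (bottom[0].data == bottom[1].data).astype('f4')

    def backward(self, top, propagate_down, bottom):
        # Labels carry no gradient, so there is nothing to back-propagate.
        pass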
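
To see what the forward pass produces without running Caffe, here is a minimal NumPy-only sketch of the same comparison. The label values are made up for illustration; only the comparison expression is taken from the layer above.

```python
import numpy as np

# Hypothetical class labels for one batch from each data layer.
label_a = np.array([3, 1, 4, 1, 5], dtype=np.float32)
label_b = np.array([3, 7, 4, 2, 5], dtype=np.float32)

# Same expression as in SiameseLabels.forward:
# 1.0 where the pair shares a class, 0.0 where it does not.
same_not_same = (label_a == label_b).astype('f4')

print(same_not_same)  # [1. 0. 1. 0. 1.]
```

The result is a float32 array of the same shape as the inputs, which is why reshape simply copies the shape of bottom[0].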

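Make sure you shuffle the examples in the two sets in a different manner, so that you get non-trivial pairs. Moreover, if you construct the first and second datasets with a different number of examples, you will see different pairs at each epoch ;)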
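
One way to get two differently ordered datasets is to write two list files in the "path label" format that convert_imageset reads, each shuffled with its own seed. This is only a sketch: the sample names, the seeds, and the helper write_shuffled_list are made up for illustration, and in practice the samples would come from your own directory walk.

```python
import os
import random
import tempfile


def write_shuffled_list(samples, out_path, seed):
    """Write one 'relative/path label' line per sample, in seed-dependent order."""
    order = list(samples)
    random.Random(seed).shuffle(order)  # private RNG, so each list gets its own order
    with open(out_path, "w") as f:
        for rel_path, label in order:
            f.write("%s %d\n" % (rel_path, label))
    return order


# Hypothetical (path, class label) pairs standing in for the real directory contents.
samples = [("img_%03d.jpg" % i, i % 3) for i in range(12)]

out_dir = tempfile.mkdtemp()
list_a = write_shuffled_list(samples, os.path.join(out_dir, "list_a.txt"), seed=1)
list_b = write_shuffled_list(samples, os.path.join(out_dir, "list_b.txt"), seed=2)
```

Each list file can then be passed to convert_imageset to build one of the two LMDBs. Leaving out its shuffle option should keep the order written here, so the i-th entry of the first database is paired with the i-th entry of the second.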

Make sure you construct the network so that the duplicated layers share their weights; see this tutorial for more information.
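
As a rough illustration of weight sharing: Caffe ties two layers together when their param blocks carry the same name. The sketch below just prints two such layer definitions from Python. The layer names, the "_p" suffix, and the convolution settings are placeholders chosen for illustration, not values from the question.

```python
# Prototxt template for one convolution layer; doubled braces are literal braces.
LAYER_TEMPLATE = """\
layer {{
  name: "{name}"
  type: "Convolution"
  bottom: "{bottom}"
  top: "{name}"
  param {{ name: "{shared}_w" lr_mult: 1 }}
  param {{ name: "{shared}_b" lr_mult: 2 }}
  convolution_param {{ num_output: 20 kernel_size: 5 }}
}}
"""


def conv_layer(name, bottom, shared):
    """Return prototxt text for a conv layer whose params are named after `shared`."""
    return LAYER_TEMPLATE.format(name=name, bottom=bottom, shared=shared)


# Two branches, one per data layer; both refer to the params "conv1_w" and "conv1_b".
branch_a = conv_layer("conv1", "data_a", shared="conv1")
branch_b = conv_layer("conv1_p", "data_b", shared="conv1")

print(branch_a + branch_b)
```

Because both layers name their parameters conv1_w and conv1_b, Caffe keeps a single copy of the weights and biases and updates it from both branches.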
