HDF5 data layer definition for multiple HDF5 files

Question

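I know that Caffe does not let you have an HDF5 data layer larger than 2GB.
I have a large dataset, so I split it into 5 chunks, each smaller than 2GB.
I listed the five files in a 'train.txt' file.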

How do I define this in the "HDF5Data" layer of my network prototxt file?
Just listing all of them as top yields an error.

Is there a small example?

Thanks!

Cheers

Answer 1

You should have a text file 'train.txt' with the following content, one HDF5 file path per line:

/path/to/first.h5
/path/to/second.h5
/path/to/third.h5
/path/to/fourth.h5
/path/to/fifth.h5
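If the chunks all sit in one directory, the list file can be generated rather than typed by hand. The following is only a minimal sketch: the helper name write_hdf5_list and the assumption that every chunk ends in ".h5" are illustrative and not part of Caffe.

```python
from pathlib import Path


def write_hdf5_list(h5_dir, list_path):
    """Write the absolute path of every .h5 file in h5_dir to list_path.

    Hypothetical helper: paths are sorted by name and written one per
    line, matching the list-file layout shown above.
    """
    files = sorted(Path(h5_dir).resolve().glob("*.h5"))
    if not files:
        raise FileNotFoundError(f"no .h5 files found in {h5_dir}")
    # One path per line, with a trailing newline after the last entry.
    Path(list_path).write_text("".join(f"{p}\n" for p in files))
    return [str(p) for p in files]
```

Calling write_hdf5_list("/path/to", "/path/to/train.txt") would then produce a list file like the one above, assuming the five chunks are the only ".h5" files in that directory.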

Then, as the source of the "HDF5Data" layer, you should give only 'train.txt':

layer {
  type: "HDF5Data"
  name: "data"
  # put your "top" entries here; if the layer has several outputs, list them all
  hdf5_data_param {
    source: "/path/to/train.txt"  # only the list file goes here.
  }
  include { phase: TRAIN }
}

As you can see, '/path/to/first.h5' is not listed explicitly in train.prototxt, only in train.txt.

machine-learning

hdf5

neural-network

deep-learning

caffe