Error in the testing phase when running a model on a custom dataset (2 classes) (MXNet framework)

Question

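I am running a model on my own dataset with 2 classes (the project was implemented for training/testing on ImageNet). I have made all the changes (in the config files, etc.), but after training finishes successfully, the following error appears as soon as testing starts: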

wrote gt roidb to ./data/cache/ImageNetVID_DET_val_gt_roidb.pkl
Traceback (most recent call last):
  File "experiments/dff_rfcn/dff_rfcn_end2end_train_test.py", line 20, in <module>
    test.main()
  File "experiments/dff_rfcn/../../dff_rfcn/test.py", line 53, in main
    args.vis, args.ignore_cache, args.shuffle, config.TEST.HAS_RPN, config.dataset.proposal, args.thresh, logger=logger, output_path=final_output_path)
  File "experiments/dff_rfcn/../../dff_rfcn/function/test_rcnn.py", line 68, in test_rcnn
    roidbs_seg_lens[gpu_id] += x['frame_seg_len']
KeyError: 'frame_seg_len'

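I cleaned the cache files before running. From what I read in earlier threads, this could be caused by a previous dataset's .pkl files left in the cache. What is causing this error? I should also mention that I changed the names of the .txt files that feed data to the neural network (in case that matters), and training completed without problems. This is my first time running a deep learning project, so please be understanding.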
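One way to narrow this down is to inspect the cached roidb directly and see which entries lack the key named in the traceback. The sketch below is only a diagnostic under stated assumptions: it assumes the cached .pkl holds a list of dicts (the traceback only shows that each entry x is indexed with 'frame_seg_len'), and the helper names entries_missing_key and check_cached_roidb are hypothetical, not part of the project.

```python
import pickle

# The key that test_rcnn.py indexes in the traceback.
REQUIRED_KEY = "frame_seg_len"


def entries_missing_key(roidb, key=REQUIRED_KEY):
    """Return the indices of roidb entries (assumed to be dicts) that lack `key`."""
    return [i for i, entry in enumerate(roidb) if key not in entry]


def check_cached_roidb(pkl_path, key=REQUIRED_KEY):
    """Load a pickled roidb (assumed: a list of dicts) and report entries missing `key`."""
    with open(pkl_path, "rb") as f:
        roidb = pickle.load(f)
    missing = entries_missing_key(roidb, key)
    print("%d of %d entries lack %r" % (len(missing), len(roidb), key))
    return missing
```

Calling check_cached_roidb("./data/cache/ImageNetVID_DET_val_gt_roidb.pkl") on the file named in the log would show whether every entry is missing the key or only some of them. If the cache was written under Python 2 and is read under Python 3, pickle.load may additionally need encoding="latin1".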

Answer 1

MXNet usually serializes the model architecture and the trained weights directly, using methods other than pickle.

With the Gluon API, you can save the weights of a model (i.e. a Block) to a file with .save_params() and then load the weights from a file with .load_params(). You 'save' the model architecture by keeping the code used to define the model. See an example of this here.

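With the Module API, you can create checkpoints at the end of each epoch, which saves the symbol (i.e. the model architecture) and the parameters (i.e. the model weights). See here.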

import mxnet as mx

# model_prefix, net and train_iter are assumed to be defined elsewhere
checkpoint = mx.callback.do_checkpoint(model_prefix)
mod = mx.mod.Module(symbol=net)
mod.fit(train_iter, num_epoch=5, epoch_end_callback=checkpoint)

You can then load the model from a given checkpoint (e.g. epoch 42 in this example):

sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, 42)
mod.set_params(arg_params, aux_params)
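Before calling load_checkpoint, it can help to confirm that the files for that epoch actually exist. As a minimal sketch: MXNet checkpoints are written as "<prefix>-symbol.json" plus "<prefix>-<epoch, zero-padded to 4 digits>.params", so the expected paths can be computed with the standard library alone. The helper names below and the prefix used in the usage note are hypothetical.

```python
import os


def checkpoint_files(model_prefix, epoch):
    """Return the (symbol, params) paths that a checkpoint for `epoch` is stored under."""
    symbol_path = "%s-symbol.json" % model_prefix
    params_path = "%s-%04d.params" % (model_prefix, epoch)
    return symbol_path, params_path


def missing_checkpoint_files(model_prefix, epoch):
    """Return the checkpoint paths that do not exist on disk (empty list if all are present)."""
    return [p for p in checkpoint_files(model_prefix, epoch) if not os.path.isfile(p)]
```

For example, checkpoint_files("output/rfcn", 42) gives ("output/rfcn-symbol.json", "output/rfcn-0042.params"). A non-empty result from missing_checkpoint_files means the requested epoch was never saved under that prefix.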

Tags: python, training-data, neural-network, deep-learning, mxnet
