TensorFlow API Slim: How to set checkpoint_exclude_scopes and output_node_names for VGG-Net 16?

Question

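I am currently trying to train a classification network with the TensorFlow API (https://github.com/tensorflow/models). After creating TFRecords for my dataset (stored in research/slim/data), I train the network with the following command: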

python research/slim/train_image_classifier.py \
--train_dir=research/slim/training/current_model \
--dataset_name=my_dataset \
--dataset_split_name=train \
--dataset_dir=research/slim/data \
--model_name=vgg_16 \
--checkpoint_path=research/slim/training/vgg_16_2016_08_28/vgg_16.ckpt \
--checkpoint_exclude_scopes=vgg_16/fc7,vgg_16/fc8 \
--trainable_scopes=vgg_16/fc7,vgg_16/fc8 \
--batch_size=5 \
--log_every_n_steps=10 \
--max_number_of_steps=1000

This works for several classification networks (Inception, ResNet, MobileNet), but not for VGG-Net. I am fine-tuning the following VGG-Net 16 model: http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz

In general, training this model runs, but the loss increases instead of decreasing. Perhaps this is because of my choice of 'checkpoint_exclude_scopes'.

Is it correct to use the last fully connected layers as checkpoint_exclude_scopes?

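The same question arises for the 'output_node_names' parameter when freezing the graph. For InceptionV3, for example, 'output_node_names=InceptionV3/Predictions/Reshape_1' works. But how should this parameter be set for VGG-Net? I tried the following: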

python research/slim/freeze_graph.py \
--input_graph=research/slim/training/current_model/graph.pb \
--input_checkpoint=research/slim/training/current_model/model.ckpt \
--input_binary=true \
--output_graph=research/slim/training/current_model/frozen_inference_graph.pb \
--output_node_names=vgg_16/fc8

I did not find any layer containing 'Predictions' or 'Logits' in the VGG-Net model, so I am not sure.

Thanks for your help!

Answer 1

I tried running train_image_classifier.py with your script, with a few changes as described below:

Changed train_dir, dataset_dir and checkpoint_path to my local paths
Added the --clone_on_cpu=True argument to the command, because I am running on a CPU
Removed the argument dataset_name=my_dataset, because it gave me an error

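It ran fine. The loss started as high as 448 and then slowly decreased, reaching 3.5 by the end of step 1000. It did fluctuate a lot, but the overall trend of the loss was downward. I am not sure why you do not see the same when you run it.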

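Regarding your questions about checkpoint_exclude_scopes while training and output_node_names while freezing the graph, I think your choice of layers is absolutely fine. However, I would rather train only the last fully connected layer (fc8) for faster convergence.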
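
To make the matching concrete, here is a minimal sketch in plain Python (no TensorFlow needed) of how I understand these settings to be interpreted. As far as I know, checkpoint_exclude_scopes and trainable_scopes are comma-separated scope prefixes matched against variable names, while output_node_names must name graph nodes exactly. The variable and node lists below are hypothetical examples that follow slim's VGG-16 naming, and the helper names are my own, so please check them against your own model and graph.pb rather than taking them as given.

```python
# Hypothetical variable names following slim's VGG-16 naming (assumption).
MODEL_VARIABLES = [
    "vgg_16/conv1/conv1_1/weights",
    "vgg_16/conv5/conv5_3/weights",
    "vgg_16/fc6/weights",
    "vgg_16/fc7/weights",
    "vgg_16/fc7/biases",
    "vgg_16/fc8/weights",
    "vgg_16/fc8/biases",
]

# Hypothetical graph node names (assumption); list the real ones from graph.pb.
GRAPH_NODES = [
    "input",
    "vgg_16/fc8/Conv2D",
    "vgg_16/fc8/BiasAdd",
    "vgg_16/fc8/squeezed",
]


def split_flag(value):
    """Split a comma-separated flag value into stripped, non-empty entries."""
    return [part.strip() for part in value.split(",") if part.strip()]


def under_scopes(names, scopes):
    """Return the names that start with any of the given scope prefixes."""
    return [n for n in names if any(n.startswith(s) for s in scopes)]


def restored_from_checkpoint(names, exclude_scopes):
    """Return the names that are not excluded, i.e. loaded from the checkpoint."""
    excluded = set(under_scopes(names, split_flag(exclude_scopes)))
    return [n for n in names if n not in excluded]


def missing_output_nodes(node_names, output_node_names):
    """Return requested output names that match no graph node exactly."""
    existing = set(node_names)
    return [n for n in split_flag(output_node_names) if n not in existing]


if __name__ == "__main__":
    # Excluding only fc8 keeps the pretrained fc7 weights.
    print(restored_from_checkpoint(MODEL_VARIABLES, "vgg_16/fc8"))
    # Variables selected for training when only fc8 is trainable.
    print(under_scopes(MODEL_VARIABLES, split_flag("vgg_16/fc8")))
    # Output names that would not be found in the hypothetical node list.
    print(missing_output_nodes(GRAPH_NODES, "vgg_16/fc8"))
```

With the hypothetical lists above, setting both flags to vgg_16/fc8 restores everything up to fc7 from the checkpoint and trains only the two fc8 variables. The last call also shows why it is worth printing the node names of your graph.pb before freezing: if no node is named exactly vgg_16/fc8, a more specific name under that scope would be needed.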
