How to compare training and test performance in a Faster RCNN object detection model
I am following a tutorial here to implement Faster RCNN on a custom dataset with PyTorch.
Here is my training loop:
for images, targets in metric_logger.log_every(data_loader, print_freq, header):
    # Move the batch to the GPU
    images = list(image.to(device) for image in images)
    targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
    # In train mode the model returns a dict of losses
    loss_dict = model(images, targets)
    losses = sum(loss for loss in loss_dict.values())
    # reduce losses over all GPUs for logging purposes
    loss_dict_reduced = reduce_dict(loss_dict)
    losses_reduced = sum(loss for loss in loss_dict_reduced.values())
    loss_value = losses_reduced.item()
The metric logger (defined here) prints the following to the console during training; each metric is shown as the recent median with the running global average in parentheses:
Epoch: [0] [ 0/226] eta: 0:07:57 lr: 0.000027 loss: 6.5019 (6.5019) loss_classifier: 0.8038 (0.8038) loss_box_reg: 0.1398 (0.1398) loss_objectness: 5.2717 (5.2717) loss_rpn_box_reg: 0.2866 (0.2866) time: 2.1142 data: 0.1003 max mem: 3827
Epoch: [0] [ 30/226] eta: 0:02:28 lr: 0.000693 loss: 1.3016 (2.4401) loss_classifier: 0.2914 (0.4067) loss_box_reg: 0.2294 (0.2191) loss_objectness: 0.3558 (1.2913) loss_rpn_box_reg: 0.3749 (0.5230) time: 0.7128 data: 0.0923 max mem: 4341
After an epoch completes, I call an evaluate method that outputs the following:
Test: [ 0/100] eta: 0:00:25 model_time: 0.0880 (0.0880) evaluator_time: 0.1400 (0.1400) time: 0.2510 data: 0.0200 max mem: 4703
Test: [ 99/100] eta: 0:00:00 model_time: 0.0790 (0.0786) evaluator_time: 0.0110 (0.0382) time: 0.1528 data: 0.0221 max mem: 4703
Test: Total time: 0:00:14 (0.1401 s / it)
Averaged stats: model_time: 0.0790 (0.0786) evaluator_time: 0.0110 (0.0382)
Accumulating evaluation results...
DONE (t=0.11s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.263
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.346
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.304
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.208
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.308
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.013
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.027
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.175
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.311
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.264
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.351
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.086
I'm a little confused about the different metrics used during training and testing - I'd like to plot training + validation loss (or equivalent IoU values) so that I can visualize training and test performance and check whether any overfitting is occurring.
My question is, how can I compare the training and test performance of the model?
The evaluate() function here doesn't calculate any loss. And if you look at how the loss is calculated in train_one_epoch() here, you'll see that you actually need the model to be in train mode. So write a function that behaves like train_one_epoch(), except that it doesn't update the weights, like this:
import torch
import utils  # the utils module from the torchvision detection references

@torch.no_grad()
def evaluate_loss(model, data_loader, device):
    # train mode so the forward pass returns the loss dict; torch.no_grad()
    # plus the absence of an optimizer step means no weights are updated
    model.train()
    metric_logger = utils.MetricLogger(delimiter="  ")
    header = 'Test:'
    for images, targets in metric_logger.log_every(data_loader, 100, header):
        images = list(image.to(device) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        loss_dict = model(images, targets)
        losses = sum(loss for loss in loss_dict.values())
        # reduce losses over all GPUs for logging purposes
        loss_dict_reduced = utils.reduce_dict(loss_dict)
        losses_reduced = sum(loss for loss in loss_dict_reduced.values())
        metric_logger.update(loss=losses_reduced, **loss_dict_reduced)
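If you just want one number per epoch to plot, a slimmed-down variant can return the average validation loss directly instead of going through the MetricLogger. A minimal sketch (the validation_loss name and the manual averaging are my own additions, not part of the tutorial's engine.py):

import torch

@torch.no_grad()
def validation_loss(model, data_loader, device):
    # train mode so the forward pass returns the loss dict; torch.no_grad()
    # and the missing optimizer step mean the weights are never updated
    model.train()
    total_loss, num_batches = 0.0, 0
    for images, targets in data_loader:
        images = [image.to(device) for image in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        loss_dict = model(images, targets)
        total_loss += sum(loss.item() for loss in loss_dict.values())
        num_batches += 1
    return total_loss / max(num_batches, 1)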
But you need the model to be in eval mode to get predicted bounding boxes, so if you also want mAP you still need the original code's loop.
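Putting it together, the per-epoch loop can record both loss curves and still compute mAP. A sketch assuming the tutorial's engine helpers (train_one_epoch, evaluate), the validation_loss function above, and your own num_epochs, optimizer, train_loader, and val_loader; the list bookkeeping and the matplotlib plot are my own additions:

import matplotlib.pyplot as plt

train_losses, val_losses = [], []
for epoch in range(num_epochs):
    # train_one_epoch returns its MetricLogger in recent versions of engine.py;
    # meters['loss'].global_avg is the epoch-averaged total training loss
    train_logger = train_one_epoch(model, optimizer, train_loader, device, epoch, print_freq=100)
    train_losses.append(train_logger.meters['loss'].global_avg)
    # validation loss, computed with the same loss definition as training
    val_losses.append(validation_loss(model, val_loader, device))
    # COCO mAP on the validation set; evaluate() switches to eval mode itself
    evaluate(model, val_loader, device=device)

# plotting both curves together makes divergence (overfitting) easy to spot
plt.plot(train_losses, label='training loss')
plt.plot(val_losses, label='validation loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()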