Use only certain layers of pretrained torchvision network

Question

I am trying to use only certain layers of a pretrained torchvision Faster-RCNN network, initialized by:

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

This works. However, passing model.modules() or model.children() into an nn.Sequential yields an error. Even passing the entire model this way results in errors, e.g.

model = torch.nn.Sequential(*model.modules())
model.eval()
# x is a [C, H, W] image
y = model(x)

results in

AttributeError: 'dict' object has no attribute 'dim'

and

model = torch.nn.Sequential(*model.children())
model.eval()
# x is a [C, H, W] image
y = model(x)

results in

TypeError: conv2d(): argument 'input' (position 1) must be Tensor, not tuple

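This confuses me because I have modified other PyTorch pretrained models like that in the past. How can I use the FasterRCNN pretrained model to create a new (pretrained) model that uses only certain layers, e.g. all layers but the last one?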

Answer 1

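Unlike other simple CNN models, converting an R-CNN-based detector into a plain nn.Sequential model is not trivial. If you look at how R-CNN works ('generalized_rcnn.py'), you will see that the output features (computed by the FCN backbone) are not just passed to the RPN component; they are combined with the input image and even with the targets (during training).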

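Therefore, I suppose that if you want to change the way Faster R-CNN behaves, you will have to use the base class torchvision.models.detection.FasterRCNN() and provide it with different RoI pooling parameters.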

machine-learning

image-processing

object-detection

computer-vision

pytorch