Adding a simple attention layer to a custom ResNet-18 architecture causes an error in the forward pass

I am adding the following code to the constructor of my custom ResNet-18 implementation:

self.layer1 = self._make_layer(block, 64, layers[0]) ## code existed before
self.layer2 = self._make_layer(block, 128, layers[1], stride=2) ## code existed before
self.layer_attend1 =  nn.Sequential(nn.Conv2d(layers[0], layers[0], stride=2, padding=1, kernel_size=3),
                                     nn.AdaptiveAvgPool2d(1),
                                     nn.Softmax(1)) ## code added by me

and the following to the forward pass (def forward(self, x)) of the same custom ResNet-18 code:

x = self.layer1(x) ## the code existed before
x = self.layer_attend1(x)*x ## I added this code
x = self.layer2(x) ## the code existed before

I get the error below. There was no error before I added this attention layer. Any idea how I can fix it?

=> loading checkpoint 'runs/nondisjoint_l2norm/model_best.pth.tar'
=> loaded checkpoint 'runs/nondisjoint_l2norm/model_best.pth.tar' (epoch 5)
/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
Traceback (most recent call last):
  File "main.py", line 352, in <module>
    main()    
  File "main.py", line 153, in main
    test_acc = test(test_loader, tnet)
  File "main.py", line 248, in test
    embeddings.append(tnet.embeddingnet(images).data)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch3/research/code/fashion/fashion-compatibility/type_specific_network.py", line 101, in forward
    embedded_x = self.embeddingnet(x)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch3/research/code/fashion/fashion-compatibility/Resnet_18.py", line 110, in forward
    x = self.layer_attend1(x)*x #so we don;t use name x1
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 443, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 439, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [2, 2, 3, 3], expected input[256, 64, 28, 28] to have 2 channels, but got 64 channels instead

In VS Code, even though I set a breakpoint right before the problematic layer, execution never reached it.

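The problem comes from a mix-up: layers[0] is not the number of output channels, as you might expect, but the number of blocks that layer will contain. Here layers[0] is 2, which is why the error reports a convolution weight of size [2, 2, 3, 3] while the input has 64 channels. What you actually need is 64, the number of output channels of layer1, which comes right before your custom code: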

self.layer_attend1 =  nn.Sequential(nn.Conv2d(64, 64, stride=2, padding=1, kernel_size=3),
                                    nn.AdaptiveAvgPool2d(1),
                                    nn.Softmax(1)) ## code added by me
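
To check the fix in isolation, here is a minimal sketch that reproduces the channel mismatch and then applies the corrected layer. It assumes layers = [2, 2, 2, 2] (consistent with the [2, 2, 3, 3] weight in the error message) and uses a random tensor as a stand-in for the output of layer1; the helper name make_attend is hypothetical and not part of the original code.

```python
import torch
import torch.nn as nn

# Assumption: the custom ResNet-18 is built with layers = [2, 2, 2, 2],
# which matches the [2, 2, 3, 3] weight shape in the error message.
layers = [2, 2, 2, 2]


def make_attend(channels):
    # Hypothetical helper: same structure as layer_attend1 in the question,
    # parameterised by the number of channels.
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1),
        nn.AdaptiveAvgPool2d(1),  # -> (N, channels, 1, 1)
        nn.Softmax(dim=1),        # normalise across the channel dimension
    )


# Stand-in for the output of layer1: a small batch with 64 channels at 28x28.
x = torch.randn(4, 64, 28, 28)

# Original version: Conv2d(2, 2, ...) rejects a 64-channel input.
broken = make_attend(layers[0])
try:
    broken(x)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)

# Fixed version: use the channel count produced by layer1.
fixed = make_attend(64)
weights = fixed(x)   # one weight per channel, shape (4, 64, 1, 1)
out = weights * x    # broadcasts over the spatial dimensions
print(weights.shape, out.shape)
```

The per-channel weights broadcast over height and width, so the product keeps the shape of x and layer2 receives the 64-channel input it expects. If your block class defines an expansion attribute, as torchvision's blocks do, the channel count after layer1 would be 64 * block.expansion; for the basic block used in ResNet-18 that is still 64.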