Adding a simple attention layer to a custom ResNet-18 architecture causes an error in the forward pass
I am adding the following code to my custom ResNet-18 implementation:
self.layer1 = self._make_layer(block, 64, layers[0]) ## code existed before
self.layer2 = self._make_layer(block, 128, layers[1], stride=2) ## code existed before
self.layer_attend1 = nn.Sequential(nn.Conv2d(layers[0], layers[0], stride=2, padding=1, kernel_size=3),
nn.AdaptiveAvgPool2d(1),
nn.Softmax(1)) ## code added by me
and the following inside the forward pass (def forward(self, x)) of the same custom ResNet-18 code:
x = self.layer1(x) ## the code existed before
x = self.layer_attend1(x)*x ## I added this code
x = self.layer2(x) ## the code existed before
I get the error below. There was no error before I added this attention layer. Any idea how I can fix it?
=> loading checkpoint 'runs/nondisjoint_l2norm/model_best.pth.tar'
=> loaded checkpoint 'runs/nondisjoint_l2norm/model_best.pth.tar' (epoch 5)
/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
Traceback (most recent call last):
File "main.py", line 352, in <module>
main()
File "main.py", line 153, in main
test_acc = test(test_loader, tnet)
File "main.py", line 248, in test
embeddings.append(tnet.embeddingnet(images).data)
File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/scratch3/research/code/fashion/fashion-compatibility/type_specific_network.py", line 101, in forward
embedded_x = self.embeddingnet(x)
File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/scratch3/research/code/fashion/fashion-compatibility/Resnet_18.py", line 110, in forward
x = self.layer_attend1(x)*x #so we don;t use name x1
File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
input = module(input)
File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 443, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 439, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [2, 2, 3, 3], expected input[256, 64, 28, 28] to have 2 channels, but got 64 channels instead
In VSCode, even though I set a breakpoint before the problematic layer, execution never even reaches the breakpoint.
The problem comes from a confusion: layers[0] is not the number of output channels, as you might expect, but the number of blocks that stage will have. What you actually need is 64, the number of output channels of layer1 in the pre-existing code:
self.layer_attend1 = nn.Sequential(nn.Conv2d(64, 64, stride=2, padding=1, kernel_size=3),
nn.AdaptiveAvgPool2d(1),
nn.Softmax(1)) ## code added by me
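For reference, in torchvision-style ResNet code the `layers` argument holds block counts per stage (e.g. `[2, 2, 2, 2]` for ResNet-18), while the stage output channels are fixed at 64/128/256/512. A minimal sketch of the shape mismatch, assuming the standard Conv2d weight layout (out, in, kH, kW) — plain Python, no torch needed:

```python
# Block counts per stage for ResNet-18 (what gets passed to _make_layer),
# versus the output channels each stage actually produces.
resnet18_layers = [2, 2, 2, 2]            # number of BasicBlocks per stage
stage_out_channels = [64, 128, 256, 512]  # channels coming out of each stage

def conv2d_weight_shape(in_channels, out_channels, kernel_size):
    """Shape of an nn.Conv2d weight tensor: (out, in, kH, kW)."""
    return (out_channels, in_channels, kernel_size, kernel_size)

# The buggy attention conv used layers[0] == 2 for both channel arguments,
# which reproduces the "weight of size [2, 2, 3, 3]" from the traceback:
buggy = conv2d_weight_shape(resnet18_layers[0], resnet18_layers[0], 3)
print(buggy)   # (2, 2, 3, 3) -- cannot consume a 64-channel input

# layer1's real output has 64 channels, so the conv must be declared with 64:
fixed = conv2d_weight_shape(stage_out_channels[0], stage_out_channels[0], 3)
print(fixed)   # (64, 64, 3, 3) -- matches input[256, 64, 28, 28]
```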