在 Pytorch 中实现 U-Net(编码器部分)时出现形状错误
Shape Error while implementing U-Net (Encoder Part) in Pytorch
我正在尝试从头开始学习构建 U-NET architecture。我已经编写了这段代码,但问题是当我尝试 运行 检查编码器部分的输出时,我遇到了问题。当您 运行 下面的代码时,您将得到
import torch
import torch.nn as nn
batch = 1
channels = 3
width = 512 # same as height
image = torch.randn(batch, channels, width, width)
enc = Encoder(channels)
enc(image)
RuntimeError: Given groups=1, weight of size [128, 64, 3, 3], expected input[1, 3, 512, 512] to have 64 channels, but got 3 channels instead
代码如下:
class ConvolutionBlock(nn.Module):
'''
The basic Convolution Block Which Will have Convolution -> RelU -> Convolution -> RelU
'''
def __init__(self, in_channels, out_channels, upsample:bool = False,):
'''
args:
upsample: If True, then use TransposedConv2D (Means it being used in the decoder part) instead MaxPooling
batch_norm was introduced after UNET so they did not know if it existed. Might be useful
'''
super().__init__()
self.network = nn.Sequential(
nn.Conv2d(in_channels, out_channels, kernel_size = 3, padding= 1), # padding is 0 by default, 1 means the input width, height == out width, height
nn.ReLU(),
nn.Conv2d(out_channels, out_channels, kernel_size = 3, padding = 1),
nn.ReLU(),
nn.MaxPool2d(kernel_size = 2, stride = 2) if not upsample else nn.ConvTranspose2d(out_channels, out_channels//2, kernel_size = 2, ) # As it is said in the paper that it TransPose2D halves the features
)
def forward(self, feature_map_x):
'''
feature_map_x could be the image itself or the
'''
return self.network(feature_map_x)
class Encoder(nn.Module):
'''
'''
def __init__(self, image_channels:int = 1, repeat:int = 4):
'''
In UNET, the features start at 64 and keeps getting twice the size of the previous one till it reached BottleNeck
'''
super().__init__()
in_channels = [image_channels,64, 128, 256, 512]
out_channels = [64, 128, 256, 512, 1024]
self.layers = nn.ModuleList(
[ConvolutionBlock(in_channels = in_channels[i], out_channels = out_channels[i]) for i in range(repeat+1)]
)
def forward(self, feature_map_x):
for layer in self.layers:
out = layer(feature_map_x)
return out
编辑:运行 下面的代码也给了我预期的信息:
in_ = [3,64, 128, 256, 512]
ou_ = [64, 128, 256, 512, 1024]
width = 512
from torchsummary import summary
for i in range(5):
cb = ConvolutionBlock(in_[i], ou_[i])
summary(cb, (in_[i],width,width))
print('#'*50)
Encoder
的 forward
中存在代码逻辑错误
我做到了:
for layer in self.layers:
out = layer(feature_map_x)
return out
但我应该使用 feature_map_x 作为输入,因为循环之前在原始特征图上迭代,但它应该得到前一个的输出层.
for layer in self.layers:
feature_map_x = layer(feature_map_x)
return feature_map_x
我正在尝试从头开始学习构建 U-NET architecture。我已经编写了这段代码,但问题是当我尝试 运行 检查编码器部分的输出时,我遇到了问题。当您 运行 下面的代码时,您将得到
import torch
import torch.nn as nn
batch = 1
channels = 3
width = 512 # same as height
image = torch.randn(batch, channels, width, width)
enc = Encoder(channels)
enc(image)
RuntimeError: Given groups=1, weight of size [128, 64, 3, 3], expected input[1, 3, 512, 512] to have 64 channels, but got 3 channels instead
代码如下:
class ConvolutionBlock(nn.Module):
'''
The basic Convolution Block Which Will have Convolution -> RelU -> Convolution -> RelU
'''
def __init__(self, in_channels, out_channels, upsample:bool = False,):
'''
args:
upsample: If True, then use TransposedConv2D (Means it being used in the decoder part) instead MaxPooling
batch_norm was introduced after UNET so they did not know if it existed. Might be useful
'''
super().__init__()
self.network = nn.Sequential(
nn.Conv2d(in_channels, out_channels, kernel_size = 3, padding= 1), # padding is 0 by default, 1 means the input width, height == out width, height
nn.ReLU(),
nn.Conv2d(out_channels, out_channels, kernel_size = 3, padding = 1),
nn.ReLU(),
nn.MaxPool2d(kernel_size = 2, stride = 2) if not upsample else nn.ConvTranspose2d(out_channels, out_channels//2, kernel_size = 2, ) # As it is said in the paper that it TransPose2D halves the features
)
def forward(self, feature_map_x):
'''
feature_map_x could be the image itself or the
'''
return self.network(feature_map_x)
class Encoder(nn.Module):
'''
'''
def __init__(self, image_channels:int = 1, repeat:int = 4):
'''
In UNET, the features start at 64 and keeps getting twice the size of the previous one till it reached BottleNeck
'''
super().__init__()
in_channels = [image_channels,64, 128, 256, 512]
out_channels = [64, 128, 256, 512, 1024]
self.layers = nn.ModuleList(
[ConvolutionBlock(in_channels = in_channels[i], out_channels = out_channels[i]) for i in range(repeat+1)]
)
def forward(self, feature_map_x):
for layer in self.layers:
out = layer(feature_map_x)
return out
编辑:运行 下面的代码也给了我预期的信息:
in_ = [3,64, 128, 256, 512]
ou_ = [64, 128, 256, 512, 1024]
width = 512
from torchsummary import summary
for i in range(5):
cb = ConvolutionBlock(in_[i], ou_[i])
summary(cb, (in_[i],width,width))
print('#'*50)
Encoder
forward
中存在代码逻辑错误
我做到了:
for layer in self.layers:
out = layer(feature_map_x)
return out
但我应该使用 feature_map_x 作为输入,因为循环之前在原始特征图上迭代,但它应该得到前一个的输出层.
for layer in self.layers:
feature_map_x = layer(feature_map_x)
return feature_map_x