Concatenation step of U-Net for unequal number of channels

Question

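I am trying to implement the U-Net architecture for image segmentation. While implementing the crop-and-concatenate step in the expansive path, I cannot understand how feature maps with unequal numbers of channels are concatenated.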

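According to the architecture, the input to the first upsampling step must be concatenated with the corresponding output of the contracting path. The problem is that the contracting path has 512 channels there, while after the upsampling step there are 1024. How are they supposed to be concatenated? My code for cropping and concatenating is: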

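import torch
import torch.nn.functional as F

def crop_and_concat(self, upsampled, bypass, crop=False):
    if crop:
        # Center-crop the encoder feature map to the decoder's spatial size.
        c = (bypass.size()[2] - upsampled.size()[2]) // 2
        bypass = F.pad(bypass, (-c, -c, -c, -c))  # negative padding crops
    # Concatenate along the channel dimension (dim 1).
    return torch.cat((upsampled, bypass), 1)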

The error I get is: RuntimeError: Given groups=1, weight of size 128 256 5 5, expected input[4, 384, 64, 64] to have 256 channels, but got 384 channels instead
What am I doing wrong?

Answer 1

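First of all, you don't have to be that strict about U-Net-like architectures; many derivatives have appeared since the original (see, for example, the fastai variation with PixelShuffle).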

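For the encoder, in the basic version, your channels (per block) are:

1 - 64 - 128 - 256 - 512

This is a standard convolutional encoder. After it comes the shared layer with 1024 channels.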

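In the decoder the channel count goes back down, but each block receives more channels because you concatenate the encoder state from the corresponding block.

It would be: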

1024 -> 512 -> 512 (decoder) + 512 (encoder), 1024 total -> 512

512 -> 256 -> 256 (decoder) + 256 (encoder), 512 total -> 256

and so on.
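The channel bookkeeping above can be sketched in plain Python. This is only an illustration of the scheme, assuming each upsampling step halves the channel count and each block reduces the concatenated tensor back to the encoder's width; the function name `decoder_channel_plan` is hypothetical.

```python
def decoder_channel_plan(encoder_channels, bottleneck_channels):
    """Return (upsampled, skip, concatenated, output) channels per decoder block.

    Assumes the upsampling step halves the channels and the block's
    convolutions reduce the concatenated tensor to the skip's width.
    """
    plan = []
    in_ch = bottleneck_channels
    for skip_ch in reversed(encoder_channels):
        up_ch = in_ch // 2        # channels after upsampling
        cat_ch = up_ch + skip_ch  # channels after concatenating the skip
        out_ch = skip_ch          # channels after the block's convolutions
        plan.append((up_ch, skip_ch, cat_ch, out_ch))
        in_ch = out_ch
    return plan


plan = decoder_channel_plan([64, 128, 256, 512], 1024)
for up_ch, skip_ch, cat_ch, out_ch in plan:
    print(f"{up_ch} (decoder) + {skip_ch} (encoder), {cat_ch} total -> {out_ch}")
```

The third value in each tuple is the `in_channels` that the first convolution after the concatenation has to accept.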

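In your case, the 256 channels coming from the decoder were accounted for, but the 128 coming from the encoder were not. The convolution that follows the concatenation expects 256 input channels, while the concatenated tensor has 256 + 128 = 384. Simply increase that layer's input channels to 256 + 128, and follow the scheme above for every block of your U-Net.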
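A minimal sketch that reproduces the mismatch and the fix, using the shapes from the error message (batch 4, 64 x 64 feature maps, 5 x 5 kernel, 128 output channels). The names `conv_too_narrow` and `conv_fixed` and the choice of `padding=2` are assumptions for illustration, not taken from the asker's model.

```python
import torch
import torch.nn as nn

upsampled = torch.randn(4, 256, 64, 64)  # decoder features after upsampling
bypass = torch.randn(4, 128, 64, 64)     # encoder features from the skip path
merged = torch.cat((upsampled, bypass), dim=1)  # 256 + 128 = 384 channels

# Sized for the decoder channels only, like the layer in the error message.
conv_too_narrow = nn.Conv2d(256, 128, kernel_size=5, padding=2)
try:
    conv_too_narrow(merged)
    failed = False
except RuntimeError:
    failed = True  # expected: 256 channels wanted, 384 given

# Sized for decoder + encoder channels; padding=2 is an assumed choice
# that keeps the 64 x 64 spatial size with a 5 x 5 kernel.
conv_fixed = nn.Conv2d(256 + 128, 128, kernel_size=5, padding=2)
out = conv_fixed(merged)
print(failed, tuple(out.shape))
```

The same adjustment applies to every decoder block: the first convolution after `torch.cat` takes the sum of the two channel counts as its `in_channels`.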
