What's the correct way to implement convolutional blocks of specified depth?

I'm trying to implement Bayesian optimization with a convolutional neural network in PyTorch; specifically, I'm trying to port the network architecture from the MATLAB BayesOptExperiment example to PyTorch. I want my network to have the following structure:

input data -> convblock -> maxpool -> convblock -> maxpool -> convblock -> avgpool -> flatten -> linear -> softmax

where a convblock consists of:

[conv2Dlayer -> batch normalization layer -> ReLU],

repeated several times. The current version only works as expected with section_depth = 1, reaching around 65-70% accuracy; as soon as I increase the convblock depth, accuracy plummets to around 10%. I'm definitely missing something, but I'm not sure what. My network structure:

import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F

#...

class Net(nn.Module):
    def __init__(self, section_depth):
        super().__init__()
        #! define network architecture
        self.section_depth = section_depth
        self.num_filters = round(16/np.sqrt(self.section_depth))
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.avgpool = nn.AvgPool2d(kernel_size=8)
        self.block1 = nn.ModuleList()
        self.block2 = nn.ModuleList()
        self.block3 = nn.ModuleList()
        self.batchnorm1 = nn.BatchNorm2d(self.num_filters)
        self.batchnorm2 = nn.BatchNorm2d(2*self.num_filters)
        self.batchnorm3 = nn.BatchNorm2d(4*self.num_filters)
        for i in range(self.section_depth):
            channels1 = 3 if i==0 else self.num_filters
            channels2 = self.num_filters if i == 0 else 2*self.num_filters
            channels3 = 2*self.num_filters if i == 0 else 4*self.num_filters
            self.block1.append(nn.Conv2d(in_channels=channels1, out_channels=self.num_filters, kernel_size=3, padding='same'))
            self.block2.append(nn.Conv2d(in_channels=channels2, out_channels=2*self.num_filters, kernel_size=3, padding='same'))
            self.block3.append(nn.Conv2d(in_channels=channels3, out_channels=4*self.num_filters, kernel_size=3, padding='same'))
        self.fc1 = nn.Linear(4*self.num_filters, 10)  # number of output classes
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        for conv in self.block1:
            x = F.relu(self.batchnorm1(conv(x)))
        x = self.maxpool(x)
        for conv in self.block2:
            x = F.relu(self.batchnorm2(conv(x)))
        x = self.maxpool(x)
        for conv in self.block3:
            x = F.relu(self.batchnorm3(conv(x)))
        x = self.avgpool(x)
        x = torch.flatten(x, 1) # flatten all dimensions except batch
        x = self.fc1(x)
        x = self.softmax(x)
        return x

Any help would be greatly appreciated.

OK, I figured it out. It turns out that batch normalization layers have learnable parameters (and per-layer running mean/variance statistics), so I had to create a separate batch normalization layer for each convolutional layer, rather than reusing the same one for the whole convblock.
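
For anyone hitting the same issue, here is a minimal sketch of the fix (same architecture and 10-class output as above; the only change is that each convolution now gets its own BatchNorm2d, kept in a parallel ModuleList):

import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, section_depth):
        super().__init__()
        self.section_depth = section_depth
        self.num_filters = round(16 / np.sqrt(self.section_depth))
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.avgpool = nn.AvgPool2d(kernel_size=8)
        # one BatchNorm2d per conv layer: each keeps its own learnable
        # affine parameters and its own running statistics
        self.block1, self.bn1 = nn.ModuleList(), nn.ModuleList()
        self.block2, self.bn2 = nn.ModuleList(), nn.ModuleList()
        self.block3, self.bn3 = nn.ModuleList(), nn.ModuleList()
        for i in range(self.section_depth):
            channels1 = 3 if i == 0 else self.num_filters
            channels2 = self.num_filters if i == 0 else 2*self.num_filters
            channels3 = 2*self.num_filters if i == 0 else 4*self.num_filters
            self.block1.append(nn.Conv2d(channels1, self.num_filters, kernel_size=3, padding='same'))
            self.bn1.append(nn.BatchNorm2d(self.num_filters))
            self.block2.append(nn.Conv2d(channels2, 2*self.num_filters, kernel_size=3, padding='same'))
            self.bn2.append(nn.BatchNorm2d(2*self.num_filters))
            self.block3.append(nn.Conv2d(channels3, 4*self.num_filters, kernel_size=3, padding='same'))
            self.bn3.append(nn.BatchNorm2d(4*self.num_filters))
        self.fc1 = nn.Linear(4*self.num_filters, 10)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        for conv, bn in zip(self.block1, self.bn1):
            x = F.relu(bn(conv(x)))
        x = self.maxpool(x)
        for conv, bn in zip(self.block2, self.bn2):
            x = F.relu(bn(conv(x)))
        x = self.maxpool(x)
        for conv, bn in zip(self.block3, self.bn3):
            x = F.relu(bn(conv(x)))
        x = self.avgpool(x)
        x = torch.flatten(x, 1)  # flatten all dimensions except batch
        x = self.fc1(x)
        return self.softmax(x)

A quick shape check, assuming 32x32 RGB inputs (which is what the 8x8 average pool after two 2x maxpools implies, e.g. CIFAR-10):

net = Net(section_depth=2)
out = net(torch.randn(4, 3, 32, 32))
print(out.shape)  # torch.Size([4, 10])

One caveat: if you train this with nn.CrossEntropyLoss, drop the final Softmax, since that loss already applies log-softmax internally.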