forward() using PyTorch Lightning not giving consistent binary classification results for single vs. multiple images

Question

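I trained a variational autoencoder (VAE) with an additional fully connected layer after the encoder for binary image classification. It is set up using PyTorch Lightning. The encoder/decoder is resnet18 from the PyTorch Lightning Bolts repository.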

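import torch
from torch import nn
from pytorch_lightning import LightningModule
from pl_bolts.models.autoencoders.components import (
    resnet18_encoder,
    resnet18_decoder,
)

class VariationalAutoencoder(LightningModule):

    ...

    def __init__(self):
        super().__init__()
        ...  # remaining setup (fc_mu, fc_var, decoder, sample) elided in the original post
        self.first_conv: bool = False
        self.maxpool1: bool = False
        self.enc_out_dim: int = 512
        self.encoder = resnet18_encoder(self.first_conv, self.maxpool1)
        # Extra head: a single output for binary classification on the encoder features
        self.fc_object_identity = nn.Linear(self.enc_out_dim, 1)

    def forward(self, x):
        x_encoded = self.encoder(x)
        mu = self.fc_mu(x_encoded)
        log_var = self.fc_var(x_encoded)
        p, q, z = self.sample(mu, log_var)

        # The classification score is computed from the encoder output, not from z
        x_classification_score = torch.sigmoid(self.fc_object_identity(x_encoded))

        # Returns the reconstruction and the classification score
        return self.decoder(z), x_classification_score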

variational_autoencoder = VariationalAutoencoder.load_from_checkpoint(
    checkpoint_path=str(checkpoint_file_path)
)

with torch.no_grad():
    predicted_images, classification_score = variational_autoencoder(test_images)

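Reconstruction works well through forward() for both a single image and multiple images. However, when I pass multiple images to forward(), the classification scores differ from the ones I get when I pass each image as a single-image tensor: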

# Image 1 (class=1) [1, 3, 64, 64]
x_classification_score = 0.9857

# Image 2 (class=0) [1, 3, 64, 64]
x_classification_score = 0.0175

# Image 1 and 2 [2, 3, 64, 64]
x_classification_score = [[0.8943],
                          [0.1736]]

Why is this happening?

Answer 1

You are using resnet18, which contains torch.nn.BatchNorm2d layers.

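The behavior of this layer depends on whether the model is in train or eval mode. In train mode it computes the mean and variance of the current batch and normalizes with those, so the output for one image depends on the other examples in the same batch.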
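The batch dependence in train mode can be reproduced with a single layer. The following is a minimal sketch, not the model from the question: one nn.BatchNorm2d layer stands in for the ones inside resnet18, and the random tensors are stand-ins for the two test images.

```python
import torch
from torch import nn

torch.manual_seed(0)

# A single BatchNorm2d layer stands in for the ones inside resnet18 (assumption).
bn = nn.BatchNorm2d(3)
bn.train()  # train mode: normalize with the statistics of the current batch

# Random stand-ins for two test images; the second one is shifted so the
# two images clearly have different statistics.
images = torch.randn(2, 3, 64, 64)
images[1] += 1.0

with torch.no_grad():
    out_single = bn(images[:1])   # image 1 passed on its own
    out_batched = bn(images)[:1]  # image 1 passed together with image 2

# The same image is normalized differently depending on what else is in the batch.
outputs_match_in_train_mode = torch.allclose(out_single, out_batched, atol=1e-6)
print(outputs_match_in_train_mode)
```

Note that torch.no_grad() is active here and the outputs still differ: it only disables gradient tracking and does not change how BatchNorm2d normalizes.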

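In eval mode, the layer instead uses the mean and variance that were accumulated as moving averages during training. These are independent of the batch, so the result for an image is the same whether it is passed alone or together with others. Note that torch.no_grad() only disables gradient tracking and does not switch the mode, so call variational_autoencoder.eval() before running inference.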
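A minimal sketch of the eval-mode behavior follows. The toy_classifier below is a hypothetical stand-in for the encoder plus classification head, not the model from the question, and the random tensors replace the real test images. For the question's model, the analogous step would be calling variational_autoencoder.eval() before the torch.no_grad() block.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for "encoder with BatchNorm + sigmoid classification head".
toy_classifier = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 1),
    nn.Sigmoid(),
)

test_images = torch.randn(2, 3, 64, 64)  # random stand-ins for two test images

# eval mode: BatchNorm2d uses its stored running statistics, not the batch's.
toy_classifier.eval()

with torch.no_grad():
    scores_batched = toy_classifier(test_images)
    scores_single = torch.cat(
        [toy_classifier(image.unsqueeze(0)) for image in test_images]
    )

# Each image now gets the same score whether it is passed alone or in a batch.
scores_match_in_eval_mode = torch.allclose(scores_batched, scores_single, atol=1e-5)
print(scores_match_in_eval_mode)
```

Calling .train() on the module switches back to batch statistics, which is what training code expects.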

classification

autoencoder

pytorch

pytorch-lightning

image-classification