Variational Autoencoder (VAE) returns consistent output

I am using a VAE for signal compression and reconstruction. I trained it on 1600 segments, but the 1600 reconstructed signals all have very similar values, and the results within the same batch are almost identical. Because this is a VAE, the loss function includes a binary cross-entropy (BCE) term, so the trained model's output should lie between 0 and 1 (the input data is also normalized to the 0~1 range).

The VAE model (LSTM):

import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTM_VAE(nn.Module):
    def __init__(self,
                 input_size=3000,
                 hidden=[1024, 512, 256, 128, 64],
                 latent_size=64,
                 num_layers=8,
                 bidirectional=True):
        super().__init__()

        self.input_size = input_size
        self.hidden = hidden
        self.latent_size = latent_size
        self.num_layers = num_layers
        self.bidirectional = bidirectional

        self.actv = nn.LeakyReLU()

        # Encoder: bidirectional LSTM followed by FC layers producing mu and log_var.
        self.encode = nn.LSTM(input_size=self.input_size,
                              hidden_size=self.hidden[0],
                              num_layers=self.num_layers,
                              batch_first=True,
                              bidirectional=True)
        self.bn_encode = nn.BatchNorm1d(1)

        # Decoder: bidirectional LSTM followed by FC layers mapping back to input_size.
        self.decode = nn.LSTM(input_size=self.latent_size,
                              hidden_size=self.hidden[2],
                              num_layers=self.num_layers,
                              batch_first=True,
                              bidirectional=True)
        self.bn_decode = nn.BatchNorm1d(1)

        self.fc1 = nn.Linear(self.hidden[0] * 2, self.hidden[1])
        self.fc2 = nn.Linear(self.hidden[1], self.hidden[2])
        self.fc31 = nn.Linear(self.hidden[2], self.latent_size)  # mu
        self.fc32 = nn.Linear(self.hidden[2], self.latent_size)  # log_var
        self.bn1 = nn.BatchNorm1d(1)
        self.bn2 = nn.BatchNorm1d(1)
        self.bn3 = nn.BatchNorm1d(1)

        self.fc4 = nn.Linear(self.hidden[2] * 2, self.hidden[1])
        self.fc5 = nn.Linear(self.hidden[1], self.hidden[0])
        self.fc6 = nn.Linear(self.hidden[0], self.input_size)
        self.bn4 = nn.BatchNorm1d(1)
        self.bn5 = nn.BatchNorm1d(1)
        self.bn6 = nn.BatchNorm1d(1)

    def encoder(self, x):
        # x: (batch, input_size) -> (batch, 1, input_size), so each signal is a
        # single LSTM time step with input_size features.
        x = torch.unsqueeze(x, 1)
        x, _ = self.encode(x)
        x = self.actv(x)
        x = self.fc1(x)
        x = self.actv(x)
        x = self.fc2(x)
        x = self.actv(x)

        mu = self.fc31(x)
        log_var = self.fc32(x)

        return mu, log_var

    def decoder(self, z):
        z, _ = self.decode(z)
        z = self.bn_decode(z)
        z = self.actv(z)
        z = self.fc4(z)
        z = self.bn4(z)
        z = self.fc5(z)
        z = self.bn5(z)
        z = self.fc6(z)
        z = self.bn6(z)
        z = torch.sigmoid(z)  # keep the reconstruction in (0, 1) for BCE

        return torch.squeeze(z)

    def sampling(self, mu, log_var):
        # Reparameterization trick: z = mu + eps * sigma.
        std = torch.exp(0.5 * log_var)
        eps = torch.randn_like(std)

        return mu + eps * std

    def forward(self, x):
        mu, log_var = self.encoder(x.view(-1, self.input_size))
        z = self.sampling(mu, log_var)
        z = self.decoder(z)

        return z, mu, log_var
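
As a quick sanity check (this snippet is not part of my training code; the batch size of 8 and num_layers=1 are only there to keep it fast), the model can be run on a random batch to confirm the output shapes:

vae = LSTM_VAE(num_layers=1)
dummy = torch.rand(8, 3000)                  # 8 signals of length 3000, values in [0, 1]
recon, mu, log_var = vae(dummy)
print(recon.shape, mu.shape, log_var.shape)  # (8, 3000), (8, 1, 64), (8, 1, 64)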

Loss function and training code:

def lossF(recon_x, x, mu, logvar, input_size):
    # Reconstruction term (BCE summed over all elements) plus the KL divergence
    # between the approximate posterior and the unit Gaussian prior.
    BCE = F.binary_cross_entropy(recon_x, x.view(-1, input_size), reduction='sum')
    KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())

    return BCE + KLD

# model, device, opt (hyperparameters) and train_set are defined elsewhere.
optim = torch.optim.Adam(model.parameters(), lr=opt.lr)

for epoch in range(opt.epoch):
    train_loss = 0
    for batch_idx, data in enumerate(train_set):
        data = data.to(device)
        optim.zero_grad()
        recon_x, mu, logvar = model(data)
        loss = lossF(recon_x, data, mu, logvar, opt.input_size)
        loss.backward()
        train_loss += loss.item()
        optim.step()
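
To see what the optimizer is actually trading off, a debugging variant of lossF that reports the two terms separately can be called inside the batch loop (just a sketch, not part of my training code):

def loss_terms(recon_x, x, mu, logvar, input_size):
    # Same two terms as lossF, returned separately for logging.
    BCE = F.binary_cross_entropy(recon_x, x.view(-1, input_size), reduction='sum')
    KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return BCE.item(), KLD.item()

# inside the batch loop, next to the lossF call:
# bce, kld = loss_terms(recon_x, data, mu, logvar, opt.input_size)
# print(f"epoch {epoch} batch {batch_idx}: BCE={bce:.1f} KLD={kld:.1f}")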

I built my code from someone else's example and changed only a few parameters. I have since rebuilt the code, changed the dataset, and updated the parameters, but nothing has helped. If you have any suggestions for solving this problem, please let me know.

I found the cause of the problem. It turns out that the decoder pushes its outputs into the 0.4 to 0.6 range to keep the BCE loss stable. With non-binary targets, the BCE loss cannot be 0 even when the prediction is exactly correct, and the loss value is a nonlinear function of the output range. The simplest way for the network to reduce the loss is therefore to output 0.5 everywhere, which is exactly what my model did. To avoid this failure mode, I standardized my data and added some outlier data to avoid the BCE problem. The VAE certainly is a complex network.
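
A small numeric check makes the first point concrete (illustration only, with a made-up target value of 0.3): BCE is minimized when the prediction equals the target, but that minimum is not 0, and a constant 0.5 prediction is never far above it, so the decoder lowers its risk by emitting values near 0.5 for everything.

import torch
import torch.nn.functional as F

target = torch.full((1000,), 0.3)                 # a signal whose samples are all 0.3
perfect = F.binary_cross_entropy(target, target)  # prediction == target, yet loss > 0
lazy = F.binary_cross_entropy(torch.full_like(target, 0.5), target)
print(perfect.item(), lazy.item())                # roughly 0.611 vs 0.693 per sample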