PyTorch Softmax Output Doesn't Sum to 1

Question

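I started getting negative KL divergences between a target Dirichlet distribution and my model's output Dirichlet distribution. Someone online suggested that this might indicate that the parameters of the Dirichlet distribution don't sum to 1. I thought this was ridiculous, since the model's output is passed through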

output = F.softmax(self.weights(x), dim=1)

But after looking into it more closely, I found that torch.all(torch.sum(output, dim=1) == 1.) returns False! Looking at the problematic row, I see that it is tensor([0.0085, 0.9052, 0.0863], grad_fn=<SelectBackward>). But torch.sum(output[5]) == 1. produces tensor(False).

What am I misusing about softmax such that the output probabilities don't sum to 1?

This is PyTorch version 1.2.0+cpu. The full model is copied below:

import torch
import torch.nn as nn
import torch.nn.functional as F



def assert_no_nan_no_inf(x):
    assert not torch.isnan(x).any()
    assert not torch.isinf(x).any()


class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Linear(
            in_features=2,
            out_features=3)

    def forward(self, x):
        output = F.softmax(self.weights(x), dim=1)
        assert torch.all(torch.sum(output, dim=1) == 1.)
        assert_no_nan_no_inf(x)
        return output

Answer 1

This is most likely floating-point numerical error caused by finite precision.
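As a rough illustration of the finite-precision effect, here is a minimal sketch in plain Python (no PyTorch involved; the variable names are made up for this example). It only shows that a sum which is mathematically 1 can fail a strict equality test while still passing a tolerance-based one.

```python
import math

# Ten copies of 0.1 add up to 1 mathematically, but 0.1 has no exact
# binary representation, so rounding error accumulates in the running sum.
values = [0.1] * 10
total = sum(values)

exactly_one = (total == 1.0)             # strict equality check
close_to_one = math.isclose(total, 1.0)  # tolerance-based check (default rel_tol=1e-09)

print(total, exactly_one, close_to_one)
```

The same thing happens with softmax outputs: each probability is rounded to the nearest representable float, so the row sum can land a tiny distance away from 1.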

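Instead of checking for strict equality, you should check that the mean squared error, or some similar measure, falls within an acceptable range.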

For example, I get torch.norm(output.sum(dim=1)-1)/N less than 1e-8, where N is the batch size.
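One way the tolerance check could look is sketched below. The random logits, the batch size of 8, and the 1e-6 thresholds are assumptions chosen for illustration, not values from the question; torch.allclose is a standard PyTorch function, but whether 1e-6 is an appropriate tolerance depends on your dtype and model.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for self.weights(x): a batch of 8 rows with 3 logits each (assumed shape).
logits = torch.randn(8, 3)
output = F.softmax(logits, dim=1)
row_sums = output.sum(dim=1)

# Tolerance-based replacement for torch.all(row_sums == 1.)
sums_ok = torch.allclose(row_sums, torch.ones_like(row_sums), atol=1e-6)

# The error measure from the answer: norm of the deviation divided by batch size N.
N = output.shape[0]
mean_error = torch.norm(row_sums - 1) / N

print(sums_ok, mean_error.item())
```

Inside forward, the strict assertion could then be swapped for something like assert torch.allclose(output.sum(dim=1), torch.ones(output.shape[0]), atol=1e-6), again with the tolerance treated as a tunable assumption.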

softmax

pytorch