Suppress use of Softmax in CrossEntropyLoss for PyTorch Neural Net

Question

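I know that there is no need to use an nn.Softmax() function in the output layer of a neural net when nn.CrossEntropyLoss is the loss function.

However, I need to do exactly that. Is there a way to suppress the softmax that nn.CrossEntropyLoss applies internally and instead use nn.Softmax() on the output layer of the neural network itself?

Motivation: I am using the shap package to analyze feature influence afterwards, and I can only pass my trained model as input. The outputs then make no sense, because I am looking at unbounded values instead of probabilities.

Example: instead of -69.36 as the output value for one class of my model, I want a value between 0 and 1, with the values summing to 1 across all classes. Since I can't alter the output afterwards, it needs to look like this already during training.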

Answer 1

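You can use nn.NLLLoss(). nn.CrossEntropyLoss computes the log softmax of the input scores and then the negative log-likelihood loss. If you already have log probabilities, you can use nn.NLLLoss() directly.

Here is an example from PyTorch's documentation: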

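import torch
import torch.nn as nn

m = nn.LogSoftmax(dim=1)  # turns raw scores into log-probabilities
loss = nn.NLLLoss()       # expects log-probabilities as input
input = torch.randn(3, 5, requires_grad=True)  # 3 samples, 5 classes
target = torch.tensor([1, 0, 4])               # true class index per sample
output = loss(m(input), target)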

Answer 2

The documentation of nn.CrossEntropyLoss says:

This criterion combines nn.LogSoftmax() and nn.NLLLoss() in one single class.
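The quoted sentence can be checked numerically. The following is only a minimal sketch on random data (the tensor shapes and the seed are arbitrary choices, not something from the documentation): it computes the loss both ways and compares the results.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(3, 5)        # raw, unbounded scores: 3 samples, 5 classes
target = torch.tensor([1, 0, 4])  # true class index per sample

# One step: CrossEntropyLoss applied directly to the raw scores.
ce = nn.CrossEntropyLoss()(logits, target)

# Two steps: LogSoftmax first, then NLLLoss on the log-probabilities.
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), target)

same = torch.allclose(ce, nll)
print(same)
```

Both paths should agree up to floating-point tolerance, which is why moving the log softmax into the network and switching the criterion to NLLLoss leaves the training objective unchanged.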

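I suggest you stick with CrossEntropyLoss as the loss criterion. You can still convert the output of your model into probability values by applying the softmax function.

Note that you can always post-process the output values of your model; you do not need to change the loss criterion for that.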
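One way to do this without retraining is to wrap the trained network in a second module that appends Softmax. This is a sketch under assumptions: `trained_model` is a hypothetical stand-in (a single linear layer) for a network trained with CrossEntropyLoss, and whether shap accepts the wrapped module depends on the explainer being used, which is not verified here.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a network already trained with nn.CrossEntropyLoss.
trained_model = nn.Linear(4, 3)

# Append Softmax after training; the trained weights are reused unchanged.
prob_model = nn.Sequential(trained_model, nn.Softmax(dim=1))
prob_model.eval()

x = torch.randn(2, 4)  # 2 samples, 4 features
with torch.no_grad():
    probs = prob_model(x)

row_sums = probs.sum(dim=1)  # each row should sum to 1
print(probs)
print(row_sums)
```

The wrapper is itself an nn.Module, so it can be passed anywhere a trained model is expected, while training keeps the numerically stable CrossEntropyLoss on raw scores.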

But if you still want to use Softmax() in your network, you can use NLLLoss() as the loss criterion; just apply log() before feeding the model's output to the criterion function. Similarly, if you use LogSoftmax in your network instead, you can apply exp() to get the probability values.
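The LogSoftmax variant might look like the sketch below. The toy network, input sizes, and targets are made up for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy network whose output layer is LogSoftmax.
model = nn.Sequential(nn.Linear(4, 3), nn.LogSoftmax(dim=1))
x = torch.randn(2, 4)          # 2 samples, 4 features
target = torch.tensor([0, 2])  # true class index per sample

log_probs = model(x)                    # log-probabilities
loss = nn.NLLLoss()(log_probs, target)  # NLLLoss consumes them directly
loss.backward()

# exp() recovers probabilities in [0, 1] that sum to 1 per row.
probs = torch.exp(log_probs.detach())
print(probs.sum(dim=1))
```

With this layout the model outputs log-probabilities rather than probabilities, so it only helps when the consumer of the model can apply exp() itself.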

Update:

To apply log() to the Softmax output, do the following:

torch.log(prob_scores + 1e-20)

Adding a very small number (1e-20) to prob_scores avoids the log(0) problem.
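Put together, a single training step with Softmax inside the network could be sketched as follows. The toy network, random data, optimizer, and learning rate are assumptions chosen for illustration; the `reference` value is only there to show that the loss matches CrossEntropyLoss on the raw scores.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy network whose output layer is Softmax, as the question asks.
model = nn.Sequential(nn.Linear(4, 3), nn.Softmax(dim=1))
criterion = nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)               # 8 samples, 4 features
target = torch.randint(0, 3, (8,))  # random class labels in {0, 1, 2}

prob_scores = model(x)  # probabilities; each row sums to 1
loss = criterion(torch.log(prob_scores + 1e-20), target)

# Reference value: CrossEntropyLoss on the raw scores of the linear layer.
reference = nn.CrossEntropyLoss()(model[0](x), target)

optimizer.zero_grad()
loss.backward()
optimizer.step()

print(loss.item(), reference.item())
```

The two values should agree closely on well-behaved inputs. Taking log() of a Softmax output is generally less numerically stable than LogSoftmax, which is the reason for the small constant.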

Tags: python, neural-network, softmax, cross-entropy, pytorch