Batch inference of softmax does not sum to 1

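I am using the REINFORCE algorithm with PyTorch. I noticed that the batch inference/predictions of my simple network with Softmax do not sum to 1 (not even close to 1). I am attaching minimal working code so that you can reproduce it. What am I missing here?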

import numpy as np
import torch

obs_size = 9
HIDDEN_SIZE = 9
n_actions = 2

np.random.seed(0)

model = torch.nn.Sequential(
        torch.nn.Linear(obs_size, HIDDEN_SIZE),
        torch.nn.ReLU(),
        torch.nn.Linear(HIDDEN_SIZE, n_actions),
        torch.nn.Softmax(dim=0)
    )

state_transitions = np.random.rand(3, obs_size)

state_batch = torch.Tensor(state_transitions)
pred_batch = model(state_batch)  # WRONG PREDICTIONS!
print('wrong predictions:\n', *pred_batch.detach().numpy())
# [0.34072137 0.34721774] [0.30972624 0.30191955] [0.3495524 0.3508627]
# DOES NOT SUM TO 1 !!!

pred_batch = [model(s).detach().numpy() for s in state_batch]  # CORRECT PREDICTIONS
print('correct predictions:\n', *pred_batch)
# [0.5955179  0.40448207] [0.6574412  0.34255883] [0.624833   0.37516695]
# DOES SUM TO 1 AS EXPECTED

Although PyTorch lets us get away with it, we don’t actually provide an input with the right dimensionality. We have a model that takes one input and produces one output, but PyTorch nn.Module and its subclasses are designed to do so on multiple samples at the same time. To accommodate multiple samples, modules expect the zeroth dimension of the input to be the number of samples in the batch.
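As a rough illustration of the batch-dimension convention described above, the sketch below passes the same sample through a linear layer once as a 1-D tensor and once as a batch of one. The layer sizes mirror the question (9 inputs, 2 outputs); the variable names `single` and `batched` are only illustrative.

```python
import torch

torch.manual_seed(0)
linear = torch.nn.Linear(9, 2)    # 9 input features, 2 outputs, as in the question

single = torch.rand(9)            # one sample, no batch dimension
batched = single.unsqueeze(0)     # shape (1, 9): a batch containing that one sample

out_single = linear(single)       # shape (2,)
out_batched = linear(batched)     # shape (1, 2): dimension 0 is the batch

print(out_single.shape, out_batched.shape)
```

The linear layer gives the same values either way. It is the softmax layer that depends on which dimension it is told to normalize over.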

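Your model works on each individual sample, which is a fine implementation. However, you specified the wrong dimension for the softmax (across the batch rather than across the variables), so when the input has a batch dimension, the softmax is computed across samples instead of within each sample.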
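A small sketch of the difference, using random logits in place of the question's network output (the tensor `logits` and its shape are assumptions chosen to match a batch of 3 samples with 2 actions):

```python
import torch

torch.manual_seed(0)
logits = torch.randn(3, 2)  # 3 samples in the batch, 2 actions each

across_batch = torch.softmax(logits, dim=0)   # normalizes each column over the 3 samples
within_sample = torch.softmax(logits, dim=1)  # normalizes each row over the 2 actions

col_sums = across_batch.sum(dim=0)   # each of the 2 columns sums to 1
row_sums = within_sample.sum(dim=1)  # each of the 3 rows sums to 1

print("dim=0, column sums:", col_sums)
print("dim=0, row sums:   ", across_batch.sum(dim=1))
print("dim=1, row sums:   ", row_sums)
```

This matches the "wrong" output in the question: the first entries of the three rows (0.34072137, 0.30972624, 0.3495524) add up to 1, and so do the second entries.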

nn.Softmax requires us to specify the dimension along which the softmax function is applied:

softmax = nn.Softmax(dim=1)

In this case, we have two input vectors in two rows (just like when we work with batches), so we initialize nn.Softmax to operate along dimension 1.

Change torch.nn.Softmax(dim=0) to torch.nn.Softmax(dim=1) to get the proper results.
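A minimal sketch of the corrected model follows, based on the code from the question. One assumption differs from the fix stated above: it uses dim=-1 (the last dimension) rather than dim=1. For a batched input of shape (3, 2) the two are equivalent, but dim=1 is out of range for the 1-D output produced when a single sample is passed without a batch dimension, as in the question's per-sample loop.

```python
import numpy as np
import torch

obs_size, HIDDEN_SIZE, n_actions = 9, 9, 2
np.random.seed(0)

model = torch.nn.Sequential(
    torch.nn.Linear(obs_size, HIDDEN_SIZE),
    torch.nn.ReLU(),
    torch.nn.Linear(HIDDEN_SIZE, n_actions),
    torch.nn.Softmax(dim=-1),  # assumption: last dim, so batched and 1-D inputs both work
)

state_batch = torch.Tensor(np.random.rand(3, obs_size))

with torch.no_grad():
    batch_pred = model(state_batch)                             # shape (3, 2)
    single_pred = torch.stack([model(s) for s in state_batch])  # per-sample inference

print("row sums:", batch_pred.sum(dim=1))
print("batch matches per-sample:", torch.allclose(batch_pred, single_pred, atol=1e-6))
```

If the model is only ever called on batches, torch.nn.Softmax(dim=1) behaves the same way.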