Why do numpy and pytorch give different results after mean and variance normalization?

Question

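I am working on a problem in which a matrix has to be mean-var normalized row by row. The normalization also has to be applied after splitting each row into mini-batches. The code seems to work with Numpy, but not with Pytorch (which is required for training). The Pytorch and Numpy results appear to differ. Any help would be greatly appreciated.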

Example code:

import numpy as np
import torch


def normalize(x, bsize, eps=1e-6):
    nc = x.shape[1]
    if nc % bsize != 0:
        raise ValueError('Number of columns must be a multiple of bsize')
    x = x.reshape(-1, bsize)
    m = x.mean(1).reshape(-1, 1)
    s = x.std(1).reshape(-1, 1)
    n = (x - m) / (eps + s)
    n = n.reshape(-1, nc)
    return n

# numpy
a = np.float32(np.random.randn(8, 8))
n1 = normalize(a, 4)
# torch
b = torch.tensor(a)
n2 = normalize(b, 4)
n2 = n2.numpy()

print(abs(n1-n2).max())

Answer 1

In the first example you call normalize with a, a numpy.ndarray, while in the second you call normalize with b, a torch.Tensor.

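According to the documentation page for torch.std, Bessel's correction is used by default to measure the standard deviation. numpy.ndarray.std, on the other hand, defaults to ddof=0, i.e. no correction. As such, the default behavior of numpy.ndarray.std and torch.Tensor.std differs.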

If unbiased is True, Bessel’s correction will be used. Otherwise, the sample deviation is calculated, without any correction.

torch.std(input, dim, unbiased, keepdim=False, *, out=None) → Tensor
Parameters

input (Tensor) – the input tensor.

unbiased (bool) – whether to use Bessel’s correction (δN = 1).
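To make the effect of the correction concrete, here is a minimal NumPy-only sketch. The sample values are made up for illustration. It relies on the fact that numpy's `ddof=1` divides by N - 1, which is the same estimator that `unbiased=True` selects in torch, so the two estimates differ by a constant factor of sqrt(N / (N - 1)).

```python
import numpy as np

# Illustrative sample (hypothetical values, not taken from the question).
x = np.array([1.0, 2.0, 4.0, 7.0])
N = x.size

# Population estimate (divide by N): NumPy's default, ddof=0.
pop_std = x.std()

# Bessel-corrected estimate (divide by N - 1): ddof=1,
# the counterpart of unbiased=True in torch.std.
bessel_std = x.std(ddof=1)

# The two estimates differ by the constant factor sqrt(N / (N - 1)).
ratio = bessel_std / pop_std
print(pop_std, bessel_std, ratio)
```

With mini-batches of only 4 elements, as in the question, that factor is sqrt(4/3), roughly 1.15, which is why the normalized outputs differ so visibly.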

You can try it yourself:

>>> a.std(), b.std(unbiased=True), b.std(unbiased=False)
(0.8364538, tensor(0.8942), tensor(0.8365))
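One way to make the question's normalize independent of either library's default is to compute the standard deviation by hand. The following is a sketch rather than the only possible fix: it assumes the population estimate (divide by N) is the one wanted, and it is checked here against NumPy only.

```python
import numpy as np


def normalize(x, bsize, eps=1e-6):
    """Row-wise mean/variance normalization over chunks of `bsize` columns.

    The standard deviation is computed by hand as the population value
    (divide by N), so the result does not depend on a library's default
    for Bessel's correction.
    """
    nc = x.shape[1]
    if nc % bsize != 0:
        raise ValueError('Number of columns must be a multiple of bsize')
    x = x.reshape(-1, bsize)
    m = x.mean(1).reshape(-1, 1)
    # Population standard deviation: sqrt(mean((x - m)^2)).
    s = (((x - m) ** 2).mean(1) ** 0.5).reshape(-1, 1)
    n = (x - m) / (eps + s)
    return n.reshape(-1, nc)


rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8)).astype(np.float32)
n1 = normalize(a, 4)

# Reference: NumPy's own std, which divides by N by default (ddof=0).
chunks = a.reshape(-1, 4)
ref = (chunks - chunks.mean(1, keepdims=True)) / (1e-6 + chunks.std(1, keepdims=True))
max_diff = float(np.abs(n1 - ref.reshape(-1, 8)).max())
print(max_diff)
```

Because this version uses only mean, reshape and elementwise arithmetic, it should accept a torch.Tensor as well. Alternatively, keep the original function and make the two calls agree explicitly: x.std(1, unbiased=False) on the torch side, or x.std(1, ddof=1) on the numpy side.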

numpy

normalization

pytorch