Torchvision 0.2.1 transforms.Normalize does not work as expected
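I am trying to write new code using PyTorch. In this code, I use torchvision's datasets to load the dataset (CIFAR10). I define two transforms, ToTensor() and Normalize(). After normalization I expected the data in the dataset to be between 0 and 1, but the maximum value is still 255. I also added a print statement inside transforms.py (Lib\site-packages\torchvision\transforms\transforms.py), and running the code does not print it either. I am not sure what is happening. Every page I visited on the internet shows almost the same usage as mine. For example, some of the sites I visited: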
https://github.com/adventuresinML/adventures-in-ml-code/blob/master/pytorch_nn.py
https://github.com/pytorch/tutorials/blob/master/beginner_source/blitz/cifar10_tutorial.py
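My code is below. It reads the dataset with and without Normalize and then prints a few statistics. The printed minimum and maximum indicate whether the data has been normalized.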
import torchvision as tv
import numpy as np
dataDir = r'D:\general\ML_DL\datasets\CIFAR'  # raw string so the backslashes are not read as escapes
trainTransform = tv.transforms.Compose([tv.transforms.ToTensor()])
trainSet = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
print (trainSet.train_data.mean(axis=(0,1,2))/255)
print (trainSet.train_data.min())
print (trainSet.train_data.max())
print (trainSet.train_data.shape)
trainTransform = tv.transforms.Compose([tv.transforms.ToTensor(), tv.transforms.Normalize((0.4914, 0.4822, 0.4466), (0.247, 0.243, 0.261))])
trainSet = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
print (trainSet.train_data.mean(axis=(0,1,2))/255)
print (trainSet.train_data.min())
print (trainSet.train_data.max())
print (trainSet.train_data.shape)
The output looks like this:
[ 0.49139968 0.48215841 0.44653091]
0
255
(50000, 32, 32, 3)
[ 0.49139968 0.48215841 0.44653091]
0
255
(50000, 32, 32, 3)
Please help me understand this better, since most of the transforms I have tried end up with similar results, e.g. Grayscale and CenterCrop.
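So, in your code you have laid out a plan for how you want your data to be processed. You have created a data pipeline through which your data will flow and have multiple transforms applied to it.
However, you forgot to call torch.utils.data.DataLoader. The transforms are not applied until samples are actually fetched from the dataset, which is what the DataLoader does. trainSet.train_data is the raw uint8 array and never passes through the transforms, which is why it still shows 0 to 255. You can read more about it here.
Now, when we add the above to your code, it looks like this -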
import torch
import torchvision as tv

trainTransform = tv.transforms.Compose([tv.transforms.ToTensor(),
                                        tv.transforms.Normalize((0.4914, 0.4822, 0.4466), (0.247, 0.243, 0.261))])
trainSet = tv.datasets.CIFAR10(root=dataDir, train=True,
                               download=False, transform=trainTransform)
# Note: the variable is trainSet (capital S), matching the line above
dataloader = torch.utils.data.DataLoader(trainSet, batch_size=32, shuffle=False, num_workers=4)
and print the images as follows -
images, labels = next(iter(dataloader))
print(images)
print(images.max())
print(images.min())
we get tensors with the transforms applied.
A snippet of the output:
[[ 1.8649, 1.8198, 1.8348, ..., 0.3924, 0.3774, 0.2572],
[ 1.9701, 1.9550, 1.9851, ..., 0.7230, 0.6929, 0.6629],
[ 2.0001, 1.9550, 2.0001, ..., 0.7831, 0.7530, 0.7079],
...,
[-0.8096, -1.0049, -1.0350, ..., -1.3355, -1.3655, -1.4256],
[-0.7796, -0.8697, -0.9749, ..., -1.2754, -1.4557, -1.5609],
[-0.7645, -0.7946, -0.9298, ..., -1.4106, -1.5308, -1.5909]]]])
tensor(2.1309)
tensor(-1.9895)
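To illustrate why trainSet.train_data still reports 0 and 255 while the batches from the DataLoader are normalized, here is a minimal sketch in plain Python. ToyDataset and to_unit_range are hypothetical stand-ins written for this example, not torchvision classes. They only mimic the general pattern of keeping the raw array as an attribute and applying the transform when an item is fetched.

```python
# Hypothetical toy dataset: mimics the "store raw, transform on access" pattern.
class ToyDataset:
    def __init__(self, raw, transform=None):
        self.data = raw              # raw values, stored untouched
        self.transform = transform   # only remembered here, not applied

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        sample = self.data[index]
        if self.transform is not None:
            sample = self.transform(sample)  # applied lazily, per fetched item
        return sample


def to_unit_range(pixels):
    # Stand-in for the scaling step of ToTensor: 0..255 -> 0.0..1.0
    return [p / 255 for p in pixels]


toy = ToyDataset([[0, 59, 255], [128, 64, 32]], transform=to_unit_range)

print(max(max(row) for row in toy.data))  # raw storage is unchanged: 255
print(max(toy[0]))                        # fetched item is transformed: 1.0
```

Inspecting the stored attribute bypasses the transform entirely. Only indexing the dataset, which is what a loader does internally, runs it.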
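Second, transforms.Normalize(mean, std) applies input[channel] = (input[channel] - mean[channel]) / std[channel], so with the mean and standard deviation we supplied, the transformed values cannot fall in the range (0,1). If you want values between (-1,1), you can use the following -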
trainTransform = tv.transforms.Compose([tv.transforms.ToTensor(),
tv.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
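The effect of the formula on the value range can be checked by hand. The sketch below is plain Python, and normalized_range is a helper name made up for this example. It assumes the input is already in [0, 1], as it is after ToTensor(), and computes the smallest and largest values the formula can produce for a given mean and std.

```python
def normalized_range(mean, std):
    """Bounds of (x - mean[c]) / std[c] over all channels c, for x in [0, 1]."""
    lows = [(0.0 - m) / s for m, s in zip(mean, std)]
    highs = [(1.0 - m) / s for m, s in zip(mean, std)]
    return min(lows), max(highs)


# Per-channel CIFAR10 statistics used in the question
cifar_lo, cifar_hi = normalized_range((0.4914, 0.4822, 0.4466), (0.247, 0.243, 0.261))
print(round(cifar_lo, 4), round(cifar_hi, 4))  # -1.9895 2.1309

# mean = std = 0.5 maps [0, 1] onto [-1, 1]
half_lo, half_hi = normalized_range((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
print(half_lo, half_hi)  # -1.0 1.0
```

The bounds for the CIFAR10 statistics agree with the tensor(2.1309) and tensor(-1.9895) printed above, which suggests that batch contained a pixel at 0 in the first channel and a pixel at 1 in the second.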
Hope this helps! :)
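It looks like when the data is read and converted to a tensor without Normalize, the values are automatically scaled to the range 0 to 1. When we apply Normalize, it applies the formula you mentioned to this data in the range 0 to 1. Below is the modified working code, with some print statements that show when the "__call__" function in the Normalize class is invoked and how the values are normalized. The first value is 0.2314. Normalizing with 0.5 makes it (0.2314-0.5)/0.5 = -0.5372. The first and second printouts of the tensor values show this.
Code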
import torchvision as tv
import numpy as np
import torch.utils.data as data
dataDir = r'D:\general\ML_DL\datasets\CIFAR'  # raw string so the backslashes are not read as escapes
trainTransform = tv.transforms.Compose([tv.transforms.ToTensor()])
trainSet = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
print ('Approach1 Step1 done')
dataloader = data.DataLoader(trainSet, batch_size=1, shuffle=False, num_workers=0)
print ('Approach1 Step2 done')
images, labels = next(iter(dataloader))
print ('Approach1 Step3 done')
print (images[0,0])
print (images.max())
print (images.min())
print (images.mean())
#trainTransform = tv.transforms.Compose([tv.transforms.ToTensor(), tv.transforms.Normalize((0.4914, 0.4822, 0.4466), (0.247, 0.243, 0.261))])
trainTransform = tv.transforms.Compose([tv.transforms.ToTensor(), tv.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainSet = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
print ('Approach2 Step1 done')
dataloader = data.DataLoader(trainSet, batch_size=1, shuffle=False, num_workers=0)
print ('Approach2 Step2 done')
images, labels = next(iter(dataloader))
print ('Approach2 Step3 done')
print (images[0,0])
print (images.max())
print (images.min())
print (images.mean())
The output of the above code is:
Approach1 Step1 done
Approach1 Step2 done
Approach1 Step3 done
tensor([[0.2314, 0.1686, 0.1961, ..., 0.6196, 0.5961, 0.5804],
[0.0627, 0.0000, 0.0706, ..., 0.4824, 0.4667, 0.4784],
[0.0980, 0.0627, 0.1922, ..., 0.4627, 0.4706, 0.4275],
...,
[0.8157, 0.7882, 0.7765, ..., 0.6275, 0.2196, 0.2078],
[0.7059, 0.6784, 0.7294, ..., 0.7216, 0.3804, 0.3255],
[0.6941, 0.6588, 0.7020, ..., 0.8471, 0.5922, 0.4824]])
tensor(1.)
tensor(0.)
tensor(0.4057)
Approach2 Step1 done
Approach2 Step2 done
__call__ inside Normalization is called
Approach2 Step3 done
tensor([[-0.5373, -0.6627, -0.6078, ..., 0.2392, 0.1922, 0.1608],
[-0.8745, -1.0000, -0.8588, ..., -0.0353, -0.0667, -0.0431],
[-0.8039, -0.8745, -0.6157, ..., -0.0745, -0.0588, -0.1451],
...,
[ 0.6314, 0.5765, 0.5529, ..., 0.2549, -0.5608, -0.5843],
[ 0.4118, 0.3569, 0.4588, ..., 0.4431, -0.2392, -0.3490],
[ 0.3882, 0.3176, 0.4039, ..., 0.6941, 0.1843, -0.0353]])
tensor(1.)
tensor(-1.)
tensor(-0.1886)
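The two printouts can be tied together with a little arithmetic. The sketch below is plain Python and assumes the raw byte behind the first value is 59, which is an inference from 59/255 rounding to 0.2314 and not something read from the dataset. It also shows why the tensor prints -0.5373 while the hand calculation from the rounded 0.2314 gives -0.5372.

```python
raw = 59                           # assumed uint8 pixel value (inferred, see above)
scaled = raw / 255                 # the 0..255 -> 0..1 scaling done by ToTensor()
normalized = (scaled - 0.5) / 0.5  # Normalize with mean 0.5 and std 0.5

print(round(scaled, 4))       # 0.2314
print(round(normalized, 4))   # -0.5373

# Starting from the already-rounded 0.2314 loses a little precision
from_rounded = (0.2314 - 0.5) / 0.5
print(round(from_rounded, 4))  # -0.5372
```

The small difference in the last digit comes only from rounding the intermediate value before applying the formula.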