Why was the tensor size not changed?
I made a toy CNN model:
import torch.nn as nn

class Test(nn.Module):
    def __init__(self):
        super(Test, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 300, 3),
            nn.Conv2d(300, 500, 3),
            nn.Conv2d(500, 1000, 3),
        )
        self.fc = nn.Linear(3364000, 1)

    def forward(self, x):
        out = self.conv(x)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out
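For context (a side note, not part of the original question): each 3x3 convolution without padding shrinks each spatial dimension by 2, so the input goes 64 -> 62 -> 60 -> 58, which is where the fc input size comes from:

# 1000 channels x 58 x 58 spatial positions = flattened fc input features
print(1000 * 58 * 58)  # 3364000, matching nn.Linear(3364000, 1)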
Then, I checked the model summary with this code:
from torchsummary import summary as summary_  # assumed source of summary_

model = Test()
model.to('cuda')

for param in model.parameters():
    print(param.dtype)
    break

summary_(model, (3, 64, 64))
I got the following result:
torch.float32
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [-1, 300, 62, 62]           8,400
            Conv2d-2          [-1, 500, 60, 60]       1,350,500
            Conv2d-3         [-1, 1000, 58, 58]       4,501,000
            Linear-4                    [-1, 1]       3,364,001
================================================================
Total params: 9,223,901
Trainable params: 9,223,901
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.05
Forward/backward pass size (MB): 48.20
Params size (MB): 35.19
Estimated Total Size (MB): 83.43
----------------------------------------------------------------
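As a sanity check (this snippet is mine, not from the original post, and reuses the model built above), the reported Params size is just the parameter count times the per-element byte size, 4 bytes for float32:

n_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
print(n_bytes / 1024**2)  # ~35.19 MB: 9,223,901 params x 4 bytes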
I want to reduce the model size because I want to increase the batch size.
So, I changed torch.float32 -> torch.float16 via NVIDIA/apex:
import torch.optim as optim
from apex import amp  # NVIDIA apex mixed-precision utilities

model = Test()
model.to('cuda')

opt_level = 'O3'
optimizer = optim.Adam(model.parameters(), lr=0.001)
model, optimizer = amp.initialize(model, optimizer, opt_level=opt_level)

for param in model.parameters():
    print(param.dtype)
    break

summary_(model, (3, 64, 64))
Selected optimization level O3: Pure FP16 training.
Defaults for this optimization level are:
enabled : True
opt_level : O3
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : False
master_weights : False
loss_scale : 1.0
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled : True
opt_level : O3
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : False
master_weights : False
loss_scale : 1.0
torch.float16
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [-1, 300, 62, 62]           8,400
            Conv2d-2          [-1, 500, 60, 60]       1,350,500
            Conv2d-3         [-1, 1000, 58, 58]       4,501,000
            Linear-4                    [-1, 1]       3,364,001
================================================================
Total params: 9,223,901
Trainable params: 9,223,901
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.05
Forward/backward pass size (MB): 48.20
Params size (MB): 35.19
Estimated Total Size (MB): 83.43
----------------------------------------------------------------
So the torch.dtype changed from torch.float32 to torch.float16. However, Params size (MB): 35.19 did not change. Why does this happen? Thank you.
Mixed precision does not mean that your model becomes half of its original size. By default, the parameters are kept in float32 dtype and are automatically cast to float16 during certain operations of neural network training. This also applies to the input data.

torch.cuda.amp provides the functionality to perform this automatic conversion from float32 to float16 during certain training operations, such as convolutions, while your model size stays the same. Reducing the model size is called quantization, and it is different from mixed-precision training.
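Here is a minimal sketch of that behavior with torch.cuda.amp (my own illustration, assuming a CUDA device is available): the weights stay float32, and only the convolution output comes back as float16:

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 300, 3).to('cuda')
x = torch.randn(1, 3, 64, 64, device='cuda')

with torch.cuda.amp.autocast():
    out = conv(x)

print(next(conv.parameters()).dtype)  # torch.float32 -- weights unchanged
print(out.dtype)                      # torch.float16 -- op ran in half precision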
You can read more about mixed-precision training on NVIDIA's blog and PyTorch's blog.
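If the goal is actually shrinking the stored weights, a minimal sketch of dynamic quantization (my illustration, not the only approach; it runs on CPU) would look like:

import torch
import torch.nn as nn

fp32_model = nn.Linear(3364000, 1)
int8_model = torch.quantization.quantize_dynamic(
    fp32_model, {nn.Linear}, dtype=torch.qint8
)
# The quantized Linear stores int8 weights, roughly 1/4 the float32 memory.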