How do I check the output gradient of each layer in PyTorch in my code?
I am learning PyTorch,
and I have a question about my code: how do I check the output gradient of each layer?
My code is below.
# Import the necessary libraries
import numpy as np
import torch
import time
# Loading the Fashion-MNIST dataset
from torchvision import datasets, transforms
# Get GPU Device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))
                                ])
# Download and load the training data
trainset = datasets.FashionMNIST('MNIST_data/', download = True, train = True, transform = transform)
testset = datasets.FashionMNIST('MNIST_data/', download = True, train = False, transform = transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size = 32, shuffle = True, num_workers=4)
testloader = torch.utils.data.DataLoader(testset, batch_size = 32, shuffle = True, num_workers=4)
# Examine a sample
dataiter = iter(trainloader)
images, labels = next(dataiter)  # built-in next() works on any iterator
# Define the network architecture
from torch import nn, optim
import torch.nn.functional as F
model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 10),
                      nn.LogSoftmax(dim = 1)
                      )
model.to(device)
# Define the loss
criterion = nn.CrossEntropyLoss()
# Define the optimizer
optimizer = optim.Adam(model.parameters(), lr = 0.001)
# Define the epochs
epochs = 5
train_losses, test_losses = [], []
# start = time.time()
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        # Flatten Fashion-MNIST images into a 784 long vector
        images = images.to(device)
        labels = labels.to(device)
        images = images.view(images.shape[0], -1)
        # Training pass
        optimizer.zero_grad()
        output = model.forward(images)
        loss = criterion(output, labels)
        loss.backward()
        # print(loss.grad)
        optimizer.step()
        running_loss += loss.item()
    else:
        print(model[0].grad)
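If I print model[0].grad after backpropagation, will it be the output gradient of each layer for each epoch?
Or, if I want to know the output gradient of each layer, what should I print, and where?
Thank you!!
Thanks for reading.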
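Well, this is a good question if you need to understand the internal computations of your model. Let me explain!
First, when you print the model variable, you get this output: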
Sequential(
  (0): Linear(in_features=784, out_features=128, bias=True)
  (1): ReLU()
  (2): Linear(in_features=128, out_features=10, bias=True)
  (3): LogSoftmax(dim=1)
)
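If you select model[0], you are selecting the first layer of the model, that is, Linear(in_features=784, out_features=128, bias=True). If you look at the documentation for torch.nn.Linear, you will find two variables of this class that you can access: Linear.weight and Linear.bias, which give you the weights and biases of the corresponding layer.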
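Keep in mind that you cannot use model.weight to look at the model's weights, because your linear layers are stored in a container called nn.Sequential, which has no weight attribute.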
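As a small illustration of the point about the container, the sketch below rebuilds the same nn.Sequential model from the question (on the CPU, without any data) and checks which objects actually carry a weight attribute. The variable name `names` is only introduced here for the example.

```python
import torch
from torch import nn

# Same architecture as in the question
model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 10),
                      nn.LogSoftmax(dim=1))

# The container itself has no weight attribute...
print(hasattr(model, "weight"))

# ...but the individual Linear layers do
print(model[0].weight.shape)
print(model[0].bias.shape)

# named_parameters() lists every parameter, prefixed by the layer index
names = [name for name, _ in model.named_parameters()]
print(names)
```

Only the two Linear layers show up in the list, since ReLU and LogSoftmax have no parameters.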
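So, coming back to the weights and biases, you access them per layer: model[0].weight and model[0].bias are the weights and biases of the first layer. Similarly, to access the gradients of the first layer, use model[0].weight.grad and model[0].bias.grad. These are the gradients of the loss with respect to that layer's parameters; the Linear module itself has no grad attribute, so model[0].grad does not work.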
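Here is a minimal sketch of reading those gradients after a backward pass. It uses a random batch as a hypothetical stand-in for one Fashion-MNIST batch, so nothing has to be downloaded. The second half goes a little beyond the answer above: if what you want is the gradient with respect to each layer's output rather than its parameters, one possible approach is to run the layers one at a time and call retain_grad() on each intermediate tensor. The names `grad_before` and `layer_outputs` are made up for this example.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Same architecture and loss as in the question
model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 10),
                      nn.LogSoftmax(dim=1))
criterion = nn.CrossEntropyLoss()

# Assumption: a random batch standing in for real, flattened images
images = torch.randn(32, 784)
labels = torch.randint(0, 10, (32,))

# Before any backward pass the gradient is still None
grad_before = model[0].weight.grad

# Run the layers one by one so each intermediate output keeps its gradient
layer_outputs = []
x = images
for layer in model:
    x = layer(x)
    x.retain_grad()  # non-leaf tensors drop .grad unless asked to keep it
    layer_outputs.append(x)

loss = criterion(x, labels)
loss.backward()

# Parameter gradients of the first layer
print(model[0].weight.grad.shape)
print(model[0].bias.grad.shape)

# Gradient of the loss with respect to each layer's output
for index, out in enumerate(layer_outputs):
    print(index, type(model[index]).__name__, tuple(out.grad.shape))
```

In the training loop from the question, the same print statements would go right after loss.backward(), because optimizer.zero_grad() clears the parameter gradients at the start of the next iteration.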