How do I check the output gradient of each layer in PyTorch in my code?

I am working with PyTorch to learn it.

I have a question about my code: how do I check the output gradient of each layer?

My code is below:

# Import the necessary libraries
import numpy as np
import torch
import time

# Loading the Fashion-MNIST dataset
from torchvision import datasets, transforms

# Get GPU Device

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                    transforms.Normalize((0.5,), (0.5,))
                                                                   ])
# Download and load the training data
trainset = datasets.FashionMNIST('MNIST_data/', download = True, train = True, transform = transform)
testset = datasets.FashionMNIST('MNIST_data/', download = True, train = False, transform = transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size = 32, shuffle = True, num_workers=4)
testloader = torch.utils.data.DataLoader(testset, batch_size = 32, shuffle = True, num_workers=4)

# Examine a sample
dataiter = iter(trainloader)
images, labels = next(dataiter)  # the iterator's .next() method is not available in recent PyTorch versions

# Define the network architecture
from torch import nn, optim
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 10),
                      nn.LogSoftmax(dim = 1)
                     )
model.to(device)

# Define the loss
criterion = nn.CrossEntropyLoss()

# Define the optimizer
optimizer = optim.Adam(model.parameters(), lr = 0.001)

# Define the epochs
epochs = 5

train_losses, test_losses = [], []

# start = time.time()
for e in range(epochs):
    running_loss = 0
    
    for images, labels in trainloader:
        # Move the batch to the device
        images = images.to(device)
        labels = labels.to(device)
        # Flatten Fashion-MNIST images into a 784 long vector
        images = images.view(images.shape[0], -1)

        # Training pass
        optimizer.zero_grad()

        output = model(images)
        loss = criterion(output, labels)
        
        loss.backward()
        
#         print(loss.grad)
        
        optimizer.step()

        running_loss += loss.item()
    
    else:
        print(model[0].grad)

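If I print model[0].grad after backpropagation, will it be the output gradient of each layer for every epoch?

Or, if I want to know the output gradient of each layer, what should I print, and where?

Thanks!!

Thanks for reading.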

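Well, this is a good question if you need to understand the internal computations in your model. Let me explain!

First, when you print the model variable, you get this output: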

Sequential(
  (0): Linear(in_features=784, out_features=128, bias=True)
  (1): ReLU()
  (2): Linear(in_features=128, out_features=10, bias=True)
  (3): LogSoftmax(dim=1)
)

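If you select model[0], you are selecting the first layer of the model, that is, Linear(in_features=784, out_features=128, bias=True). If you look at the documentation for torch.nn.Linear, you will see that this class exposes two variables. One is Linear.weight and the other is Linear.bias, which give you the weights and the bias of the corresponding layer, respectively.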
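As a quick check, the sketch below rebuilds the same architecture as in the question and inspects the first layer's parameters. The variable names first_layer, weight_shape and bias_shape are only for illustration.

```python
import torch
from torch import nn

# Same architecture as in the question
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
    nn.LogSoftmax(dim=1),
)

first_layer = model[0]  # the first Linear layer

# nn.Linear stores its weight as (out_features, in_features)
weight_shape = tuple(first_layer.weight.shape)
bias_shape = tuple(first_layer.bias.shape)

print(weight_shape)  # (128, 784)
print(bias_shape)    # (128,)
```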

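Keep in mind that you cannot use model.weight to view the model's weights, because your linear layers are stored inside a container called nn.Sequential, and that container has no weight attribute.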
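A small sketch of this point, using hasattr so that nothing raises. It also checks model[0].grad, the attribute printed in the question, which is not defined on a layer either. The flag names are hypothetical.

```python
from torch import nn

model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
    nn.LogSoftmax(dim=1),
)

has_container_weight = hasattr(model, "weight")  # the Sequential container itself
has_layer_weight = hasattr(model[0], "weight")   # the first Linear layer
has_layer_grad = hasattr(model[0], "grad")       # what the question tried to print

print(has_container_weight, has_layer_weight, has_layer_grad)  # False True False
```

Accessing model.weight or model[0].grad directly would raise an AttributeError instead of returning a value.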

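So, back to viewing the weights and biases: you can access them layer by layer. model[0].weight and model[0].bias are the weights and the bias of the first layer. Gradients are accessed in the same way: model[0].weight.grad and model[0].bias.grad hold the gradients of the first layer's parameters once loss.backward() has been called. model[0].grad itself does not exist, because gradients are stored on the parameters rather than on the layer.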
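Here is a minimal sketch of reading those gradients after one backward pass. It uses a random batch in place of Fashion-MNIST (an assumption, so that it runs without downloading data). It also shows one possible way, not taken from the answer above, to get the gradient with respect to each layer's output: run the layers one at a time and call retain_grad() on every intermediate tensor.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
    nn.LogSoftmax(dim=1),
)
criterion = nn.CrossEntropyLoss()

# Random stand-in batch (assumption: replaces a real Fashion-MNIST batch)
images = torch.randn(32, 784)
labels = torch.randint(0, 10, (32,))

grad_before = model[0].weight.grad  # None: nothing has been backpropagated yet

# Run the layers one by one so each intermediate output can keep its gradient
outputs = []
x = images
for layer in model:
    x = layer(x)
    x.retain_grad()  # non-leaf tensors do not keep .grad unless asked to
    outputs.append(x)

loss = criterion(x, labels)
loss.backward()

# Gradients of the loss with respect to each parameter
param_grad_shapes = {
    name: tuple(p.grad.shape) for name, p in model.named_parameters()
}
# Gradients of the loss with respect to each layer's output
output_grad_shapes = [tuple(out.grad.shape) for out in outputs]

print(param_grad_shapes)
print(output_grad_shapes)
```

In the training loop from the question, the parameter gradients could be printed right after loss.backward() and before the next optimizer.zero_grad(), for example with print(model[0].weight.grad).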