How to get the output gradient w.r.t input

I'm having some trouble getting the gradient of the output with respect to the input. This is a simple MNIST model.

for num,(sample_img, sample_label) in enumerate(mnist_test):
    if num == 1:
        break

    sample_img = sample_img.to(device)
    sample_img.requires_grad = True
    prediction = model(sample_img.unsqueeze(dim=0))
    cost = criterion(prediction, torch.tensor([sample_label]).to(device))
    optimizer.zero_grad()
    cost.backward()
    print(sample_label)
    print(sample_img.shape)

    plt.imshow(sample_img.detach().cpu().squeeze(), cmap='gray')
    plt.show()

print(sample_img.grad)

sample_img.grad is None

If you need to compute the gradient with respect to the input, you can do it by calling sample_img.requires_grad_() or by setting sample_img.requires_grad = True before the forward pass, as suggested in the comments.

Here is a small example:

import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt


model = nn.Sequential(  # a dummy model
    nn.Conv2d(1, 1, 3),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten()
)

sample_img = torch.rand(1, 5, 5)  # a dummy input
sample_label = 0

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-3)
device = "cpu"

sample_img = sample_img.to(device)
sample_img.requires_grad = True

prediction = model(sample_img.unsqueeze(dim=0))
cost = criterion(prediction, torch.tensor([sample_label]).to(device))
optimizer.zero_grad()
cost.backward()
print(sample_label)
print(sample_img.shape)

plt.imshow(sample_img.detach().cpu().squeeze(), cmap='gray')
plt.show()

print(sample_img.grad.shape)
print(sample_img.grad)

Also, if you don't need gradients for the model parameters, you can turn off their gradient requirement:

for param in model.parameters():
    param.requires_grad = False
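As an alternative, if you only want the gradient with respect to the input and don't need an optimizer step at all, `torch.autograd.grad` returns it directly instead of accumulating into `.grad`. A minimal sketch using a dummy model similar to the one above (the model and input shapes here are illustrative, not from the original post):

```python
import torch
import torch.nn as nn

# a dummy model, similar in spirit to the answer's example
model = nn.Sequential(
    nn.Conv2d(1, 2, 3),   # 1x5x5 input -> 2x3x3
    nn.ReLU(),
    nn.MaxPool2d(2),      # 2x3x3 -> 2x1x1
    nn.Flatten(),         # -> 2 logits
)

# leaf tensor with gradient tracking enabled from the start
sample_img = torch.rand(1, 1, 5, 5, requires_grad=True)

prediction = model(sample_img)
cost = nn.CrossEntropyLoss()(prediction, torch.tensor([0]))

# compute d(cost)/d(sample_img) directly; .grad is left untouched
(input_grad,) = torch.autograd.grad(cost, sample_img)
print(input_grad.shape)  # same shape as sample_img
```

This avoids the need for `optimizer.zero_grad()` / `cost.backward()` when the input gradient is all you care about (e.g. for saliency maps).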