Why does backpropagation still work when 'loss.backward()' is inside 'with torch.no_grad():'?
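I'm working through a linear regression example in PyTorch. I know I made a mistake by putting 'loss.backward()' inside 'with torch.no_grad():', but why does my code still work?
According to the PyTorch docs, torch.autograd.no_grad is a context manager that disables gradient calculation, so I'm really confused.
Here is the code: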
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
# Toy dataset
x_train = np.array([[3.3], [4.4], [5.5], [6.71], [6.93], [4.168],
[9.779], [6.182], [7.59], [2.167], [7.042],
[10.791], [5.313], [7.997], [3.1]], dtype=np.float32)
y_train = np.array([[1.7], [2.76], [2.09], [3.19], [1.694], [1.573],
[3.366], [2.596], [2.53], [1.221], [2.827],
[3.465], [1.65], [2.904], [1.3]], dtype=np.float32)
input_size = 1
output_size = 1
epochs = 100
learning_rate = 0.05
model = nn.Linear(input_size, output_size)
criterion = nn.MSELoss(reduction='sum')
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
# training
for epoch in range(epochs):
    # convert numpy to tensor
    inputs = torch.from_numpy(x_train)
    targets = torch.from_numpy(y_train)
    # forward
    out = model(inputs)
    loss = criterion(out, targets)
    # backward
    with torch.no_grad():
        model.zero_grad()
        loss.backward()
        optimizer.step()
        print('inputs grad : ', inputs.requires_grad)
    if epoch % 5 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, epochs, loss.item()))
predicted = model(torch.from_numpy(x_train)).detach().numpy()
plt.plot(x_train, y_train, 'ro', label='Original data')
plt.plot(x_train, predicted, label='Fitted line')
plt.legend()
plt.show()
# Save the model checkpoint
torch.save(model.state_dict(), 'model/linear_model.ckpt')
Thanks in advance for answering my question.
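This works because the loss was computed before the no_grad block, with gradient tracking enabled, so autograd had already recorded the graph. loss.backward() only walks that recorded graph; no_grad stops new operations from being recorded, it does not stop backward() from running.
In other words, you keep updating the layer's weights with gradients that come from a forward pass computed outside no_grad.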
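To make the point above concrete, here is a minimal sketch, separate from the question's code. The tensors `w` and `x` are hypothetical toy values chosen only for illustration. It suggests that `backward()` inside `no_grad` still fills in `.grad` when the forward pass ran outside the block, while a new operation inside the block is not tracked.

```python
import torch

# Hypothetical toy tensors (not the question's dataset).
w = torch.tensor([2.0], requires_grad=True)
x = torch.tensor([3.0])

# Forward pass with grad mode enabled: autograd records the graph here.
loss = ((w * x) ** 2).sum()

with torch.no_grad():
    # backward() only traverses the graph that was already recorded.
    loss.backward()
    # A new operation inside the block is NOT recorded.
    untracked = w * x

grad_value = w.grad.item()            # d/dw (w*x)^2 = 2*w*x^2 = 36
loss_tracked = loss.requires_grad     # graph was built outside no_grad
untracked_tracked = untracked.requires_grad

print(grad_value, loss_tracked, untracked_tracked)
```

The only thing `no_grad` changed in this sketch is the `untracked` tensor, which was created inside the block.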
When no_grad actually wraps the forward pass:
for epoch in range(epochs):
    # convert numpy to tensor
    inputs = torch.from_numpy(x_train)
    targets = torch.from_numpy(y_train)
    with torch.no_grad():  # no_grad used here
        # forward
        out = model(inputs)
        loss = criterion(out, targets)
        model.zero_grad()
        loss.backward()
        optimizer.step()
        print('inputs grad : ', inputs.requires_grad)
    if epoch % 5 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, epochs, loss.item()))
then you get the expected error:
element 0 of tensors does not require grad and does not have a grad_fn
That is what happens when no_grad is used in the wrong place.
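The failure case can be reproduced in a few lines. This is a minimal sketch with hypothetical toy tensors `w` and `x`, not the question's model. Because the forward pass runs inside `no_grad`, the resulting `loss` has no graph attached and `backward()` raises a `RuntimeError`.

```python
import torch

# Hypothetical toy tensors (not the question's dataset).
w = torch.tensor([2.0], requires_grad=True)
x = torch.tensor([3.0])

with torch.no_grad():
    # Forward pass inside no_grad: nothing is recorded for autograd.
    loss = ((w * x) ** 2).sum()

print(loss.requires_grad, loss.grad_fn)

try:
    loss.backward()
    error_message = None
except RuntimeError as exc:
    # There is no graph to traverse, so backward() fails.
    error_message = str(exc)

print(error_message)
```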
If you print loss.requires_grad in your original code, you will see that loss does require grad.
That is, when you do this:
for epoch in range(epochs):
    # convert numpy to tensor
    inputs = torch.from_numpy(x_train)
    targets = torch.from_numpy(y_train)
    # forward
    out = model(inputs)
    loss = criterion(out, targets)
    # backward
    with torch.no_grad():
        model.zero_grad()
        loss.backward()
        optimizer.step()
        print('inputs grad : ', inputs.requires_grad)
        print('loss grad : ', loss.requires_grad)  # Prints loss.requires_grad
    if epoch % 5 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, epochs, loss.item()))
you will see:
inputs grad : False
loss grad : True
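Also, print('inputs grad : ', inputs.requires_grad) will always print False, because inputs is a plain data tensor created with torch.from_numpy and never requires grad, with or without no_grad. That is, if you do this: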
for epoch in range(epochs):
    # convert numpy to tensor
    inputs = torch.from_numpy(x_train)
    targets = torch.from_numpy(y_train)
    print('inputs grad : ', inputs.requires_grad)  # Print inputs.requires_grad
    # forward
    out = model(inputs)
    loss = criterion(out, targets)
    # backward
    with torch.no_grad():
        model.zero_grad()
        loss.backward()
        optimizer.step()
        print('inputs grad : ', inputs.requires_grad)
        print('loss grad : ', loss.requires_grad)
    if epoch % 5 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, epochs, loss.item()))
you will get:
inputs grad : False
inputs grad : False
loss grad : True
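In other words, you are checking the wrong thing to find out what went wrong. The best thing you can do is reread the PyTorch documentation on autograd mechanics.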
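For comparison, here is one way the loop is commonly arranged, as a sketch rather than the only correct layout. The synthetic data (y = 2x + 1) and the 200 training steps are assumptions made for illustration. The forward pass, `backward()` and `optimizer.step()` all run with grad mode enabled, and `no_grad` is reserved for evaluation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical synthetic data: y = 2x + 1 (not the question's dataset).
x = torch.linspace(0, 1, 20).unsqueeze(1)
y = 2 * x + 1

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)

# Evaluation only: no graph is needed, so no_grad fits here.
with torch.no_grad():
    initial_loss = criterion(model(x), y).item()

for epoch in range(200):
    optimizer.zero_grad()
    loss = criterion(model(x), y)  # forward pass with grad mode enabled
    loss.backward()                # fills .grad on the parameters
    optimizer.step()               # the optimizer updates the weights

# Inference after training, again without recording a graph.
with torch.no_grad():
    predictions = model(x)
    final_loss = criterion(predictions, y).item()

print(initial_loss, final_loss, predictions.requires_grad)
```

Because `predictions` is produced inside `no_grad`, it can be converted with `.numpy()` directly, without calling `.detach()` first.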