How to transform output of neural network and still train?
I have a neural network which produces an output, output. I want to transform output before the loss is computed and backpropagation happens.

Here is my general code:
with torch.set_grad_enabled(training):
    outputs = net(x_batch[:, 0], x_batch[:, 1])  # the prediction of the NN
    # My issue is here:
    outputs = transform_torch(outputs)
    loss = my_loss(outputs, y_batch)

    if training:
        scheduler.step()
        loss.backward()
        optimizer.step()
I have a transformation function which I pass the outputs through:
def transform_torch(predictions):
    torch_dimensions = predictions.size()
    torch_grad = predictions.grad_fn
    cuda0 = torch.device('cuda:0')
    new_tensor = torch.ones(torch_dimensions, dtype=torch.float64, device=cuda0, requires_grad=True)
    for i in range(int(len(predictions))):
        a = predictions[i]
        # with torch.no_grad():  # Note: no training happens if this line is kept in
        new_tensor[i] = torch.flip(torch.cumsum(torch.flip(a, dims=[0]), dim=0), dims=[0])
    return new_tensor
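For reference, this transform is meant to replace each row of predictions with its reversed (suffix) cumulative sum. A minimal sketch of the operation on a single 1-D row:

import torch

a = torch.tensor([1., 2., 3.])
out = torch.flip(torch.cumsum(torch.flip(a, dims=[0]), dim=0), dims=[0])
print(out)  # tensor([6., 5., 3.]) -- out[i] is the sum of a[i:]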
My problem is that I get an error on the second-to-last line:
RuntimeError: a view of a leaf Variable that requires grad is being used in an in-place operation.
Any suggestions? I have already tried using "with torch.no_grad():" (commented out above), but this results in very poor training, and I believe the gradients do not backpropagate properly after the transformation function.

Thanks!
The error is quite informative about what the problem is: when you create a new tensor with requires_grad=True, you create a leaf node in the graph (just like the parameters of a model), and in-place operations on it are not allowed.
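Here is a minimal reproduction of the same error, outside of any network (the tensor here is hypothetical, just to show the mechanism):

import torch

leaf = torch.ones(2, 3, requires_grad=True)  # a leaf node, like a model parameter
leaf[0] = torch.zeros(3)  # in-place write into a view of the leaf
# RuntimeError: a view of a leaf Variable that requires grad is being used in an in-place operation.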
The solution is simple: you do not need to create new_tensor in advance. It is not supposed to be a leaf node; just create it on the fly:
new_tensor = []
for i in range(int(len(predictions))):
    a = predictions[i]
    new_tensor.append(torch.flip(torch.cumsum(torch.flip(a, dims=[0]), dim=0), dims=[0]))

new_tensor = torch.stack(new_tensor, 0)
This new_tensor will inherit all the properties of predictions, such as dtype and device, and will already have requires_grad=True.
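As a side note, if predictions is a 2-D tensor of shape (batch, n), the Python loop can be avoided entirely by flipping and cumsum-ing along dim=1 in one shot. A sketch, assuming the same reversed-cumsum transform as in the question:

def transform_torch(predictions):
    # reversed (suffix) cumulative sum over each row, fully vectorized;
    # gradients flow through flip/cumsum, so nothing special is required
    return torch.flip(torch.cumsum(torch.flip(predictions, dims=[1]), dim=1), dims=[1])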