one of the variables needed for gradient computation has been modified by an inplace operation: can't find inplace operation
I have the code below and I can't find the in-place operation that prevents the gradient from being computed.
for epoch in range(nepoch):
    model.train()
    scheduler.step()
    for batch1 in loader1:
        torch.ones(len(batch1[0]), dtype=torch.float)
        x1, label = batch1
        x = x1.to('cuda', non_blocking=True)
        optimizer.zero_grad()
        pred = model(x)
        pred = pred.squeeze() if pred.ndimension() > 1 else pred
        label = (label.float()).cuda(cuda0)
        weights = torch.ones(len(label))
        loss_fun = torch.nn.BCEWithLogitsLoss(weight=weights.cuda(cuda0))
        score = loss_fun(pred, label)
        label = np.array(np.round(label.cpu().detach())).astype(bool)
        pred = np.array(pred.cpu().detach() > 0).astype(bool)
        torch.autograd.set_detect_anomaly(True)
        score.backward()
        optimizer.step()
And at the end I get this error:
Warning: Error detected in MulBackward0. Traceback of forward call that caused the error:
  File "train.py", line 98, in <module>
    pred = model(x)
  File "/home/anatole2/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anatole2/miniconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/anatole2/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anatole2/best/PCEN_pytorch.py", line 30, in forward
    filtered[i] = filtered[i] + (1-exp(self.log_s)) * filtered[i-1]
 (print_stack at /pytorch/torch/csrc/autograd/python_anomaly_mode.cpp:60)
Traceback (most recent call last):
  File "train.py", line 116, in <module>
    score.backward()
  File "/home/anatole2/miniconda3/lib/python3.7/site-packages/torch/tensor.py", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/anatole2/miniconda3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 100, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 1, 80]], which is output 0 of SelectBackward, is at version 378; expected version 377 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
It would be great if you could help me!
The in-place operation seems to be on this line:

  File "/home/anatole2/best/PCEN_pytorch.py", line 30, in forward
    filtered[i] = filtered[i] + (1-exp(self.log_s)) * filtered[i-1]

Notice how it reads the value from filtered[i] and then stores the result back into filtered[i]. That is what in-place means: the new value overwrites the old one.
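To see why autograd objects to this, here is a minimal standalone repro, separate from the question's code: every tensor carries a version counter, and backward() refuses to use a saved tensor whose counter has changed since the forward pass. That is exactly what the "is at version 378; expected version 377" part of the error is reporting.

import torch

a = torch.ones(3, requires_grad=True)
b = torch.exp(a)   # exp's backward reuses its saved output b
b[0] = 5.0         # the in-place write bumps b's version counter
try:
    b.sum().backward()
except RuntimeError as e:
    print(e)       # "... modified by an inplace operation ..."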
To fix it, you need to do something like this:

filtered_new = torch.zeros_like(filtered)
...
filtered_new[i] = filtered[i] + (1-exp(self.log_s)) * filtered[i-1]
The part that makes this a bit tricky is that it appears to be inside a loop (I assume i is the loop counter) and it probably uses values computed on previous passes through the loop. The modified version is no longer in-place, but it may also no longer produce the same result as the original. So you may need to do something like this instead:

filtered_new[i] = filtered[i] + (1-exp(self.log_s)) * filtered_new[i-1]
It is not possible to fully resolve this without seeing more of the code, but basically: look around, and replace any operation that modifies an existing tensor with one that creates a new tensor to hold the result of the computation.
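If juggling a second buffer gets awkward, another common pattern is to build each step as a fresh tensor in a Python list and stack the results afterwards. Here is a sketch under assumptions, since the rest of PCEN_pytorch.py is not shown: it guesses that filtered is iterated along dim 0, as in the question's loop, and that log_s is a scalar tensor.

import torch

def smoothed(filtered, log_s):
    # Assumptions: filtered is looped over along dim 0, and log_s is
    # a (possibly learnable) scalar tensor.
    out = [filtered[0]]                  # step 0 passes through unchanged
    for i in range(1, filtered.shape[0]):
        # Same recurrence as line 30 of PCEN_pytorch.py, but each step
        # is a brand-new tensor, so nothing autograd saved is overwritten.
        out.append(filtered[i] + (1 - torch.exp(log_s)) * out[-1])
    return torch.stack(out, dim=0)       # no in-place writes anywhere

Stacking along dim 0 reassembles the steps into the original layout, and autograd differentiates through both the list appends and the stack.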