Define the following multiplication of two tensors in PyTorch Lightning
I want to multiply the following two tensors x (with shape (BS, N, C)) and y (with shape (BS, 1, C)) in the following way:
BS = x.shape[0]
N = x.shape[1]
out = torch.zeros(size=x.shape)
for i in range(BS):
    for j in range(N):
        out[i, j, :] = torch.mul(x[i, j, :], y[i, 0, :])
return out
Implementing it this way produces the error "RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper_native_layer_norm)".
If I fix that by setting
out = torch.zeros(size=x.shape).to('cuda')
then training takes very long, because my for loops are not executed in parallel.
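(As an aside, a device-agnostic allocation such as torch.zeros_like would avoid hard-coding 'cuda', but the loop itself would still run sequentially, so it is not the vectorization I am after:)

# device- and dtype-agnostic allocation: out lives wherever x lives,
# so no cpu/cuda mismatch, but the two for loops are still sequential
out = torch.zeros_like(x)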
So my question is: how can I implement the two for loops above in a pytorch-lightning way, so that I can define a function x = multiply_as_above(x, y) and use it in the forward(self) method of my neural network?
By the way, the operation defined above looks to me like a convolution with kernel size 1. Maybe I can use that?
x*y
What's wrong with that? As you can see in the code below, it produces exactly the same output as your function:
import torch
torch.manual_seed(2021)
BS = 2
N = 3
C = 4
x = torch.rand(BS, N, C)
y = torch.rand(BS, 1, C)
# your function
def f(x, y):
    BS = x.shape[0]
    N = x.shape[1]
    out = torch.zeros(size=x.shape)
    for i in range(BS):
        for j in range(N):
            out[i, j, :] = torch.mul(x[i, j, :], y[i, 0, :])
    return out
out1 = f(x, y)
out2 = x*y
# comparing the outputs, we can see that they are identical
torch.all(out1 == out2)
# > tensor(True)
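This works because of PyTorch broadcasting: y has size 1 along dim 1, so it is implicitly expanded to N and multiplied element-wise, entirely on whatever device x and y already live on. Below is a minimal sketch of how it could be used inside a LightningModule (MyModel, the layer sizes, and the trailing linear layer are hypothetical, only to show where multiply_as_above fits in forward):

import torch
import pytorch_lightning as pl

def multiply_as_above(x, y):
    # broadcasts y of shape (BS, 1, C) over dim 1 of x of shape (BS, N, C)
    return x * y

class MyModel(pl.LightningModule):
    def __init__(self, C=4):
        super().__init__()
        self.linear = torch.nn.Linear(C, C)  # hypothetical layer

    def forward(self, x, y):
        x = multiply_as_above(x, y)  # vectorized, runs on the module's device
        return self.linear(x)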