Define the following multiplication of two tensors in PyTorch Lightning

Question

I would like to multiply the following two tensors, x (of shape (BS, N, C)) and y (of shape (BS, 1, C)), in the following way:

def multiply_as_above(x, y):
    BS = x.shape[0]
    N = x.shape[1]
    out = torch.zeros(size=x.shape)
    for i in range(BS):
        for j in range(N):
            # element-wise product of row j of sample i with the single row of y
            out[i, j, :] = torch.mul(x[i, j, :], y[i, 0, :])
    return out

Implementing it this way raises the error "RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper_native_layer_norm)".

When I fix this by setting

out = torch.zeros(size=x.shape).to('cuda')

training takes a very long time, because my for loops are not executed in parallel.
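
As an aside on the device error: a minimal sketch, assuming the error comes from `out` being allocated on the CPU while `x` and `y` live on the GPU. `torch.zeros_like(x)` allocates the buffer on the same device and with the same dtype as `x`, so no hard-coded `'cuda'` string is needed. This only addresses the device mismatch; the Python loops are kept, so it is no faster.

```python
import torch

def multiply_as_above(x, y):
    # zeros_like inherits device and dtype from x, avoiding the cpu/cuda mismatch
    out = torch.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j, :] = x[i, j, :] * y[i, 0, :]
    return out

# Fall back to the CPU when no GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.rand(2, 3, 4, device=device)
y = torch.rand(2, 1, 4, device=device)
out = multiply_as_above(x, y)
```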

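So my question is: how can I implement the two for loops above in a PyTorch Lightning way, so that I can define a function x = multiply_as_above(x, y) and use it in the feedforward(self) method of my neural network? By the way, the operation defined above looks to me like a convolution with kernel size 1. Maybe I can use that?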
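
On the convolution remark: the operation can indeed be written as a convolution with kernel size 1, but only as a depthwise (grouped) one whose weights differ per sample. The sketch below is an illustration, not a recommendation; it assumes small example sizes and folds the batch into the channel axis so that `F.conv1d` with `groups=BS * C` scales each (sample, channel) pair by its own scalar.

```python
import torch
import torch.nn.functional as F

BS, N, C = 2, 3, 4
x = torch.rand(BS, N, C)
y = torch.rand(BS, 1, C)

# conv1d expects (batch, channels, length): put C before N,
# then fold the batch into the channel axis -> (1, BS*C, N)
inp = x.transpose(1, 2).reshape(1, BS * C, N)

# One 1x1 kernel per (sample, channel) pair -> (BS*C, 1, 1)
weight = y.reshape(BS * C, 1, 1)

# groups=BS*C makes every channel use only its own kernel (depthwise)
conv_out = F.conv1d(inp, weight, groups=BS * C)

# Undo the folding to get back to (BS, N, C)
conv_out = conv_out.reshape(BS, C, N).transpose(1, 2)
```

The reshaping needed here is considerably more involved than the plain element-wise product, which is why the convolution view is mostly of conceptual interest.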

Answer 1

What's wrong with x*y? As you can see in the code below, it yields exactly the same output as your function:

import torch

torch.manual_seed(2021)

BS = 2
N = 3
C = 4

x = torch.rand(BS, N, C)
y = torch.rand(BS, 1, C)

# your function
def f(x, y):
    BS = x.shape[0]
    N = x.shape[1]
    out = torch.zeros(size=x.shape)
    for i in range(BS):
        for j in range(N):
            out[i, j, :] = torch.mul(x[i, j, :], y[i, 0, :])
    return out

out1 = f(x, y)
out2 = x*y

# comparing the outputs, we can see that they are identical
torch.all(out1 == out2)
# > tensor(True)
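
The reason this works is broadcasting: the size-1 middle dimension of y is virtually repeated N times to match x, without Python loops and on whatever device the inputs are on. The following is a minimal sketch of how this could sit inside a network's forward pass; the class name `ScaleByContext` is hypothetical, and a plain `nn.Module` is used because a `LightningModule` is itself an `nn.Module` subclass.

```python
import torch
from torch import nn

class ScaleByContext(nn.Module):  # hypothetical name, for illustration only
    def forward(self, x, y):
        # x: (BS, N, C), y: (BS, 1, C); y is broadcast along the N axis
        return x * y

x = torch.rand(2, 3, 4)
y = torch.rand(2, 1, 4)

layer = ScaleByContext()
out = layer(x, y)

# Same result with the broadcast written out explicitly
explicit = x * y.expand(-1, x.shape[1], -1)
```

Because no new tensor is allocated by hand, the result lands on the same device as x and y, so the device mismatch from the question does not arise.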

python

pytorch

pytorch-lightning