Define the following multiplication of two tensors in PyTorch Lightning

I would like to multiply the following two tensors, x (of shape (BS, N, C)) and y (of shape (BS, 1, C)), in the following way:

BS = x.shape[0]
N = x.shape[1]
out = torch.zeros(size=x.shape)
for i in range(BS):
    for j in range(N):
        out[i, j, :] = torch.mul(x[i, j, :], y[i, 0, :])
return out
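  1. Implementing it this way produces the error "RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper_native_layer_norm)".
  2. When I try to fix it by setting

    out = torch.zeros(size=x.shape).to('cuda')

then training takes a very long time, because my for loops are not executed in parallel.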

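So my question is how to implement the two for loops above in a pytorch-lightning way, so that I can define a function x = multiply_as_above(x, y) and use it in the forward(self) method of my neural network. By the way, the operation defined above looks to me like a convolution with kernel size 1. Maybe I can use that?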

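What is wrong with x*y? Broadcasting expands the size-1 dimension of y across the N entries of x, so no explicit loop is needed. As you can see in the code below, it produces exactly the same output as your function: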

import torch

torch.manual_seed(2021)

BS = 2
N = 3
C = 4

x = torch.rand(BS, N, C)
y = torch.rand(BS, 1, C)

# your function
def f(x, y):
    BS = x.shape[0]
    N = x.shape[1]
    out = torch.zeros(size=x.shape)
    for i in range(BS):
        for j in range(N):
            out[i, j, :] = torch.mul(x[i, j, :], y[i, 0, :])
    return out

out1 = f(x, y)
out2 = x*y

# comparing the outputs, we can see that they are identical
torch.all(out1 == out2)
# > tensor(True)
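
To connect this back to the question, here is a minimal sketch of how the broadcasted product might be wrapped as multiply_as_above and called from a module's forward method. The class name ScaleByY and the helper loop_reference are hypothetical names introduced only for illustration, and a plain torch.nn.Module is used here on the assumption that the same forward would carry over to a LightningModule, which subclasses nn.Module. Because x * y creates its result on the device of its inputs, no manual .to('cuda') call is involved. If a preallocated buffer is still wanted, torch.zeros_like(x) allocates it on the same device and with the same dtype as x.

```python
import torch
from torch import nn


def multiply_as_above(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # x: (BS, N, C), y: (BS, 1, C).
    # Broadcasting repeats y along dim 1, so this matches the double loop.
    return x * y


class ScaleByY(nn.Module):  # hypothetical module name, for illustration only
    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return multiply_as_above(x, y)


def loop_reference(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Same loop as in the question, but the buffer is created with
    # zeros_like, so it lives on x's device instead of defaulting to the CPU.
    out = torch.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j, :] = x[i, j, :] * y[i, 0, :]
    return out


# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.rand(2, 3, 4, device=device)
y = torch.rand(2, 1, 4, device=device)

out = ScaleByY()(x, y)
```

On the convolution remark: a convolution with kernel size 1 normally applies one learned weight shared across the whole batch and mixes channels, whereas here every sample scales each channel by its own factor from y, so plain broadcasting is probably the more direct fit.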