PyTorch: apply mapping over singleton dimension of tensor

I'm afraid the title isn't very descriptive, but I couldn't come up with a better one. Basically, my problem is the following:

I have a PyTorch tensor of shape (n, 1, h, w) for arbitrary integers n, h and w (in my specific case this array represents a batch of grayscale images of dimension h x w).

I also have another tensor of shape (m, 2) that maps every possible value in the first array (i.e. the first array can contain values from 0 to m - 1) to some tuple of values. I would like to "apply" this mapping to the first array so that I obtain an array of shape (n, 2, h, w).

I hope this is somewhat clear; I find it hard to express in words, so here's a code example (note, though, that it isn't terribly intuitive either, since four-dimensional arrays are involved):

import torch

m = 18

# could also be arbitrary tensor with this shape with values between 0 and m - 1
a = torch.arange(m).reshape(2, 1, 3, 3)

# could also be arbitrary tensor with this shape
b = torch.LongTensor(
    [[11, 17, 9, 6, 5, 4, 2, 10, 3, 13, 14, 12, 7, 1, 15, 16, 8, 0],
     [11, 8, 4, 14, 13, 12, 16, 1, 5, 17, 0, 10, 7, 15, 9, 6, 2, 3]]).t()

# I probably have to do this and the permute/reshape, but how?
c = b.index_select(0, a.flatten())
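# at this point c has shape (n*h*w, 2): every value of `a` has been replaced
# by its 2-tuple from `b`, but the batch/spatial structure is flattened away
# and the tuple entries sit in the last dimension instead of dimension 1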

# ...

# another approach that I think works (but I'm not really sure why, I found this
# more or less by trial and error). I would ideally like to find a 'nicer' way
# of doing this
c = torch.stack([
    b.index_select(0, a_.flatten()).reshape(3, 3, 2).permute(2, 0, 1)
    for a_ in a
])
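# why this works: per batch element, index_select yields an (h*w, 2) tensor;
# reshape(3, 3, 2) restores the spatial layout with the tuple in the last
# dimension, and permute(2, 0, 1) moves the tuple to the channel dimension,
# giving (2, h, w); stacking over the batch then yields (n, 2, h, w)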

# the end result should be:
#[[[[11, 17,  9],
#   [ 6,  5,  4],
#   [ 2, 10,  3]],
#
#  [[11,  8,  4],
#   [14, 13, 12],
#   [16,  1,  5]]],
#
#
# [[[13, 14, 12],
#   [ 7,  1, 15],
#   [16,  8,  0]],
#
#  [[17,  0, 10],
#   [ 7, 15,  9],
#   [ 6,  2,  3]]]]

How can I perform this transformation efficiently? (Ideally without using any additional memory.) In numpy this could easily be achieved with np.apply_along_axis, but there seems to be no PyTorch equivalent.
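For reference, PyTorch supports NumPy-style integer (advanced) indexing, which seems to express the whole mapping in a single vectorized step; here's a minimal sketch using the example tensors above (note that, like most fancy indexing, it does allocate a new tensor):

# `b[a.squeeze(1)]` replaces every entry of `a` with its corresponding row
# of `b`, yielding shape (n, h, w, 2); permute then moves the tuple to dim 1
c = b[a.squeeze(1)].permute(0, 3, 1, 2)  # shape (n, 2, h, w)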

Here's one approach using slicing, stacking and view-based reshaping:

In [239]: half_way = b.shape[0]//2

In [240]: upper_half = torch.stack((b[:half_way, :][:, 0], b[:half_way, :][:, 1]), dim=0).view(-1, 3, 3)
In [241]: lower_half = torch.stack((b[half_way:, :][:, 0], b[half_way:, :][:, 1]), dim=0).view(-1, 3, 3)

In [242]: torch.stack((upper_half, lower_half))
Out[242]: 
tensor([[[[11, 17,  9],
          [ 6,  5,  4],
          [ 2, 10,  3]],

         [[11,  8,  4],
          [14, 13, 12],
          [16,  1,  5]]],


        [[[13, 14, 12],
          [ 7,  1, 15],
          [16,  8,  0]],

         [[17,  0, 10],
          [ 7, 15,  9],
          [ 6,  2,  3]]]])

One caveat is that this only works for n = 2. However, it's 1.7x faster than the loop-based approach, albeit at the cost of a bit more code.


Here's a more general approach that scales to any positive integer n:

In [327]: %%timeit
     ...: block_size = b.shape[0]//a.shape[0]
     ...: seq_of_tensors = [b[block_size*idx:block_size*(idx+1), :].permute(1, 0).flatten().reshape(2, 3, 3).unsqueeze(0)  for idx in range(a.shape[0])]
     ...: torch.cat(seq_of_tensors)
     ...: 
23.5 µs ± 460 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

You can also use view instead of reshape:

block_size = b.shape[0]//a.shape[0]
seq_of_tensors = [b[block_size*idx:block_size*(idx+1), :].permute(1, 0).flatten().view(2, 3, 3).unsqueeze(0)  for idx in range(a.shape[0])]
torch.cat(seq_of_tensors)
# outputs
tensor([[[[11, 17,  9],
          [ 6,  5,  4],
          [ 2, 10,  3]],

         [[11,  8,  4],
          [14, 13, 12],
          [16,  1,  5]]],


        [[[13, 14, 12],
          [ 7,  1, 15],
          [16,  8,  0]],

         [[17,  0, 10],
          [ 7, 15,  9],
          [ 6,  2,  3]]]])

Note: I'm still using a list comprehension here, since we have to split the tensor b into equal chunks and then permute, flatten, reshape and unsqueeze each chunk before concatenating/stacking them along dimension 0. It's still marginally faster than my solution above.
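For completeness: since this only ever reads b in contiguous blocks, the loop can seemingly be dropped entirely. A minimal sketch under the same assumption as above (namely that a is torch.arange(m), so the rows of b are already in batch order):

n = a.shape[0]
# reshape groups b's rows into per-batch blocks of h*w tuples; permute moves
# the tuple entries to dim 1; the final reshape restores the (h, w) layout
c = b.reshape(n, -1, 2).permute(0, 2, 1).reshape(n, 2, 3, 3)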