Differentiable affine transformation on patches of images in pytorch

Question

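I have a tensor of object bounding boxes, e.g. with shape [10,4], which correspond to a batch of images, e.g. with shape [2,3,64,64]. I also have a transformation matrix for each object, with shape [10,6], and a vector that defines which object index belongs to which image. I would like to apply the affine transformations to patches of the images and replace those patches after applying the transformations. I am currently doing this with a for loop, but the way I am doing it is not differentiable (I get the in-place operation error from pytorch). Is there a differentiable way to do this, e.g. via grid_sample?

Here is my current code: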

for obj_num in range(obj_vecs.shape[0]): #batch_size
    im_id = obj_to_img[obj_num]
    x1, y1, x2, y2 = boxes_pred[obj_num]
    im_patch = img[im_id, :, x1:x2, y1:y2]
    im_patch = im_patch[None, :, :, :]
    img[im_id, :, x1:x2, y1:y2] = self.VITAE.stn(im_patch, theta_mean[obj_num], inverse=False)[0]

Answer 1

There are a few ways to perform differentiable cropping in PyTorch.

Let's take a minimal example in 2D:

>>> x1, x2, y1, y2 = torch.randint(0, 9, (4,))
(tensor(7), tensor(3), tensor(5), tensor(6))

>>> x = torch.randint(0, 100, (9,9), dtype=float, requires_grad=True)
tensor([[18., 34., 28., 41.,  1., 14., 77., 75., 23.],
        [62., 33., 64., 41., 16., 70., 47., 45., 19.],
        [20., 69.,  5., 51.,  1., 16., 20., 63., 52.],
        [51., 25.,  8., 30., 40., 67., 41., 27., 33.],
        [36.,  6., 95., 53., 69., 84., 51., 42., 71.],
        [46., 72., 88., 82., 71., 75., 86., 36., 15.],
        [66., 19., 58., 50., 91., 28.,  7., 83.,  4.],
        [94., 50., 34., 34., 92., 45., 48., 97., 76.],
        [80., 34., 19., 13., 77., 77., 51., 15., 13.]], dtype=torch.float64,
       requires_grad=True)

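Given x1, x2 (resp. y1, y2), the patch index boundaries on the height dimension (resp. width dimension), you can get the coordinate grid of the patch using a combination of torch.arange and torch.meshgrid: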

>>> sorted_range = lambda a, b: torch.arange(a, b) if b >= a else torch.arange(b, a)
>>> xi, yi = sorted_range(x1, x2), sorted_range(y1, y2)
(tensor([3, 4, 5, 6]), tensor([5]))

>>> i, j = torch.meshgrid(xi, yi)
(tensor([[3],
         [4],
         [5],
         [6]]), 
 tensor([[5],
         [5],
         [5],
         [5]]))
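As a side note, recent PyTorch releases (1.10 and later, as far as I know) warn when torch.meshgrid is called without the indexing argument. The sketch below spells out the same row/column convention with indexing="ij"; the boundary values are hard-coded from the example above purely for illustration.

```python
import torch

# Patch boundaries taken from the example above (rows 3..6, column 5).
x1, x2 = 3, 7
y1, y2 = 5, 6

xi = torch.arange(x1, x2)  # row indices of the patch
yi = torch.arange(y1, y2)  # column indices of the patch

# indexing="ij" keeps the (row, column) ordering used in this answer.
i, j = torch.meshgrid(xi, yi, indexing="ij")
print(i.shape, j.shape)
```

Both i and j have shape (len(xi), len(yi)), so x[i, j] already has the shape of the patch.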

With that setup, you can extract and replace a patch of x.

You can extract the patch by indexing x directly:

>>> patch = x[i, j].reshape(len(xi), len(yi))
tensor([[67.],
        [84.],
        [75.],
        [28.]], dtype=torch.float64, grad_fn=<ViewBackward>)

Here is the mask for illustration purposes:

tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=torch.float64,
grad_fn=<IndexPutBackward>)
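A mask like this one could be built, for instance, by writing ones into a zero tensor at the patch indices. The following is only a sketch of one way to do it; it hard-codes the index ranges from the example and uses a fresh zero tensor in place of x.

```python
import torch

# Stand-in for x: only its shape and dtype matter for the mask.
x = torch.zeros(9, 9, dtype=torch.float64)

# Same index grids as in the example (rows 3..6, column 5).
i, j = torch.meshgrid(torch.arange(3, 7), torch.arange(5, 6), indexing="ij")

# Out-of-place index_put returns a new tensor and leaves its input untouched.
ones = torch.ones(i.shape, dtype=x.dtype)
mask = torch.zeros_like(x).index_put((i, j), ones)
print(mask)
```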

You can replace the values in x with the result of some transformation of the patch using torch.Tensor.index_put:

>>> values = 2*patch
 tensor([[134.],
         [168.],
         [150.],
         [ 56.]], dtype=torch.float64, grad_fn=<MulBackward0>)

>>> x.index_put(indices=(i, j), values=values)
tensor([[ 18.,  34.,  28.,  41.,   1.,  14.,  77.,  75.,  23.],
        [ 62.,  33.,  64.,  41.,  16.,  70.,  47.,  45.,  19.],
        [ 20.,  69.,   5.,  51.,   1.,  16.,  20.,  63.,  52.],
        [ 51.,  25.,   8.,  30.,  40., 134.,  41.,  27.,  33.],
        [ 36.,   6.,  95.,  53.,  69., 168.,  51.,  42.,  71.],
        [ 46.,  72.,  88.,  82.,  71., 150.,  86.,  36.,  15.],
        [ 66.,  19.,  58.,  50.,  91.,  56.,   7.,  83.,   4.],
        [ 94.,  50.,  34.,  34.,  92.,  45.,  48.,  97.,  76.],
        [ 80.,  34.,  19.,  13.,  77.,  77.,  51.,  15.,  13.]],
    dtype=torch.float64, grad_fn=<IndexPutBackward>)
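To tie this back to the batched setting in the question, here is a rough sketch that combines slicing, F.affine_grid / F.grid_sample and the out-of-place index_put. It is not the asker's self.VITAE.stn; the helper name transform_patches is hypothetical, and it assumes integer, non-empty boxes given as (x1, y1, x2, y2) with x indexing the height dimension, as in the question's slicing. Gradients flow to the pixel values and to the affine parameters, but not to the integer box coordinates.

```python
import torch
import torch.nn.functional as F


def transform_patches(img, boxes, thetas, obj_to_img):
    """Hypothetical helper: return a new batch in which every box region is
    replaced by its affine-warped version, without any in-place write."""
    out = img
    channels = torch.arange(img.shape[1], device=img.device)
    for k in range(boxes.shape[0]):
        b = int(obj_to_img[k])
        x1, y1, x2, y2 = (int(v) for v in boxes[k])
        # Slicing is differentiable with respect to the pixel values.
        patch = out[b, :, x1:x2, y1:y2].unsqueeze(0)      # (1, C, h, w)
        theta = thetas[k].reshape(1, 2, 3)                # 6 params -> 2x3 matrix
        grid = F.affine_grid(theta, list(patch.shape), align_corners=False)
        warped = F.grid_sample(patch, grid, align_corners=False)[0]  # (C, h, w)
        # Index grids over (channel, row, column) for the patch location.
        c, i, j = torch.meshgrid(
            channels,
            torch.arange(x1, x2, device=img.device),
            torch.arange(y1, y2, device=img.device),
            indexing="ij",
        )
        bi = torch.full_like(c, b)
        # Out-of-place index_put builds a new tensor, so autograd is happy.
        out = out.index_put((bi, c, i, j), warped)
    return out


# Usage with made-up data: identity transforms should leave the images
# (numerically almost) unchanged while still propagating gradients.
img = torch.rand(2, 3, 64, 64, requires_grad=True)
boxes = torch.tensor([[4, 8, 20, 30], [10, 10, 40, 50]])
thetas = torch.tensor([[1.0, 0.0, 0.0, 0.0, 1.0, 0.0]] * 2, requires_grad=True)
obj_to_img = torch.tensor([0, 1])

out = transform_patches(img, boxes, thetas, obj_to_img)
out.sum().backward()
print(out.shape, thetas.grad.shape)
```

The loop over objects is kept for readability. If boxes overlap within one image, later objects see the already transformed pixels, which may or may not be what you want.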
