Padding scipy affine_transform output to show non-overlapping regions of transformed images

Question

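I have a source (src) image that I want to align to a destination (dst) image using an affine transformation, while retaining the full extent of both images during alignment (even the non-overlapping areas).

I am already able to calculate the rotation matrix and offset of the affine transformation, which I feed to scipy.ndimage.affine_transform to recover the dst-aligned src image.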
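For reference, scipy.ndimage.affine_transform uses a pull (output-to-input) mapping: each output coordinate o is sampled from the input at matrix @ o + offset. A minimal sketch of what that means in practice, assuming a forward transform (the names fwd_matrix and fwd_offset are hypothetical) that maps src coordinates to dst coordinates in (row, col) order:

```python
import numpy as np
from scipy.ndimage import affine_transform

# Toy source image with a single bright pixel at (row=2, col=3).
src = np.zeros((8, 8))
src[2, 3] = 1.0

# Assumed forward transform (src -> dst): a pure translation by (+3 rows, +1 col).
fwd_matrix = np.eye(2)
fwd_offset = np.array([3.0, 1.0])

# affine_transform expects the inverse mapping (dst -> src).
inv_matrix = np.linalg.inv(fwd_matrix)
inv_offset = -inv_matrix @ fwd_offset

aligned = affine_transform(src, inv_matrix, offset=inv_offset, order=0)
moved_to = tuple(int(v) for v in np.argwhere(aligned == 1.0)[0])
print(moved_to)
```

The bright pixel ends up at the forward-transformed position, even though the inverse matrix and offset are what get passed in.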

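The problem is that, when the images do not fully overlap, the resulting image is cropped to the common footprint of the two images. What I need is the full extent of both images, placed on the same pixel coordinate system. This question is almost a duplicate of an earlier one, and the excellent answer and repository there provide this functionality for OpenCV transformations. Unfortunately, I need this for scipy's implementation.

Much too late, after repeatedly hitting a brick wall trying to translate the above question's answer to scipy, I came across this issue and subsequently followed it to this question. The latter question did give me some insight into the wonderful world of scipy's affine transformation, but I have so far been unable to crack my particular needs.

The transformation from src to dst can include both translation and rotation. I can get translations alone working (an example is shown below), and I can get rotations alone working (largely by hacking around the code below and taking inspiration from the use of the reshape argument in scipy.ndimage.interpolation.rotate). However, I am getting thoroughly lost combining the two. I have tried to calculate what should be the correct offset (see this question's answers again), but I can't get it working in all scenarios.

Translation-only working example of the padded affine transformation, which largely follows this repo and the linked answer that explains it: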

from scipy.ndimage import rotate, affine_transform
import numpy as np
import matplotlib.pyplot as plt

nblob = 50
shape = (200, 100)
buffered_shape = (300, 200)  # buffer for rotation and translation


def affine_test(angle=0, translate=(0, 0)):
    np.random.seed(42)
    # Maximum translation allowed is half the difference between shape and buffered_shape

    # Generate a buffered_shape-sized base image with random blobs
    base = np.zeros(buffered_shape, dtype=np.float32)
    random_locs = np.random.choice(np.arange(2, buffered_shape[0] - 2), nblob * 2, replace=False)
    i = random_locs[:nblob]
    j = random_locs[nblob:]
    for k, (_i, _j) in enumerate(zip(i, j)):
        # Use different values, just to make it easier to distinguish blobs
        base[_i - 2 : _i + 2, _j - 2 : _j + 2] = k + 10

    # Impose a rotation and translation on source
    src = rotate(base, angle, reshape=False, order=1, mode="constant")
    bsc = (np.array(buffered_shape) / 2).astype(int)
    sc = (np.array(shape) / 2).astype(int)
    src = src[
        bsc[0] - sc[0] + translate[0] : bsc[0] + sc[0] + translate[0],
        bsc[1] - sc[1] + translate[1] : bsc[1] + sc[1] + translate[1],
    ]
    # Cut-out destination from the centre of the base image
    dst = base[bsc[0] - sc[0] : bsc[0] + sc[0], bsc[1] - sc[1] : bsc[1] + sc[1]]

    src_y, src_x = src.shape

    def get_matrix_offset(centre, angle, scale):
        """Follows OpenCV.getRotationMatrix2D"""
        angle = angle * np.pi / 180
        alpha = scale * np.cos(angle)
        beta = scale * np.sin(angle)
        return (
            np.array([[alpha, beta], [-beta, alpha]]),
            np.array(
                [
                    (1 - alpha) * centre[0] - beta * centre[1],
                    beta * centre[0] + (1 - alpha) * centre[1],
                ]
            ),
        )
    # Obtain the rotation matrix and offset that describes the transformation
    # between src and dst
    matrix, offset = get_matrix_offset(np.array([src_y / 2, src_x / 2]), angle, 1)
    offset = offset - translate

    # Determine the outer bounds of the new image
    lin_pts = np.array([[0, src_x, src_x, 0], [0, 0, src_y, src_y]])
    transf_lin_pts = np.dot(matrix.T, lin_pts) - offset[::-1].reshape(2, 1)

    # Find min and max bounds of the transformed image
    min_x = np.floor(np.min(transf_lin_pts[0])).astype(int)
    min_y = np.floor(np.min(transf_lin_pts[1])).astype(int)
    max_x = np.ceil(np.max(transf_lin_pts[0])).astype(int)
    max_y = np.ceil(np.max(transf_lin_pts[1])).astype(int)

    # Add translation to the transformation matrix to shift to positive values
    anchor_x, anchor_y = 0, 0
    if min_x < 0:
        anchor_x = -min_x
    if min_y < 0:
        anchor_y = -min_y
    shifted_offset = offset - np.dot(matrix, [anchor_y, anchor_x])

    # Create padded destination image
    dst_h, dst_w = dst.shape[:2]
    pad_widths = [anchor_y, max(max_y, dst_h) - dst_h, anchor_x, max(max_x, dst_w) - dst_w]
    dst_padded = np.pad(
        dst,
        ((pad_widths[0], pad_widths[1]), (pad_widths[2], pad_widths[3])),
        "constant",
        constant_values=-1,
    )
    dst_pad_h, dst_pad_w = dst_padded.shape

    # Create the aligned and padded source image
    source_aligned = affine_transform(
        src,
        matrix.T,
        offset=shifted_offset,
        output_shape=(dst_pad_h, dst_pad_w),
        order=3,
        mode="constant",
        cval=-1,
    )

    # Plot the images
    fig, axes = plt.subplots(1, 4, figsize=(10, 5), sharex=True, sharey=True)
    axes[0].imshow(src, cmap="viridis", vmin=-1, vmax=nblob)
    axes[0].set_title("Source")
    axes[1].imshow(dst, cmap="viridis", vmin=-1, vmax=nblob)
    axes[1].set_title("Dest")
    axes[2].imshow(source_aligned, cmap="viridis", vmin=-1, vmax=nblob)
    axes[2].set_title("Source aligned to Dest padded")
    axes[3].imshow(dst_padded, cmap="viridis", vmin=-1, vmax=nblob)
    axes[3].set_title("Dest padded")
    plt.show()

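For example:

affine_test(0, (-20, 40))

gives a four-panel figure (Source, Dest, Source aligned to Dest padded, Dest padded); zooming in shows the alignment in the padded images.

I need the full extent of the src and dst images aligned on the same pixel coordinates, with both rotation and translation.

Any help is greatly appreciated!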

Answer 1

If you have two images that are similar (or the same) and you want to align them, you can do it using both the rotate and shift functions:

from scipy.ndimage import rotate, shift

You first need to find the difference in angle between the two images, angle_to_rotate, and then apply that rotation to src:

angle_to_rotate = 25
rotated_src = rotate(src, angle_to_rotate , reshape=True, order=1, mode="constant")

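Using reshape=True avoids losing information from the original src matrix: the result is padded so that the whole rotated image fits, which moves the image content relative to the (0, 0) index. You can calculate this translation as (x*cos(angle), y*sin(angle)), where x and y are the dimensions of the image, but it probably won't matter.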
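To make the effect of reshape concrete, here is a small sketch comparing the two settings on a toy array (the 40 x 20 array of ones and the 25 degree angle are just illustrative choices):

```python
import numpy as np
from scipy.ndimage import rotate

src = np.ones((40, 20))
angle_to_rotate = 25

# reshape=False keeps the input shape, so the rotated corners are cut off.
kept = rotate(src, angle_to_rotate, reshape=False, order=1, mode="constant")

# reshape=True grows the output so the whole rotated image fits.
grown = rotate(src, angle_to_rotate, reshape=True, order=1, mode="constant")

print(kept.shape, grown.shape)
```

With reshape=True the output is larger than the input along both axes, which is why the image content no longer sits at the same indices as before.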

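Now you need to translate the rotated image so that it lines up with the destination. For that you can use the shift function:

rot_translated_src = shift(rotated_src, [distance_y, distance_x])  # one shift per axis: rows (y) first, then columns (x)

In this case there is no reshape (otherwise you wouldn't have any real translation), so if the image was not padded beforehand some information will be lost.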

But you can do some padding with

np.pad(src, number, mode='constant')
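As a quick illustration of why the padding matters, the sketch below shifts a toy image with and without padding first (the array size, pad width and shift are arbitrary values chosen for the example):

```python
import numpy as np
from scipy.ndimage import shift

src = np.zeros((10, 10))
src[8, 8] = 1.0  # feature close to the bottom-right corner

# Shifting by (+3, +3) without padding pushes the feature out of the array.
lost = shift(src, [3, 3], order=0, mode="constant")

# Padding first leaves room for the same shift.
pad = 5
padded = np.pad(src, pad, mode="constant")
kept = shift(padded, [3, 3], order=0, mode="constant")

print(lost.sum(), kept.sum())
```

In the unpadded case the feature is gone after the shift, while in the padded case it survives at its new position.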

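To calculate distance_x and distance_y you need to find a point that can serve as a reference in both rotated_src and the destination, and then calculate the distance between the two along the x and y axes.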
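One possible way to get such a reference point programmatically, assuming both images contain the same isolated bright marker, is to compare the marker's centroid in each image. This is only a sketch with made-up toy images, using scipy.ndimage.center_of_mass:

```python
import numpy as np
from scipy.ndimage import center_of_mass, shift

# Hypothetical images that share one bright reference marker.
dst = np.zeros((30, 30))
dst[20:23, 15:18] = 1.0
rotated_src = np.zeros((30, 30))
rotated_src[12:15, 3:6] = 1.0

# Centroid of the marker in each image, in (row, col) order.
dst_ref = np.array(center_of_mass(dst))
src_ref = np.array(center_of_mass(rotated_src))

# Distance along y (rows) and x (columns) from the source marker to the destination marker.
distance_y, distance_x = dst_ref - src_ref
translated_src = shift(rotated_src, [distance_y, distance_x], order=0)
print(distance_y, distance_x)
```

On real images the background would have to be masked or thresholded first, otherwise the centroid is dominated by everything that is not the marker.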

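Summary

Do some padding in src and dst.
Find the angular distance between them.
Rotate src with scipy.ndimage.rotate using reshape=True.
Find the horizontal and vertical distances distance_x, distance_y between the rotated image and dst.
Translate the rotated src with scipy.ndimage.shift.

Code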

from scipy.ndimage import rotate, shift
import matplotlib.pyplot as plt
import numpy as np

First we make the destination image:

# make and plot dest
dst = np.ones([40,20])
dst = np.pad(dst,10)
dst[17,[14,24]]=4
dst[27,14:25]=4
dst[26,[14,25]]=4
rotated_dst = rotate(dst, 20, order=1)

plt.imshow(dst) # plot it
plt.imshow(rotated_dst)
plt.show()

We make the source image:

# make_src image and plot it
src = np.zeros([40,20])
src = np.pad(src,10)
src[0:20,0:20]=1
src[7,[4,14]]=4
src[17,4:15]=4
src[16,[4,15]]=4
plt.imshow(src)
plt.show()

Then we align src to the destination:

rotated_src = rotate(src, 20, order=1) # find the angle 20, reshape true is by default
plt.imshow(rotated_src)
plt.show()
distance_y = 8 # find this distances from rotated_src and dst
distance_x = 12 # use any visual reference or even the corners
translated_src = shift(rotated_src, [distance_y,distance_x])
plt.imshow(translated_src)
plt.show()

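PS: If you have trouble finding the angle and the distances programmatically, please leave a comment with a bit more detail about what can be used as a reference (for example the frame of the image, or some image features/data).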

Answer 2

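Complexity analysis

The problem is to determine three parameters: the angle and the x and y displacements.

Suppose you have a grid for the angle and for the x and y displacements, each of size O(n), and that your images are of size O(n x n). Rotating, translating and comparing the images each take O(n^2), and since you have O(n^3) candidate transforms to try, you end up with complexity O(n^5), which is probably why you are asking this question.

However, the displacement part can be computed more efficiently by finding the maximum correlation with Fourier transforms. A 2D FFT of an n x n image costs O(n^2 log n) (an O(n log n) transform along each of the n rows and n columns), so the complete correlation matrix can be computed in O(n^2 log n). Finding its maximum then takes O(n^2), so the overall time complexity of determining the best displacement is O(n^2 log n). You still need to search for the best angle, though, and with O(n) candidate angles the overall complexity of this search is O(n^3 log n). Keep in mind that we are using Python and may have some significant overhead, so this complexity only gives an idea of how hard the problem will be. I have handled problems like this before, so I start out confident.

Preparing some examples

I will start by downloading an image, applying a rotation, and centralizing the image by padding it with zeros.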


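import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage

# `im` is assumed to be an RGB image that has already been loaded
# (for example as a PIL image or a numpy array).


def centralized(a, width, height):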
    '''
    Image centralized to the given width and height
    by padding with zeros (black)
    '''
    assert width >= a.shape[0] and height >= a.shape[1]
    ap = np.zeros((width, height) + a.shape[2:], a.dtype)
    ccx = (width - a.shape[0])//2
    ccy = (height - a.shape[1])//2
    ap[ccx:ccx+a.shape[0], ccy:ccy+a.shape[1], ...] = a
    return ap


def image_pair(im, width, height, displacement=(0,0), angle=0):
    '''
    Build a pair of images as numpy arrays from the input image.
    Both images are padded with zeros (black), roughly centralized,
    and have the specified shape.

    Make sure that the chosen width and height are large enough
    to fit the rotated image.
    '''
    a = np.array(im)
    a1 = centralized(a, width, height)
    a2 = centralized(ndimage.rotate(a, angle), width, height)
    a2 = np.roll(a2, displacement, axis=(0,1))
    return a1, a2

def random_transform():
    angle = np.random.rand() * 360
    displacement = np.random.randint(-100, 100, 2)
    return displacement, angle

a1, a2 = image_pair(im, 512, 512, *random_transform())
plt.subplot(121)
plt.imshow(a1)
plt.subplot(122)
plt.imshow(a2)

Displacement search

The first thing is to compute the correlation of the images:

def compute_correlation(a1, a2):
    A1 = np.fft.rfftn(a1, axes=(0,1))
    A2 = np.fft.rfftn(a2, axes=(0,1))
    C = np.fft.irfftn(np.sum(A1 * np.conj(A2), axis=2))
    return C

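Then, let's create an example without rotation and confirm that the index of the maximum correlation gives the displacement that fits one image onto the other.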

displacement, _ = random_transform()
a1, a2 = image_pair(im, 521, 512, displacement, angle=0)
C = compute_correlation(a1, a2)
np.unravel_index(np.argmax(C), C.shape), displacement
a3 = np.roll(a2, np.unravel_index(np.argmax(C), C.shape), axis=(0,1))
assert np.all(a3 == a1)

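With rotation or interpolation this result may not be exact, but it gives the displacement that brings us as close to alignment as possible.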
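Note that the index of the correlation peak is a wrap-around index in [0, n), which is fine for np.roll but awkward to read as a displacement. A small sketch of converting it to a signed shift, on a toy single-channel image (the helper name signed_displacement is hypothetical, not part of the answer's code):

```python
import numpy as np

def signed_displacement(index, shape):
    """Map wrap-around FFT indices in [0, n) to signed shifts in [-n/2, n/2)."""
    index = np.asarray(index)
    shape = np.asarray(shape)
    return (index + shape // 2) % shape - shape // 2

# Toy single-channel example: a2 is a1 rolled by a known displacement.
rng = np.random.default_rng(0)
a1 = rng.random((64, 48))
true_shift = (5, -7)
a2 = np.roll(a1, true_shift, axis=(0, 1))

# Circular cross-correlation via the FFT (same idea as compute_correlation).
C = np.fft.irfft2(np.fft.rfft2(a1) * np.conj(np.fft.rfft2(a2)), s=a1.shape)
peak = np.unravel_index(np.argmax(C), C.shape)

# Signed shift that rolls a2 back onto a1 (the negative of true_shift).
recovered = signed_displacement(peak, C.shape)
print(recovered)
```

The recovered value is the shift to apply to the second image to bring it back onto the first, so it has the opposite sign of the displacement used to create it.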

Let's put this in a function for future use:

def get_aligned(a1, a2, angle):
    a1_rotated = ndimage.rotate(a1, angle, reshape=False)
    C = compute_correlation(a2, a1_rotated)
    found_displacement = np.unravel_index(np.argmax(C), C.shape)
    a1_aligned = np.roll(a1_rotated, found_displacement, axis=(0,1))
    return a1_aligned

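Searching for the angle

Now we can proceed in two steps:

we compute the correlation for each candidate angle, and then find the alignment using the angle that gives the maximum correlation.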

displacement, angle = random_transform()
a1, a2 = image_pair(im, 521, 512, displacement, angle)
C_max = []
C_argmax = []
angle_guesses = np.arange(0, 360, 5)
for angle_guess in angle_guesses:
    a1_rotated = ndimage.rotate(a1, angle_guess, reshape=False)
    C = compute_correlation(a1_rotated, a2)
    i = np.argmax(C)
    v = C.reshape(-1)[i]
    C_max.append(v)
    C_argmax.append(i)

Let's see how the correlation looks:

plt.plot(angle_guesses, C_max);

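Looking at this curve, we have a clear winner, even though the sunflower has some kind of rotational symmetry.

Let's apply the transformation to the original image and see how it looks: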

a1_aligned = get_aligned(a1, a2, angle_guesses[np.argmax(C_max)])
plt.subplot(121)
plt.imshow(a2)
plt.subplot(122)
plt.imshow(a1_aligned)

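Great, I couldn't have done better than this manually.

I used a sunflower image for aesthetic reasons, but the procedure is the same for any type of image. I used RGB to show that the image may have one additional dimension, i.e. each pixel carries a feature vector instead of a scalar feature. If your feature is a scalar, you can reshape your data to (width, height, 1).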
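For the scalar case, a minimal sketch of that reshape on a toy grayscale array (compute_correlation is repeated from above so the snippet runs on its own; the random test image and the roll of (3, 4) are arbitrary):

```python
import numpy as np

def compute_correlation(a1, a2):
    # Same function as earlier in this answer, repeated to keep the sketch self-contained.
    A1 = np.fft.rfftn(a1, axes=(0, 1))
    A2 = np.fft.rfftn(a2, axes=(0, 1))
    return np.fft.irfftn(np.sum(A1 * np.conj(A2), axis=2))

gray = np.random.default_rng(1).random((32, 32))  # one scalar feature per pixel
gray3 = gray[..., np.newaxis]                      # shape (32, 32, 1)
moved3 = np.roll(gray3, (3, 4), axis=(0, 1))

C = compute_correlation(gray3, moved3)
peak = np.unravel_index(np.argmax(C), C.shape)
print(gray3.shape, C.shape, peak)
```

The trailing axis of length 1 plays the role of the colour channels, so the rest of the code can stay unchanged.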

Answer 3

Working code below, in case anyone else needs this for scipy's affine transformations:

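import matplotlib.pyplot as plt
import numpy as np
from scipy.ndimage import affine_transform, rotate


def affine_test(angle=0, translate=(0, 0), shape=(200, 100), buffered_shape=(300, 200), nblob=50):
    # Maximum translation allowed is half the difference between shape and buffered_shape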

    np.random.seed(42)

    # Generate a buffered_shape-sized base image
    base = np.zeros(buffered_shape, dtype=np.float32)
    random_locs = np.random.choice(np.arange(2, buffered_shape[0] - 2), nblob * 2, replace=False)
    i = random_locs[:nblob]
    j = random_locs[nblob:]
    for k, (_i, _j) in enumerate(zip(i, j)):
        base[_i - 2 : _i + 2, _j - 2 : _j + 2] = k + 10

    # Impose a rotation and translation on source
    src = rotate(base, angle, reshape=False, order=1, mode="constant")
    bsc = (np.array(buffered_shape) / 2).astype(int)
    sc = (np.array(shape) / 2).astype(int)
    src = src[
        bsc[0] - sc[0] + translate[0] : bsc[0] + sc[0] + translate[0],
        bsc[1] - sc[1] + translate[1] : bsc[1] + sc[1] + translate[1],
    ]
    # Cut-out destination from the centre of the base image
    dst = base[bsc[0] - sc[0] : bsc[0] + sc[0], bsc[1] - sc[1] : bsc[1] + sc[1]]

    src_y, src_x = src.shape

    def get_matrix_offset(centre, angle, scale):
        """Follows OpenCV.getRotationMatrix2D"""
        angle_rad = angle * np.pi / 180
        alpha = np.round(scale * np.cos(angle_rad), 8)
        beta = np.round(scale * np.sin(angle_rad), 8)
        return (
            np.array([[alpha, beta], [-beta, alpha]]),
            np.array(
                [
                    (1 - alpha) * centre[0] - beta * centre[1],
                    beta * centre[0] + (1 - alpha) * centre[1],
                ]
            ),
        )

    # Rotation centre: the centre of src, shifted back by the imposed translation
    centre = np.array([(src_y - 1) / 2 - translate[0], (src_x - 1) / 2 - translate[1]])
    matrix, offset = get_matrix_offset(centre, angle, 1)

    offset += np.array(translate)

    M = np.column_stack((matrix, offset))
    M = np.vstack((M, [0, 0, 1]))
    iM = np.linalg.inv(M)
    imatrix = iM[:2, :2]
    ioffset = iM[:2, 2]

    # Determine the outer bounds of the new image
    lin_pts = np.array([[0, src_y-1, src_y-1, 0], [0, 0, src_x-1, src_x-1]])
    transf_lin_pts = np.dot(matrix, lin_pts) + offset.reshape(2, 1) # - np.array(translate).reshape(2, 1) # both?

    # Find min and max bounds of the transformed image
    min_x = np.floor(np.min(transf_lin_pts[1])).astype(int)
    min_y = np.floor(np.min(transf_lin_pts[0])).astype(int)
    max_x = np.ceil(np.max(transf_lin_pts[1])).astype(int)
    max_y = np.ceil(np.max(transf_lin_pts[0])).astype(int)

    # Add translation to the transformation matrix to shift to positive values
    anchor_x, anchor_y = 0, 0
    if min_x < 0:
        anchor_x = -min_x
    if min_y < 0:
        anchor_y = -min_y

    dot_anchor = np.dot(imatrix, [anchor_y, anchor_x])
    shifted_offset = ioffset - dot_anchor

    # Create padded destination image
    dst_y, dst_x = dst.shape[:2]
    pad_widths = [anchor_y, max(max_y, dst_y) - dst_y, anchor_x, max(max_x, dst_x) - dst_x]
    dst_padded = np.pad(
        dst,
        ((pad_widths[0], pad_widths[1]), (pad_widths[2], pad_widths[3])),
        "constant",
        constant_values=-10,
    )

    dst_pad_y, dst_pad_x = dst_padded.shape
    # Create the aligned and padded source image
    source_aligned = affine_transform(
        src,
        imatrix,
        offset=shifted_offset,
        output_shape=(dst_pad_y, dst_pad_x),
        order=3,
        mode="constant",
        cval=-10,
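    )

    # Plot the images (same layout as the plotting code in the question)
    fig, axes = plt.subplots(1, 4, figsize=(10, 5), sharex=True, sharey=True)
    axes[0].imshow(src, cmap="viridis", vmin=-10, vmax=nblob)
    axes[0].set_title("Source")
    axes[1].imshow(dst, cmap="viridis", vmin=-10, vmax=nblob)
    axes[1].set_title("Dest")
    axes[2].imshow(source_aligned, cmap="viridis", vmin=-10, vmax=nblob)
    axes[2].set_title("Source aligned to Dest padded")
    axes[3].imshow(dst_padded, cmap="viridis", vmin=-10, vmax=nblob)
    axes[3].set_title("Dest padded")
    plt.show()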

For example, running:

affine_test(angle=-25, translate=(10, -40))

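displays the result, which can be zoomed in to inspect the alignment.

Sorry that the code is not nicely written.

Note that, running this in the wild, I noticed it cannot handle any change in the scale of the images, but I am not sure that this isn't due to how I calculate the transform. So it is a caveat worth noting, and worth checking, if you are aligning images with different scales.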
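For anyone who wants the core idea without the test scaffolding, here is a condensed sketch of the same approach. It assumes you already have a forward transform from src to dst coordinates in (row, col) order; the function name align_with_padding and its parameters are hypothetical, and it has only been checked on toy cases like the one shown:

```python
import numpy as np
from scipy.ndimage import affine_transform

def align_with_padding(src, dst, fwd_matrix, fwd_offset, cval=0.0, order=1):
    """Warp src into dst's frame and pad both onto one common canvas.

    fwd_matrix (2x2) and fwd_offset (2,) are assumed to map src
    coordinates to dst coordinates in (row, col) order.
    """
    fwd_matrix = np.asarray(fwd_matrix, dtype=float)
    fwd_offset = np.asarray(fwd_offset, dtype=float)

    # Corners of src, pushed forward into dst coordinates.
    h, w = src.shape
    corners = np.array([[0, 0, h - 1, h - 1], [0, w - 1, 0, w - 1]], dtype=float)
    moved = fwd_matrix @ corners + fwd_offset[:, None]

    # Bounding box of the union of dst and the transformed src.
    lo = np.minimum(np.floor(moved.min(axis=1)), 0).astype(int)
    hi = np.maximum(np.ceil(moved.max(axis=1)), np.array(dst.shape) - 1).astype(int)
    anchor = -lo  # rows/cols of padding needed before dst's origin
    out_shape = tuple(int(v) for v in hi - lo + 1)

    # Pad dst so that its origin sits at `anchor` on the canvas.
    pad_after = np.array(out_shape) - anchor - np.array(dst.shape)
    dst_padded = np.pad(dst, list(zip(anchor, pad_after)), constant_values=cval)

    # affine_transform pulls: src_coord = inv @ (canvas_coord - anchor) + inv_offset
    inv = np.linalg.inv(fwd_matrix)
    inv_offset = -inv @ fwd_offset
    src_aligned = affine_transform(
        src, inv, offset=inv_offset - inv @ anchor,
        output_shape=out_shape, order=order, mode="constant", cval=cval,
    )
    return src_aligned, dst_padded

# Toy usage: src must move by (-4 rows, +10 cols) to land on dst's frame.
src = np.zeros((8, 8))
src[2, 3] = 1.0
dst = np.zeros((8, 8))
src_aligned, dst_padded = align_with_padding(src, dst, np.eye(2), [-4, 10], order=0)
print(src_aligned.shape, dst_padded.shape)
```

The sketch inverts the full forward matrix rather than assuming a pure rotation, so in principle it is not tied to a fixed scale, but I have not verified it on scaled images.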
