Displaying stitched images together without cutoff using warpAffine

Question

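I'm trying to stitch 2 images together by using template matching to find 3 sets of points, which I pass to cv2.getAffineTransform() to get a warp matrix, which I then pass to cv2.warpAffine() to align my images.

However, when I join my images, most of my warped (affine-transformed) image isn't shown. I've tried different techniques to select points, changed the order of the arguments, and so on, but I can only ever get a thin sliver of the warped image to show.

Could somebody tell me whether my approach is a valid one and suggest where I might be making an error? Any guesses as to what could be causing the problem would be greatly appreciated. Thanks in advance.

This is the final result that I get. Here are the original images (1, 2) and the code that I use: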

EDIT: Here is the value of the variable trans:

array([[  1.00768049e+00,  -3.76690353e-17,  -3.13824885e+00],
       [  4.84461775e-03,   1.30769231e+00,   9.61912797e+02]])

And here are the points passed to cv2.getAffineTransform: unified_pair1

array([[  671.,  1024.],
       [   15.,   979.],
       [   15.,   962.]], dtype=float32)

unified_pair2

array([[ 669.,   45.],
       [  18.,   13.],
       [  18.,    0.]], dtype=float32)

import cv2
import numpy as np


def showimage(image, name="No name given"):
    cv2.imshow(name, image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    return

image_a = cv2.imread('image_a.png')
image_b = cv2.imread('image_b.png')


def get_roi(image):
    roi = cv2.selectROI(image) # spacebar to confirm selection
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    crop = image[int(roi[1]):int(roi[1]+roi[3]), int(roi[0]):int(roi[0]+roi[2])]
    return crop
temp_1 = get_roi(image_a)
temp_2 = get_roi(image_a)
temp_3 = get_roi(image_a)

def find_template(template, search_image_a, search_image_b):
    ccnorm_im_a = cv2.matchTemplate(search_image_a, template, cv2.TM_CCORR_NORMED)
    template_loc_a = np.where(ccnorm_im_a == ccnorm_im_a.max())

    ccnorm_im_b = cv2.matchTemplate(search_image_b, template, cv2.TM_CCORR_NORMED)
    template_loc_b = np.where(ccnorm_im_b == ccnorm_im_b.max())
    return template_loc_a, template_loc_b


coord_a1, coord_b1 = find_template(temp_1, image_a, image_b)
coord_a2, coord_b2 = find_template(temp_2, image_a, image_b)
coord_a3, coord_b3 = find_template(temp_3, image_a, image_b)

def unnest_list(coords_list):
    coords_list = [a[0] for a in coords_list]
    return coords_list

coord_a1 = unnest_list(coord_a1)
coord_b1 = unnest_list(coord_b1)
coord_a2 = unnest_list(coord_a2)
coord_b2 = unnest_list(coord_b2)
coord_a3 = unnest_list(coord_a3)
coord_b3 = unnest_list(coord_b3)

def unify_coords(coords1,coords2,coords3):
    unified = []
    unified.extend([coords1, coords2, coords3])
    return unified

# Create 2 lists, each containing 3 pairs of coordinates
unified_pair1 = unify_coords(coord_a1, coord_a2, coord_a3)
unified_pair2 = unify_coords(coord_b1, coord_b2, coord_b3)

# Convert elements of lists to numpy arrays with data type float32
unified_pair1 = np.asarray(unified_pair1, dtype=np.float32)
unified_pair2 = np.asarray(unified_pair2, dtype=np.float32)

# Get result of the affine transformation
trans = cv2.getAffineTransform(unified_pair1, unified_pair2)

# Apply the affine transformation to original image
result = cv2.warpAffine(image_a, trans, (image_a.shape[1] + image_b.shape[1], image_a.shape[0]))
result[0:image_b.shape[0], image_b.shape[1]:] = image_b

showimage(result)
cv2.imwrite('result.png', result)

Sources: approach based on advice received from the docs, this tutorial and this example.

Answer 1

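July 12 Edit:

This post inspired GitHub repos providing functions to accomplish this task: one for a padded warpAffine() and another for a padded warpPerspective(). See the Python version or the C++ version.

Transformations shift the location of pixels

What any transformation does is take your point coordinates (x, y) and map them to new locations (x', y'):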

s*x'    h1 h2 h3     x
s*y' =  h4 h5 h6  *  y
s       h7 h8  1     1

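where s is some scaling factor. You must divide the new coordinates by the scale factor to get back the proper pixel locations (x', y'). Technically, this is only true of homographies, i.e. (3, 3) transformation matrices. You don't need to scale for affine transformations (you don't even need to use homogeneous coordinates, but it's better to keep this discussion general).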
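To make the role of s concrete, here is a minimal sketch using a made-up homography. The matrix values are purely illustrative and are not taken from the question; the only point is that its bottom row is not [0, 0, 1], so s differs from 1 and the division matters.

```python
import numpy as np

# Hypothetical homography, chosen only for illustration: the bottom row
# is not [0, 0, 1], so the scale factor s is not 1.
H = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.5, 1.0]])

pt = np.array([2.0, 2.0, 1.0])   # the point (2, 2) in homogeneous coordinates
mapped = H.dot(pt)               # [s*x', s*y', s]
s = mapped[2]
x_new, y_new = mapped[:2] / s    # divide by s to get the actual pixel location

print(mapped)        # [7. 2. 2.]
print(x_new, y_new)  # 3.5 1.0
```

For an affine matrix the last row is [0, 0, 1], so s is always 1 and the division changes nothing.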

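Then the actual pixel values are moved to those new locations, and the color values are interpolated to fit the new pixel grid. So during this process, these new locations are computed at some point. We'll need those locations to see where the pixels actually move to, relative to the other image. Let's start with an easy example and see where points are mapped.

Suppose your transformation matrix simply shifts pixels to the left by ten pixels. Translation is handled by the last column; the first row is the translation in x and the second row is the translation in y. So we would have an identity matrix, but with -10 in the first row, third column. Where would the pixel (0,0) be mapped? Hopefully to (-10,0), if logic makes any sense. And in fact, it does: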

transf = np.array([[1.,0.,-10.],[0.,1.,0.],[0.,0.,1.]])
homg_pt = np.array([0,0,1])
new_homg_pt = transf.dot(homg_pt)
new_homg_pt /= new_homg_pt[2]
# new_homg_pt = [-10.  0.  1.]

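Perfect! So we can figure out where all points map with a little linear algebra. We need to take all the (x,y) points and put them in one big array so that every point is in its own column. Let's pretend our image is only 4x4.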

h, w = src.shape[:2] # 4, 4
indY, indX = np.indices((h,w))  # similar to meshgrid/mgrid
lin_homg_pts = np.stack((indX.ravel(), indY.ravel(), np.ones(indY.size)))

lin_homg_pts now holds every pixel location as a homogeneous point:

[[ 0.  1.  2.  3.  0.  1.  2.  3.  0.  1.  2.  3.  0.  1.  2.  3.]
 [ 0.  0.  0.  0.  1.  1.  1.  1.  2.  2.  2.  2.  3.  3.  3.  3.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]]

Then we can do a matrix multiplication to get the mapped value of every point. For simplicity, let's stick with the previous homography.

trans_lin_homg_pts = transf.dot(lin_homg_pts)
trans_lin_homg_pts /= trans_lin_homg_pts[2,:]

And now we have the transformed points:

[[-10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7.]
 [  0.  0.  0.  0.   1.  1.  1.  1.   2.  2.  2.  2.   3.  3.  3.  3.]
 [  1.  1.  1.  1.   1.  1.  1.  1.   1.  1.  1.  1.   1.  1.  1.  1.]]

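As we can see, everything is working as expected: we have shifted only the x-values, by -10.

Pixels can be shifted outside your image bounds

Notice that these pixel locations are negative: they're outside the image bounds. If we do something a little more complex and rotate the image by 45 degrees, we'll get some pixel locations well outside our original bounds. We don't care about every pixel location, though. We just need to know how far outside the original image bounds the farthest pixels land, so that we can pad the original image that far out before displaying the warped image on it.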

theta = 45*np.pi/180
transf = np.array([
    [ np.cos(theta),np.sin(theta),0],
    [-np.sin(theta),np.cos(theta),0],
    [0.,0.,1.]])
print(transf)
trans_lin_homg_pts = transf.dot(lin_homg_pts)
minX = np.min(trans_lin_homg_pts[0,:])
minY = np.min(trans_lin_homg_pts[1,:])
maxX = np.max(trans_lin_homg_pts[0,:])
maxY = np.max(trans_lin_homg_pts[1,:])
# minX: 0.0, minY: -2.12132034356, maxX: 4.24264068712, maxY: 2.12132034356,

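So we see that we can get pixel locations well outside our original image, in both the positive and negative directions. The minimum x value doesn't change because when a homography applies a rotation, it rotates about the top-left corner. One thing to note here is that I've applied the transformation to all pixels in the image. That is really unnecessary; you can simply warp the four corner points and see where they land.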
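The corner-only shortcut can be sketched as follows. The helper name warped_bounds is hypothetical, and the corners are taken at the outermost pixel indices (0 and w - 1, 0 and h - 1) so the result can be compared directly with the full-grid numbers above; using w and h instead is an equally reasonable choice.

```python
import numpy as np

def warped_bounds(h, w, transf):
    """Return (minX, minY, maxX, maxY) of the four corner pixels after a 3x3 transform."""
    # Corner pixel locations of an h x w image, one homogeneous point per column.
    corners = np.array([[0, w - 1, w - 1, 0],
                        [0, 0, h - 1, h - 1],
                        [1, 1, 1, 1]], dtype=float)
    warped = transf.dot(corners)
    warped /= warped[2, :]          # no-op for affine transforms, needed for homographies
    return warped[0].min(), warped[1].min(), warped[0].max(), warped[1].max()

theta = 45 * np.pi / 180
rot = np.array([[np.cos(theta), np.sin(theta), 0],
                [-np.sin(theta), np.cos(theta), 0],
                [0., 0., 1.]])

bounds = warped_bounds(4, 4, rot)
print(bounds)  # same extremes as the full 4x4 grid gave for this rotation
```

For this rotation the four corners give the same minX, minY, maxX and maxY as transforming all sixteen points.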

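Padding the destination image

Note that when you call cv2.warpAffine() you have to pass in the destination size. The transformed pixel locations are interpreted relative to that size. So if a pixel gets mapped to (-10,0), it won't show up in the destination image. That means we have to build another homography with a translation that shifts all of the pixel locations to be positive, and then pad the image matrix to compensate for that shift. We also have to pad the original image on the bottom and the right if the homography moves points to positions beyond the image size.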
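As a rough illustration of why pixels disappear, the sketch below counts how many transformed points of the 4x4 toy image still land inside a canvas of a given size. The helper count_visible is hypothetical and only mimics the bounds check; it does not call OpenCV or reproduce its interpolation.

```python
import numpy as np

h, w = 4, 4
ys, xs = np.indices((h, w))
pts = np.stack((xs.ravel(), ys.ravel(), np.ones(xs.size)))  # 3 x 16 homogeneous points

def count_visible(transf, dst_w, dst_h):
    """Count transformed points that land inside a dst_w x dst_h canvas."""
    warped = transf.dot(pts)
    warped /= warped[2, :]
    inside = ((warped[0] >= 0) & (warped[0] <= dst_w - 1) &
              (warped[1] >= 0) & (warped[1] <= dst_h - 1))
    return int(inside.sum())

identity = np.eye(3)
shift_left_10 = np.array([[1., 0., -10.], [0., 1., 0.], [0., 0., 1.]])
shift_left_2 = np.array([[1., 0., -2.], [0., 1., 0.], [0., 0., 1.]])

print(count_visible(identity, 4, 4))       # 16
print(count_visible(shift_left_10, 4, 4))  # 0
print(count_visible(shift_left_2, 4, 4))   # 8
```

With the ten-pixel shift, none of the points fall inside the 4x4 canvas, which is the same effect as the thin sliver seen in the question.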

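In the rotation example, the minimum x value is unchanged, so we need no horizontal shift. However, the minimum y value has dropped by about two pixels, so we need to shift the image down by two pixels. First, let's create the padded destination image.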

pad_sz = list(src.shape) # in case three channel
pad_sz[0] = np.round(np.maximum(pad_sz[0], maxY) - np.minimum(0, minY)).astype(int)
pad_sz[1] = np.round(np.maximum(pad_sz[1], maxX) - np.minimum(0, minX)).astype(int)
dst_pad = np.zeros(pad_sz, dtype=np.uint8)
# pad_sz = [6, 4, 3]

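As we can see, the height has grown by two pixels from the original to account for that shift.

Add translation to the transformation to shift all pixel locations to positive

Now we need to create a new homography matrix that translates the warped image by the same amount we shifted by. To apply both transformations (the original one and this new shift) we have to compose the two homographies. For an affine transformation you can simply add the translation, but not for a homography. Additionally, we need to divide by the last entry to make sure the scale is still correct (again, only needed for homographies):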
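The difference between composing and simply adding the translation can be checked numerically. The matrices below are made up for illustration; the translation is applied after the warp, i.e. as shift.dot(transf).

```python
import numpy as np

# Translation by (+3, +2), applied after the warp.
shift = np.array([[1., 0., 3.],
                  [0., 1., 2.],
                  [0., 0., 1.]])

# Illustrative affine transform: the bottom row is [0, 0, 1].
affine = np.array([[0.5, 0.2, -3.],
                   [0.1, 1.5, -2.],
                   [0., 0., 1.]])
affine_added = affine.copy()
affine_added[:2, 2] += [3., 2.]          # just add the translation to the last column
affine_same = np.allclose(shift.dot(affine), affine_added)

# Illustrative homography: same matrix, but with a perspective term in the bottom row.
homog = affine.copy()
homog[2, :2] = [0.01, 0.02]
homog_added = homog.copy()
homog_added[:2, 2] += [3., 2.]
homog_same = np.allclose(shift.dot(homog), homog_added)

print(affine_same, homog_same)  # True False
```

For the homography, the composed matrix also changes the first two columns, which adding to the last column cannot reproduce.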

anchorX, anchorY = 0, 0
transl_transf = np.eye(3,3)
if minX < 0:
    anchorX = np.round(-minX).astype(int)
    transl_transf[0,2] += anchorX  # shift right so the smallest x becomes 0
if minY < 0:
    anchorY = np.round(-minY).astype(int)
    transl_transf[1,2] += anchorY  # shift down so the smallest y becomes 0
new_transf = transl_transf.dot(transf)  # apply the warp first, then the shift
new_transf /= new_transf[2,2]

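I also created the anchor point here, which marks where the destination image goes inside the padded matrix; it is shifted by the same amount the homography shifts the warped image. So let's place the destination image inside the padded matrix: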

dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst

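Warp with the new transformation into the padded image

All we have left to do is apply the new transformation to the source image (using the padded destination size), and then we can overlay the two images.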

warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0]))

alpha = 0.3
beta = 1 - alpha
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)

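Putting it all together

Let's create a function for this, since we ended up creating quite a few variables that we don't need at the end. As inputs we need the source image, the destination image, and the original homography. As outputs we simply want the padded destination image and the warped image. Note that the examples used a 3x3 homography, so we had better make sure we pass in 3x3 transforms rather than 2x3 affine or Euclidean warps. You can just add the row [0,0,1] to the bottom of any affine warp and you'll be fine.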
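As a small sketch of that last step, the 2x3 matrix trans printed in the question can be extended to 3x3 with np.vstack. The name trans_3x3 is introduced here only for illustration.

```python
import numpy as np

# The 2x3 affine matrix `trans` as printed in the question.
trans = np.array([[1.00768049e+00, -3.76690353e-17, -3.13824885e+00],
                  [4.84461775e-03,  1.30769231e+00,  9.61912797e+02]])

# Append the row [0, 0, 1] to get an equivalent 3x3 transform.
trans_3x3 = np.vstack([trans, [0., 0., 1.]])
print(trans_3x3.shape)  # (3, 3)

# Both forms map a homogeneous point to the same (x', y').
pt = np.array([18., 0., 1.])   # last point listed in unified_pair2
print(trans.dot(pt))           # lands near (15, 962), the last point in unified_pair1
print(trans_3x3.dot(pt))       # same x', y', plus a trailing 1
```

The 3x3 form can then be passed to a function that expects a homography, while the 2x3 form remains what cv2.warpAffine() takes.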

def warpPerspectivePadded(src, dst, transf):

    src_h, src_w = src.shape[:2]
    lin_homg_pts = np.array([[0, src_w, src_w, 0], [0, 0, src_h, src_h], [1, 1, 1, 1]])

    trans_lin_homg_pts = transf.dot(lin_homg_pts)
    trans_lin_homg_pts /= trans_lin_homg_pts[2,:]

    minX = np.min(trans_lin_homg_pts[0,:])
    minY = np.min(trans_lin_homg_pts[1,:])
    maxX = np.max(trans_lin_homg_pts[0,:])
    maxY = np.max(trans_lin_homg_pts[1,:])

    # calculate the needed padding and create a blank image to place dst within
    dst_sz = list(dst.shape)
    pad_sz = dst_sz.copy() # to get the same number of channels
    pad_sz[0] = np.round(np.maximum(dst_sz[0], maxY) - np.minimum(0, minY)).astype(int)
    pad_sz[1] = np.round(np.maximum(dst_sz[1], maxX) - np.minimum(0, minX)).astype(int)
    dst_pad = np.zeros(pad_sz, dtype=np.uint8)

    # add translation to the transformation matrix to shift to positive values
    anchorX, anchorY = 0, 0
    transl_transf = np.eye(3,3)
    if minX < 0: 
        anchorX = np.round(-minX).astype(int)
        transl_transf[0,2] += anchorX
    if minY < 0:
        anchorY = np.round(-minY).astype(int)
        transl_transf[1,2] += anchorY
    new_transf = transl_transf.dot(transf)
    new_transf /= new_transf[2,2]

    dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst

    warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0]))

    return dst_pad, warped

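Example of running the function

Finally, we can call this function with some real images and homographies and see how it pans out. I'll borrow the example from LearnOpenCV: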

src = cv2.imread('book2.jpg')
pts_src = np.array([[141, 131], [480, 159], [493, 630],[64, 601]], dtype=np.float32)
dst = cv2.imread('book1.jpg')
pts_dst = np.array([[318, 256],[534, 372],[316, 670],[73, 473]], dtype=np.float32)

transf = cv2.getPerspectiveTransform(pts_src, pts_dst)

dst_pad, warped = warpPerspectivePadded(src, dst, transf)

alpha = 0.5
beta = 1 - alpha
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)
cv2.imshow("Blended Warped Image", blended)
cv2.waitKey(0)

We end up with this padded warped image:

[image: padded and warped]

as opposed to the typical cut-off warp you would normally get.
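Since the question uses cv2.warpAffine() with a 2x3 matrix, the same bookkeeping can be sketched for the affine case. The helper padded_affine below is hypothetical (it is not the function from the GitHub repos mentioned above) and uses NumPy only: it computes the shifted 2x3 matrix, the padded canvas size, and the anchor for the destination image.

```python
import numpy as np

def padded_affine(src_shape, dst_shape, affine):
    """Hypothetical helper: shift a 2x3 affine so the warped source fits a padded canvas.

    Returns (shifted 2x3 matrix, (pad_w, pad_h), (anchor_x, anchor_y)).
    """
    src_h, src_w = src_shape[:2]
    dst_h, dst_w = dst_shape[:2]
    # Four corners of the source image, one homogeneous point per column.
    corners = np.array([[0, src_w, src_w, 0],
                        [0, 0, src_h, src_h],
                        [1, 1, 1, 1]], dtype=float)
    warped = np.asarray(affine, dtype=float).dot(corners)  # 2 x 4, no division needed
    min_x, min_y = warped.min(axis=1)
    max_x, max_y = warped.max(axis=1)

    # How far the destination image must move so nothing has a negative location.
    anchor_x = int(round(-min_x)) if min_x < 0 else 0
    anchor_y = int(round(-min_y)) if min_y < 0 else 0
    pad_w = int(round(max(dst_w, max_x) - min(0, min_x)))
    pad_h = int(round(max(dst_h, max_y) - min(0, min_y)))

    shifted = np.array(affine, dtype=float)
    shifted[:, 2] += [anchor_x, anchor_y]  # for affine warps, adding the translation suffices
    return shifted, (pad_w, pad_h), (anchor_x, anchor_y)

# The shift-left-by-ten example, written as a 2x3 affine on a 4x4 image.
shifted, size, anchor = padded_affine((4, 4), (4, 4), [[1, 0, -10], [0, 1, 0]])
print(shifted)        # translation column is now 0 in both rows
print(size, anchor)   # (14, 4) (10, 0)
```

The results would then be used as cv2.warpAffine(src, shifted, size), with the destination image copied into a zero-filled canvas of that size starting at the anchor, mirroring what warpPerspectivePadded does for homographies.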


opencv

image-stitching
