Get boundaries of overlapping rectangles

Question

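I want to get the boundaries of overlapping rectangles. The source image looks like this:

The goal is to get the boundaries of rectangles 1-4:

This is the code I have so far: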

import cv2
import numpy as np

img = cv2.imread("template.png")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
flt = cv2.inRange(hsv,(58,55,218),(59,255,247))
flt = cv2.threshold(flt, 254, 255, cv2.THRESH_BINARY)[1]
flt = cv2.GaussianBlur(flt, (5, 1), 100)

contours, hierarchy = cv2.findContours(flt, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
hierarchy = hierarchy[0]

canvas = img.copy()
for idx, cntr in enumerate(contours):
    if hierarchy[idx][3] != -1:
        arclen = cv2.arcLength(cntr, True)
        approx = cv2.approxPolyDP(cntr, arclen*0.01, True)
        cv2.drawContours(canvas, [approx], -1, (0,0,255), 1, cv2.LINE_AA)

cv2.imshow('template', canvas)
cv2.waitKey(0)
cv2.destroyAllWindows()

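I get the following result, which is close but not quite what I need:

I assume some kind of comparison between the rectangles is needed, but I'm stuck here. Any help is appreciated.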

Answer 1

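This is a long post, so please bear with me while I try to explain my ideas coherently. Here's my take on the problem. The idea is to reconstruct each rectangle from its corners. We first locate the corners in the image with a Hit-or-Miss operation. Once we have the corners, we store their coordinates in a list. Then we apply a detailed algorithm that reconstructs N complete rectangles from the points in this list. The steps are:

Threshold the image with Otsu's method to get a binary image
Apply a series of Hit-or-Miss kernels to the binary image to get the four kinds of corners
Convert the corner image into a list of corner points
Apply the detailed algorithm for rectangle reconstruction

Don't worry about the last point. I explain my unoptimized, unvectorized algorithm in the final part. For now, let's look at the first part of the whole thing: the list of corners. I'm testing it with this first image. Here's the code: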

# Imports:
import numpy as np
import cv2

# Set the image path
path = "D://opencvImages//"
fileName = "rectangles1.png"

# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)

# Prepare a deep copy for results:
inputImageCopy = inputImage.copy()

# Convert BGR to Grayscale
grayImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Threshold via Otsu:
_, binaryImage = cv2.threshold(grayImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Get image dimensions
(height, width) = binaryImage.shape

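The first part is straightforward. We need a binary image to work on. Applying Otsu's threshold gives us the following binary image:

Now, let's apply the Hit-or-Miss operation. This morphological operation identifies groups of pixels that match a given pattern. The pattern is given by a kernel, a 3 x 3 matrix holding the target binary pattern. We have four kinds of corners, so I'll apply four different kernels. Check the docs for more information on the operation and on how to specify the target pattern. Here's the code for this section: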

# Prepare a list with all possible corner kernels:
kernelList = []

# First corner:
kernel = np.array((
        [-1, -1, 0],
        [-1, 1, 0],
        [0, 0, 0]), dtype="int")
# Store it into the list:
kernelList.append(kernel)

# Second corner:
kernel = np.array((
        [-1, -1, 0],
        [1, -1, 0],
        [0, 0, 0]), dtype="int")
# Store it into the list:
kernelList.append(kernel)

# Third corner:
kernel = np.array((
        [-1, 1, 0],
        [-1, -1, 0],
        [0, 0, 0]), dtype="int")
# Store it into the list:
kernelList.append(kernel)

# Fourth corner:
kernel = np.array((
        [1, -1, 0],
        [-1, -1, 0],
        [0, 0, 0]), dtype="int")
# Store it into the list:
kernelList.append(kernel)
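To make the kernel semantics concrete, here is a minimal NumPy-only sketch of what a hit-or-miss match does on a toy image. It is an illustration, not OpenCV's implementation: the helper `hit_or_miss`, the toy image and the zero-padded border are all assumptions made for this example. In each kernel, 1 means "must be foreground", -1 means "must be background" and 0 means "don't care".

```python
import numpy as np

def hit_or_miss(binary, kernel):
    # Illustrative NumPy-only hit-or-miss (hypothetical helper, not cv2's code).
    # Assumption: pixels outside the image count as background (zero padding).
    h, w = binary.shape
    fg = np.pad(binary > 0, 1, mode="constant", constant_values=False)
    out = np.ones((h, w), dtype=bool)
    for dy in range(3):
        for dx in range(3):
            # Neighbor at offset (dy - 1, dx - 1), taken for every pixel at once
            window = fg[dy:dy + h, dx:dx + w]
            if kernel[dy, dx] == 1:
                out &= window       # this neighbor must be foreground
            elif kernel[dy, dx] == -1:
                out &= ~window      # this neighbor must be background
    return out

# Toy 6 x 7 image with a one-pixel rectangle outline (rows 1-4, columns 1-5)
toy = np.zeros((6, 7), dtype=np.uint8)
toy[1, 1:6] = 255
toy[4, 1:6] = 255
toy[1:5, 1] = 255
toy[1:5, 5] = 255

# The same four corner kernels as in the answer, written compactly
kernels = [
    np.array([[-1, -1, 0], [-1, 1, 0], [0, 0, 0]]),
    np.array([[-1, -1, 0], [1, -1, 0], [0, 0, 0]]),
    np.array([[-1, 1, 0], [-1, -1, 0], [0, 0, 0]]),
    np.array([[1, -1, 0], [-1, -1, 0], [0, 0, 0]]),
]

# OR the four results together into one corner mask
corners = np.zeros(toy.shape, dtype=bool)
for k in kernels:
    corners |= hit_or_miss(toy, k)

# Corner locations as (x, y) pairs, in raster order
toy_points = [(int(x), int(y)) for y, x in np.argwhere(corners)]
print(toy_points)  # [(1, 1), (6, 1), (1, 5), (6, 5)]
```

In this toy case the kernels for the right and bottom corners fire one pixel past the outline (x = 6, y = 5), so the four detected points still share their x and y coordinates pairwise.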

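OK. Now we run a loop that takes each kernel and applies it to the binary image. The idea is to get a composite image with all the corners found, so we build a mask and accumulate into it: each loop iteration finds one set of possible corners, and we have to gather all of them in the final mask. This can be done by simply OR-ing the mask with each result as the loop produces it: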

# Prepare the image that will hold the corners of the image:
cornerMask = np.zeros((height, width, 1), np.uint8)

# Apply all the kernels to the image:
for i in range(len(kernelList)):
    # Get current kernel:
    currentKernel = kernelList[i]
    # Apply it to the binary image:
    hitMiss = cv2.morphologyEx(binaryImage, cv2.MORPH_HITMISS, currentKernel)
    # Accumulate Mask
    cornerMask = cv2.bitwise_or(hitMiss, cornerMask)

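We end up with this corner image, where every white pixel marks a location where a kernel matched:

The operation also produced a white line along the left side of the image. No worries, let's get rid of this artifact by flood-filling with black at the top-left corner (x=0, y=0):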

# Flood fill at the left top of the image:
cv2.floodFill(cornerMask, mask=None, seedPoint=(int(0),int(0)), newVal=(0))

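This is the result:

OK, let's turn this image into actual coordinates. There are many ways to do this. I come from C++ and my NumPy knowledge is still green, so I used the most naive solution: a loop that searches for white pixels and stores their locations in a list. If you know how to vectorize this operation, please share it so I can learn!

My dirty method has one advantage, though: it raster-scans the image from left to right and top to bottom, so I get the corners in a very predictable, known order: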

# Prepare the list that will hold the corner points:
pointsList = []

# Loop through the image looking for white pixels
# and store their location in the points list:
for j in range(height):
    for i in range(width):
        # Get current pixel:
        currentPixel = cornerMask[j,i]
        # if white, store the coordinates:
        if currentPixel == 255:
            pointsList.append(list((i,j,0)))
                
# Print list:
print(pointsList)

For the first test image, this is what the list prints:

[[56, 46, 0], [207, 46, 0], [179, 123, 0], [312, 123, 0], [56, 362, 0], [207, 362, 0], [179, 427, 0], [312, 427, 0]]
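Since the answer asks for a vectorized version of the pixel search, here is one possible sketch using `np.argwhere`. The function name `corner_points` is made up for this example. It assumes the mask is a single-channel `uint8` image, shaped either (height, width) or (height, width, 1), with corners marked as 255.

```python
import numpy as np

def corner_points(corner_mask):
    # Hypothetical helper: collect white pixels as [x, y, processedFlag] entries.
    # Drop a trailing channel axis, if any, so the mask is plain 2D.
    mask2d = corner_mask.reshape(corner_mask.shape[0], corner_mask.shape[1])
    # np.argwhere returns (row, col) pairs in row-major order, which is the
    # same top-to-bottom, left-to-right raster order as the nested loops.
    rows_cols = np.argwhere(mask2d == 255)
    return [[int(x), int(y), 0] for y, x in rows_cols]

# Tiny synthetic mask to show the ordering
demo_mask = np.zeros((5, 6, 1), dtype=np.uint8)
demo_mask[1, 2] = 255
demo_mask[1, 4] = 255
demo_mask[3, 2] = 255
print(corner_points(demo_mask))  # [[2, 1, 0], [4, 1, 0], [2, 3, 0]]
```

The third element of each entry mirrors the 0 "processed" flag used in the printed list, so the result could stand in for `pointsList` in the rest of the answer.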

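OK, here comes the "core" of the algorithm. Before explaining it, though, I want to point out a few things:

It is not optimized: I wanted to make sure it works before optimizing it
The algorithm could be refactored into a recursive function
I've been working with Python for 6 months; as I mentioned, I come from C++, so there are surely more Pythonic ways to do this

That said, feel free to optimize this algorithm, vectorize it and share it with us. Anyway, the idea is that once you have the points that identify the corners in the image, we try to reconstruct each rectangle from this information. Each rectangle is made up of four (x, y) pairs: its corner points.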

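Let's analyze the corner points we have so far. The following table shows the list of points built during the raster scan that converted corner pixels into points:

The four corners that make up the first rectangle are the blue rows in the points column: these are points {1,2,5,6}. Let's assume a few things:

Because we scan the image from left to right and top to bottom, the points are ordered from smallest y to largest y and, within each row, from smallest x to largest x
The reconstruction seems to follow a known order: top to bottom. That is, in the list we first meet the rectangle's top-left point, then its top-right point, then its bottom-left point and finally its bottom-right point. We can exploit this to reconstruct the rectangle
We assume we are dealing with non-rotated, right-angled rectangles. This means their sides are axis-aligned and their interior angles are always 90 degrees.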
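The ordering assumption can also be enforced explicitly, which may help if the points come from a source that does not guarantee raster order. A minimal sketch, where `raster_sort` is a hypothetical helper name:

```python
def raster_sort(points):
    # Sort by y first (top to bottom), then by x (left to right),
    # which reproduces the order of a raster scan.
    return sorted(points, key=lambda p: (p[1], p[0]))

# The corner list from the first test, deliberately shuffled
shuffled = [[312, 427, 0], [56, 46, 0], [179, 123, 0], [207, 362, 0],
            [207, 46, 0], [56, 362, 0], [312, 123, 0], [179, 427, 0]]
ordered = raster_sort(shuffled)
print(ordered)
```

With the points from the first test this gives back the same order as the printed list above.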

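With this in mind, and looking at the corner table, we notice that each point shares at least one coordinate with another point. For example, P1 shares its y coordinate with P2. P5 shares its x coordinate with P1. P5 also shares its y coordinate with P6. P6 is made up of the x coordinate of P2 and the y coordinate of P5. This almost gives us a recipe for reconstructing a rectangle from the table.

Reconstructing the first rectangle

Suppose we traverse the list. We fetch the first point, P1. Great, that gives us the first corner, an (x, y) pair. Let's look for P2. We know it must lie on the same horizontal line: the table shows that P2 shares its y coordinate with P1. So let's look for a point with the same y coordinate. If we traverse the table from top to bottom, we fetch P2 in the next iteration of the algorithm. So far, we have P1 and P2.

Let's look for the next point. It lies directly below P1, so it must share P1's x coordinate. Let's look for the next point with exactly the same x as P1 (remember: top to bottom). That gives us P5 as the next point. Nice, only one left. The last point shares its y coordinate with P5, which leads us to P6. Four corners, one new rectangle!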

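Check out the updated points table:

The shared coordinates in the table are linked by color: the yellow and gray cells are x coordinates shared between different points. The green and orange-ish cells are shared y coordinates.

OK, so in a single iteration we traverse the list several times, each time extracting one corner to reconstruct a rectangle. That means we traverse the list 4 x (number of rectangles in the image) times. That's a lot. However, once a point has been processed there is no need to process it again. We can also avoid duplicates by marking the points that we already know belong to a rectangle. For this I use an extra flag that keeps track of the processed corners. The processed column holds this flag: 1 if the corner has been assigned to a rectangle, 0 if the point has not been processed yet.

With all this in mind, let's look at the unoptimized code (note that the variable names don't match the table I posted above):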

# Store the final rectangles here:
rectangleList = []

# Get the total points in the points list:
totalPoints = len(pointsList)

# Traverse the points list:
for a in range(totalPoints):

    # tempRect stores all four corners
    # of a rectangle:
    tempRect = [None]*4

    # Get the current first point - P1:
    p1 = pointsList[a]

    # Get x, y and isProcessed flag from P1:
    p1x = p1[0]
    p1y = p1[1]
    isProcessed = p1[2]

    # Just process it if it hasn't been processed before:
    if isProcessed == 0:

        # Mark processed in list:
        pointsList[a][2] = 1
        # First point goes into temporal struct:
        tempRect[0] = (p1x, p1y)

        # Let's look for the following point:
        for b in range(totalPoints):

            # Get x, y and isProcessed flag from P2:
            p2 = pointsList[b]
            p2y = p2[1]
            isProcessed = p2[2]

            # Just process it if it hasn't been processed before,
            # We are looking for the point that shares its y:
            if isProcessed == 0 and p2y == p1y:

                # Get P2x:
                p2x = p2[0]

                # Mark processed:
                pointsList[b][2] = 1
                # Second point goes into temporal struct:
                tempRect[1] = (p2x, p2y)

                # Let's look for the following point:
                for c in range(totalPoints):

                    # Get x, y and isProcessed flag from P3:
                    p3 = pointsList[c]
                    p3x = p3[0]
                    isProcessed = p3[2]

                    # Just process it if it hasn't been processed before,
                    # We are looking for the point that shares its x with P1:
                    if isProcessed == 0 and p1x == p3x:

                        # Get P3y:
                        p3y = p3[1]

                        # Mark processed:
                        pointsList[c][2] = 1
                        # Third point goes into temporal struct:
                        tempRect[2] = (p3x, p3y)

                        # Let's look for the following point:
                        for d in range(totalPoints):

                            # Get x, y and isProcessed flag from P4:
                            p4 = pointsList[d]
                            p4y = p4[1]
                            isProcessed = p4[2]

                            # Just process it if it hasn't been processed before,
                            # We are looking for the point that shares its y with P3:
                            if isProcessed == 0 and p3y == p4y:

                                # Get P4x:
                                p4x = p4[0]

                                # Mark processed:
                                pointsList[d][2] = 1
                                # Fourth point goes into temporal struct:
                                tempRect[3] = (p4x, p4y)

                                # We now have a full rectangle, store it in the
                                # rectangle list:
                                rectangleList.append(tempRect)

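Pretty straightforward. Note that we loop through the points list at least four times: there is one loop for each corner we look for. Whenever a point meets our search criteria, we store it in the temporary array. Finally, once 4 points have been processed, we can store the temporary array (a complete rectangle) in a separate list. Again, there is a lot here that could be optimized.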
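As one possible way to flatten the four nested loops, here is a sketch that follows the same search order (P2 shares y with P1, P3 shares x with P1, P4 shares y with P3). It is a variation, not the answer's code: the names `reconstruct_rectangles` and `take` are made up, and unlike the loops above it stops at the first matching point for each corner. It assumes the points arrive in raster order as plain (x, y) tuples.

```python
def reconstruct_rectangles(points):
    # points: raster-ordered (x, y) tuples (assumption of this sketch).
    # Returns a list of [P1, P2, P3, P4] = top-left, top-right,
    # bottom-left, bottom-right.
    used = [False] * len(points)

    def take(predicate):
        # Return the first unused point matching the predicate and mark it used.
        for idx, pt in enumerate(points):
            if not used[idx] and predicate(pt):
                used[idx] = True
                return pt
        return None

    rectangles = []
    for idx, p1 in enumerate(points):
        if used[idx]:
            continue
        used[idx] = True
        p2 = take(lambda p: p[1] == p1[1])   # same y as P1
        p3 = take(lambda p: p[0] == p1[0])   # same x as P1
        if p2 is None or p3 is None:
            continue                          # incomplete corner set, skip it
        p4 = take(lambda p: p[1] == p3[1])   # same y as P3
        if p4 is not None:
            rectangles.append([p1, p2, p3, p4])
    return rectangles

# Corner list from the first test, without the processed flag
test_points = [(56, 46), (207, 46), (179, 123), (312, 123),
               (56, 362), (207, 362), (179, 427), (312, 427)]
rects = reconstruct_rectangles(test_points)
print(rects)
```

For the first test list this yields two rectangles, one with corners at (56, 46) and (207, 362) and one with corners at (179, 123) and (312, 427). Like the original loops, it does not handle rectangles whose edges share a row or column with another rectangle's corners.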

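Let's draw the rectangles. The next bit just loops through the rectangle list, gets each rectangle and takes the two diagonal corners needed for drawing. Note that I'm only using two of the four corners found. If you need all the corners, the information is there. This is another spot that could be optimized further. Let's draw the rectangles in random colors: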

# Finally, draw the rectangles:
for r in range(len(rectangleList)):
    # Get current rectangle:
    currentRect = rectangleList[r]
    # Set rectangle:
    x1 = currentRect[0][0]
    y1 = currentRect[0][1]
    x2 = currentRect[3][0]
    y2 = currentRect[3][1]

    # Set a random BGR color for the rectangle:
    color = (np.random.randint(low=0, high=256), np.random.randint(low=0, high=256), np.random.randint(low=0, high=256))
    # Draw the rectangle:
    cv2.rectangle(inputImageCopy, (int(x1), int(y1)), (int(x2), int(y2)), color, 2)

    cv2.imshow("Rects", inputImageCopy)
    cv2.waitKey(0)

This is the image for two rectangles:

And for three rectangles:
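As a side note on the drawing step: since only the two diagonal corners are used, each reconstructed rectangle can also be reduced to an (x, y, width, height) tuple if that format is more convenient downstream. A minimal sketch, where `rect_to_bbox` is a hypothetical helper and the corner order is assumed to be top-left, top-right, bottom-left, bottom-right as in `rectangleList`:

```python
def rect_to_bbox(rect):
    # rect: [P1, P2, P3, P4] with P1 = top-left and P4 = bottom-right
    (x1, y1), (x2, y2) = rect[0], rect[3]
    return (x1, y1, x2 - x1, y2 - y1)

# First rectangle from the first test image
first_rect = [(56, 46), (207, 46), (56, 362), (207, 362)]
print(rect_to_bbox(first_rect))  # (56, 46, 151, 316)
```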


python

opencv