How to get the ROIs for these images in OpenCV

Question

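I have some sample images, shown below.

What I want to do is remove the labels from the image, so the resulting image should look like this:

Finally, I want to get the rectangles as shown in the image.

So far I have code that takes the templates and removes the borders to get the first result: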

import cv2
import numpy as np


def remove_templates(img):
    # Paint over the best match of each template; img is modified in place
    templates = ['images/sample1.jpeg', 'images/sample2.jpeg']
    for template_path in templates:
        template = cv2.imread(template_path)
        h, w, _ = template.shape
        res = cv2.matchTemplate(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), cv2.cvtColor(template, cv2.COLOR_BGR2GRAY), cv2.TM_CCOEFF)
        min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
        top_left = max_loc
        bottom_right = (top_left[0] + w, top_left[1] + h)
        cv2.rectangle(img, top_left, bottom_right, (1, 1, 1), -1)


def crop_borders(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = 255 * (gray < 128).astype(np.uint8)  # To invert the text to white
    gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, np.ones((2, 2), dtype=np.uint8))  # Perform noise filtering
    canny = cv2.Canny(gray, 0, 150)
    coords = cv2.findNonZero(canny)  # Find all non-zero points (text)
    x, y, w, h = cv2.boundingRect(coords)  # Find minimum spanning bounding box
    rect = img[y:y + h, x:x + w + 20]  # Crop the image - note we do this on the original image
    return rect


img = cv2.imread('images/res5.jpg')
remove_templates(img)
img = crop_borders(img)

cv2.imwrite('output/op1.png', img)
cv2.imwrite('output/op2.png', cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))


height = img.shape[0]
width = img.shape[1]
# Cut the image in half
width_cutoff = (width // 2)
left = img[:, :width_cutoff+5]
right = img[:, width_cutoff+25:]


cv2.imwrite('output/left.png', left)
cv2.imwrite('output/right.png', right)

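The code above does give me the first result, but it fails when the logo has a different aspect ratio or size.

How can I achieve the same result in those cases? Any help would be appreciated.

I'm new to OpenCV, so any direction would help. Most of the code I have now is pieced together from different tutorials. If there is something wrong with the code, please point it out.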

Answer 1

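Concept

Define a function that processes a BGR image into a binary image with the edges of the boxes enhanced.
Define a function that takes a BGR image and returns the contours detected in it (after processing it with the previous function) whose area falls within a specific range.
Draw the bounding box of each contour. To crop the image, concatenate all the contours and use the bounding box of the combined contours to slice the image.

Code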

import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_gray[img_gray < 5] = 255
    return cv2.dilate(cv2.Canny(img_gray, 50, 75), np.ones((4, 4)), iterations=2)

def get_cnts(img):
    cnts, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    return [cnt for cnt in cnts if 80000 > cv2.contourArea(cnt) > 40000]
    
img = cv2.imread("image.png")
cnts = get_cnts(img)

for cnt in cnts:
    x, y, w, h = cv2.boundingRect(cnt)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 3)

x, y, w, h = cv2.boundingRect(np.concatenate(cnts))
cv2.imshow("Image", img[y: y + h, x: x + w])

cv2.waitKey(0)
cv2.destroyAllWindows()

Output

Here are the resulting images for the two sample images provided:

Explanation

Import the necessary modules:

import cv2
import numpy as np

Define a function, process(), that takes a BGR image array as its argument:

def process(img):

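Processing starts by converting the image to grayscale. Then we replace every value in the grayscale array that is less than 5 with a larger one (I used 255). The reason is to brighten the image's background so that the edges of the boxes are easier to detect: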

    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_gray[img_gray < 5] = 255
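To make the masked assignment concrete, here is a tiny sketch on a hand-made array (the values are made up for illustration and do not come from the sample images). Only pixels strictly below 5 are changed:

```python
import numpy as np

# Toy 2x3 "grayscale image"; the values are illustrative only
gray = np.array([[0, 3, 5],
                 [4, 120, 255]], dtype=np.uint8)

# Boolean mask selects near-black pixels, which are then set to white
gray[gray < 5] = 255

print(gray)
# [[255 255   5]
#  [255 120 255]]
```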

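We can now use the Canny edge detector to detect the edges of the boxes. Two iterations of dilation enhance the detected edges nicely. Then return the dilated image (a binary image):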

    return cv2.dilate(cv2.Canny(img_gray, 50, 75), np.ones((4, 4)), iterations=2)

Define a function, get_cnts(), that takes a BGR image array as its argument:

def get_cnts(img):

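Using the cv2.findContours() method, we find the contours of the image (processed with the process() function we defined earlier), and return a list of all the contours whose area is greater than 40000 and less than 80000. Significantly different box sizes will require adjusting these values: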

    cnts, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    return [cnt for cnt in cnts if 80000 > cv2.contourArea(cnt) > 40000]

Read the image file, get its contours, and draw the bounding rectangle of each contour using the cv2.boundingRect() and cv2.rectangle() methods:

img = cv2.imread("image.png")
cnts = get_cnts(img)

for cnt in cnts:
    x, y, w, h = cv2.boundingRect(cnt)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 3)

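Finally, to crop the image, get the bounding rectangle of all the contours in the list combined (this can be done by concatenating the contours with the np.concatenate() method) and display the result: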

x, y, w, h = cv2.boundingRect(np.concatenate(cnts))
cv2.imshow("Image", img[y: y + h, x: x + w])

cv2.waitKey(0)
cv2.destroyAllWindows()

python

opencv

tesseract
