Find area in image with python and opencv

Question

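I want to find a region in roughly 1,500 images that all share a similar layout. They are scans of painted or photographed portraits of people, and every scan includes the same color card. The color card can sit on either side of the image (see the sample images below).

The result should be an image that contains only the portrait.

I can find the color card with OpenCV template matching: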

import cv2
import numpy as np

# TM_CCOEFF_NORMED: a higher score means a better match (1.0 is perfect)
method = cv2.TM_CCOEFF_NORMED

# Read the images from the file
img_rgb = cv2.imread('./imgs/test_portrait.jpg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)

# Load the color card template as grayscale
template = cv2.imread('./portraet_color_card.png', 0)
w, h = template.shape[::-1]

result = cv2.matchTemplate(img_gray, template, method)

# Keep every location whose score reaches the threshold
threshold = .97
loc = np.where(result >= threshold)
for pt in zip(*loc[::-1]):
    print("Found:", pt)
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0,255,255), 2)

cv2.imwrite('result.png',img_rgb)

Output:

Found: (17, 303)
Found: (18, 303)
Found: (17, 304)
Found: (18, 304)

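Based on the coordinates and the image size, I can tell whether the color card is on the left or on the right and crop the image accordingly. The result is far from perfect, because borders still remain.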
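The four hits in the output are neighbouring pixels of one and the same match. A minimal sketch of the side test and crop described above, assuming the score map comes from `cv2.TM_CCOEFF_NORMED` (where the maximum is the best match). `best_match` and `crop_away_card` are hypothetical helper names, and the arrays are synthetic stand-ins for the real scan and score map:

```python
import numpy as np

def best_match(result):
    """Return (x, y) of the highest score in a matchTemplate score map."""
    y, x = np.unravel_index(np.argmax(result), result.shape)
    return int(x), int(y)

def crop_away_card(img, card_x, card_w):
    """Drop the side of the image that holds the color card."""
    width = img.shape[1]
    card_center = card_x + card_w / 2
    if card_center < width / 2:
        # Card on the left: keep everything to the right of it
        return img[:, card_x + card_w:]
    # Card on the right: keep everything to the left of it
    return img[:, :card_x]

# Synthetic stand-ins (assumed sizes, not taken from the real images)
img = np.zeros((400, 600, 3), dtype=np.uint8)
result = np.zeros((350, 550), dtype=np.float32)
result[303, 17] = 0.99          # one clear peak at x=17, y=303

card_w = 50                     # assumed template width
x, y = best_match(result)
cropped = crop_away_card(img, x, card_w)
```

This only removes the card side; any border around the portrait itself still has to be trimmed separately.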

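Is there a better way to extract the portrait from the image? I would prefer python and opencv, but I am open to other suggestions for handling this many images.

Samples:

Template: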

Answer 1

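First, let's assume you have at least 15K images, so that it is worth spending time on automation (1.5K could be processed by hand). I will try to outline a high-level approach and provide some PoC results (sorry, no code; I use a custom CV tool/pipeline).

As you mentioned, the card's background color varies, so let's play it safe: the color card contains some specific colors, and I would use them as the initial "keys". These colors are distinctive, so I can define suitable thresholds that keep my results stable: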
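Since the answer comes without code, here is a minimal sketch of the key-color idea, not the answerer's pipeline. It assumes a per-channel tolerance around a key color; the key color, tolerance, and helper names `key_color_mask` and `bounding_box` are illustrative assumptions:

```python
import numpy as np

def key_color_mask(img_bgr, key_bgr, tol=20):
    """Boolean mask of pixels within `tol` of `key_bgr` on every channel."""
    key = np.array(key_bgr, dtype=np.int16)
    diff = np.abs(img_bgr.astype(np.int16) - key)
    return np.all(diff <= tol, axis=-1)

def bounding_box(mask):
    """Return (x0, y0, x1, y1), exclusive at x1/y1, or None for an empty mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1

# Synthetic scan: grey page with one saturated red cell (BGR channel order)
img = np.full((100, 200, 3), 128, dtype=np.uint8)
img[40:60, 150:170] = (10, 10, 230)

red_cell = bounding_box(key_color_mask(img, key_bgr=(0, 0, 255), tol=40))
```

On real scans, `cv2.inRange` does the same per-channel range test, and thresholding in HSV may be more tolerant of lighting differences.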

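Two segmented cells give us a very simple way to validate the detection (compare sizes, relative positions, and so on). At this point we can easily find the color card background (ideally by taking several measurements near the identified color cells):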
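A rough sketch of both checks, under stated assumptions: cells are given as `(x0, y0, x1, y1)` boxes, "similar size" means within 20%, and the background is estimated as the median of small patches just outside one cell. The thresholds and the helper names `cells_are_consistent` and `sample_background` are hypothetical:

```python
import math
import numpy as np

def cells_are_consistent(box_a, box_b, size_tol=0.2, max_gap_ratio=3.0):
    """Check that two cell boxes have similar size and sit close together."""
    wa, ha = box_a[2] - box_a[0], box_a[3] - box_a[1]
    wb, hb = box_b[2] - box_b[0], box_b[3] - box_b[1]
    similar = (abs(wa - wb) <= size_tol * max(wa, wb)
               and abs(ha - hb) <= size_tol * max(ha, hb))
    # Distance between box centers, relative to the largest cell dimension
    ax, ay = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bx, by = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    close = math.hypot(ax - bx, ay - by) <= max_gap_ratio * max(wa, ha, wb, hb)
    return similar and close

def sample_background(img_bgr, box, margin=5, patch=3):
    """Median color of small patches taken just outside the four sides of a cell."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
    h, w = img_bgr.shape[:2]
    points = [(cx, y0 - margin), (cx, y1 + margin), (x0 - margin, cy), (x1 + margin, cy)]
    samples = []
    for px, py in points:
        # Skip sample points whose patch would leave the image
        if patch <= px < w - patch and patch <= py < h - patch:
            block = img_bgr[py - patch:py + patch + 1, px - patch:px + patch + 1]
            samples.append(block.reshape(-1, 3))
    if not samples:
        return None
    return np.median(np.concatenate(samples), axis=0)

# Synthetic card: grey background with a red and a blue cell (BGR order)
img = np.full((140, 200, 3), 128, dtype=np.uint8)
red_box, blue_box = (150, 40, 170, 60), (150, 80, 170, 100)
img[40:60, 150:170] = (10, 10, 230)
img[80:100, 150:170] = (230, 10, 10)

ok = cells_are_consistent(red_box, blue_box)
background = sample_background(img, red_box)
```

Taking the median over several patches keeps a single noisy or compressed patch from skewing the background estimate.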

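As you can see, noise and lossy-compression artifacts affect the result, but it is still good enough (another validation option: compare the cell size with the card size). At this point we can take additional measurements to find the color of the image background.

Let's look at the simple cases first. The results already look good enough, so the final crop and small corrections are easy to implement: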
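One way the final crop for a simple case might look, assuming the color card has already been cut off and the image background color is known. Rows and columns count as part of the portrait when at least a small fraction of their pixels differs from the background, which ignores isolated noise pixels. `crop_to_foreground` and its thresholds are hypothetical:

```python
import numpy as np

def crop_to_foreground(img_bgr, background_bgr, tol=25, min_fraction=0.05):
    """Crop to the rows and columns that contain enough non-background pixels."""
    background = np.asarray(background_bgr, dtype=np.int16)
    diff = np.abs(img_bgr.astype(np.int16) - background)
    foreground = np.any(diff > tol, axis=-1)
    # Fraction of foreground pixels per row and per column
    rows = np.nonzero(foreground.mean(axis=1) >= min_fraction)[0]
    cols = np.nonzero(foreground.mean(axis=0) >= min_fraction)[0]
    if rows.size == 0 or cols.size == 0:
        return None  # nothing stands out from the background
    return img_bgr[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Synthetic scan: light background, dark "portrait", one noise pixel
img = np.full((100, 200, 3), 200, dtype=np.uint8)
img[20:80, 30:120] = 50
img[5, 190] = 0

portrait = crop_to_foreground(img, background_bgr=(200, 200, 200))
```

A portrait whose edge colors are close to the background (the "white on white" case) would defeat this test, which is where the validation rules come in.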

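Some cases will not be that simple:

I would recommend investing more time in the validation rules and handling all the tricky cases manually, although with some extra effort the "common tricky cases" can be solved as well.

Anyway, here is a short summary:

Use key colors to identify the color card reliably (and to do an initial validation)
Take multiple measurements to find the color card background (so you can use smaller thresholds)
Take multiple measurements to determine the image background
A validation strategy is a must, so that manually handling the few leftovers is easier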
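The last point could be wired into a batch run along these lines. This is only a sketch: `triage`, `fake_process`, and `validate_size` are hypothetical names, and the stand-in processing step returns made-up crop sizes instead of running a real pipeline:

```python
def triage(paths, process, validate):
    """Run `process` on every path and split results into accepted / needs review."""
    accepted, needs_review = {}, []
    for path in paths:
        try:
            result = process(path)
        except Exception as exc:
            # Any failure goes to the manual pile instead of stopping the batch
            needs_review.append((path, f"error: {exc}"))
            continue
        reason = validate(result)
        if reason is None:
            accepted[path] = result
        else:
            needs_review.append((path, reason))
    return accepted, needs_review

def fake_process(path):
    """Stand-in for the real pipeline: returns the (width, height) of the crop."""
    sizes = {"a.jpg": (400, 600), "b.jpg": (20, 30)}
    return sizes[path]  # unknown files raise KeyError

def validate_size(size, min_side=100):
    """Return None when the crop looks plausible, else a short reason."""
    width, height = size
    return None if min(width, height) >= min_side else "crop too small"

accepted, needs_review = triage(["a.jpg", "b.jpg", "c.jpg"], fake_process, validate_size)
```

Keeping the reason next to each rejected file makes it easy to see which validation rule catches the most images.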

PS: White on white is fun, but Kazimir Malevich did that a long time ago, so there is no need to repeat it :)

Answer 2

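This solution assumes that the portrait is the largest pattern in the image.

Sequence of steps:

Classical image processing to extract the important features from the image:

Convert to grayscale.
Gaussian blur to reduce noise and smooth the image.
Edge detection, in my case with Canny.
Morphological dilation to merge the features into two main patterns.
Largest connected component detection (credit to an older answer)
What remains is masking the largest connected component.

Note that this solution rests on several assumptions, so it may not always generalize. I have tested it with the given images.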

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import cv2
import numpy as np

class ImgProcessor:
    def __init__(self, path, imName):
        self.path = path
        self.imName = imName
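        self.original = cv2.imread(self.path+self.imName)
        # cv2.imread returns None instead of raising when the file cannot be read
        if self.original is None:
            raise FileNotFoundError(self.path+self.imName)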

    def imProcess(self, ksmooth=7, kdilate=3, thlow=50, thigh=100):
        # Work on a copy of the BGR image loaded in __init__
        img_bgr = self.original.copy()
        # Convert Image to Gray
        img_gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
        # Gaussian Filtering for Noise Removal
        gauss = cv2.GaussianBlur(img_gray, (ksmooth, ksmooth), 0)
        # Canny Edge Detection with low/high hysteresis thresholds
        edges = cv2.Canny(gauss, thlow, thigh)
        # Morphological Dilation
        # TODO: experiment with different kernels
        kernel = np.ones((kdilate, kdilate), 'uint8')
        dil = cv2.dilate(edges, kernel)

        return dil
    
    def largestCC(self, imBW):
        # Extract Largest Connected Component
        # Source: 
        image = imBW.astype('uint8')
        nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(image, connectivity=4)
        # The area column holds the pixel count of each label
        sizes = stats[:, cv2.CC_STAT_AREA]

        img2 = np.zeros(output.shape, dtype=np.uint8)
        if nb_components < 2:
            # Only the background (label 0) was found
            return img2
        # Skip label 0 (background) when picking the largest component
        max_label = 1 + int(np.argmax(sizes[1:]))
        img2[output == max_label] = 255
        return img2
    
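    def maskCorners(self, mask, outval=1):
        # Fill the bounding rectangle of the nonzero pixels with outval
        rows = np.nonzero(mask.sum(axis=1))[0]
        cols = np.nonzero(mask.sum(axis=0))[0]
        output = np.zeros_like(mask)
        if rows.size == 0:
            # Empty mask: nothing to mark
            return output
        y0, y1 = rows.min(), rows.max()
        x0, x1 = cols.min(), cols.max()
        # +1 so the last nonzero row and column are included
        output[y0:y1 + 1, x0:x1 + 1] = outval
        return output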

    def extractROI(self):
        im = self.imProcess()
        lgcc = self.largestCC(im)
        lgcc = lgcc.astype(np.uint8)
        roi = self.maskCorners(lgcc)
        # Keep only the pixels of the original image inside the rectangular mask
        exroi = cv2.bitwise_and(self.original, self.original, mask = roi)
        return exroi

    def show_res(self):
        result = self.extractROI()
        cv2.namedWindow("Result", cv2.WINDOW_NORMAL)
        cv2.imshow("Result", result)
        cv2.waitKey(0)

# ==============================================
if __name__ == "__main__":
    # TODO: change the path, and image name to suit your needs
    impr_ = ImgProcessor(path="/home/", imName="img.png")
    impr_.show_res()
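The class above blacks out everything outside the region but keeps the original image size. Since the question asks for an image that contains only the portrait, a possible follow-up step is to crop to the bounding rectangle of the mask. A small sketch, where `crop_to_mask` is a hypothetical helper and the arrays are synthetic:

```python
import numpy as np

def crop_to_mask(image, mask):
    """Crop `image` to the bounding rectangle of the nonzero pixels in `mask`."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return image  # empty mask: leave the image unchanged
    return image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

# Synthetic example: a 50x80 image and a rectangular mask
image = np.arange(50 * 80 * 3, dtype=np.uint32).reshape(50, 80, 3)
mask = np.zeros((50, 80), dtype=np.uint8)
mask[10:30, 20:60] = 1

cropped = crop_to_mask(image, mask)
```

Inside `extractROI`, this could be called with `self.original` and the mask returned by `maskCorners`, and the result written out with `cv2.imwrite`.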
