OpenCV findContours misses some areas [not getting all the correct bounding boxes]

Question

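I'm new to OpenCV and started learning it by extracting the characters from a simple captcha. After some effort I arrived at findContours and a few methods for cleaning up the image. It works sometimes, but more often it does not.

For example:

I have an original image (already scaled up):
Converted to grayscale and cleaned with cv2.threshold:
Bounding boxes obtained with cv2.findContours:

The W is only half covered, and there is no box for the b.

My code: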

from io import BytesIO

from PIL import Image
import requests
import cv2
import numpy as np
import matplotlib.pyplot as plt

def get_ysdm_captcha():
    # Download one captcha and load it with PIL (r.content is bytes)
    url = 'http://www.ysdm.net/common/CleintCaptcha'
    r = requests.get(url)
    img = Image.open(BytesIO(r.content))
    return img

def scale_image(img, ratio):
    return img.resize((int(img.width*ratio), int(img.height*ratio)))

def draw_rect(im):
    im = np.array(im)

    if len(im.shape) == 3 and im.shape[2] == 3:
        imgray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
    else:
        imgray = im

    #plt.imshow(Image.fromarray(imgray), 'gray')
    pilimg = Image.fromarray(imgray)
    ret,thresh = cv2.threshold(imgray,127,255,cv2.THRESH_BINARY)

    threimg = Image.fromarray(thresh)

    plt.figure(figsize=(4,3))
    plt.imshow(threimg, 'gray')
    plt.xticks([]), plt.yticks([])

    # [-2:] keeps this working on OpenCV 3.x, which returns three values
    contours, hierarchy = cv2.findContours(np.array(thresh),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2:]
    areas = []

    for c in contours:
        rect = cv2.boundingRect(c)
        area = cv2.contourArea(c)
        areas.append(area)
        x,y,w,h = rect

        if area > 2000 or area < 200 : continue

        cv2.rectangle(thresh,(x,y),(x+w,y+h),(0,255,0),1)
        plt.figure(figsize=(1,1))
        plt.imshow(threimg.crop((x,y,x+w,y+h)), 'gray')
        plt.xticks([]), plt.yticks([])

    plt.figure(figsize=(10,10))

    plt.figure()
    plt.imshow(Image.fromarray(thresh), 'gray')
    plt.xticks([]), plt.yticks([])


image = get_ysdm_captcha()
im = scale_image(image, 3)
im = np.array(im)

imgray = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY)  # arrays built from PIL images are RGB, not BGR
imgray = cv2.GaussianBlur(imgray,(5,5),0)
# im = cv2.medianBlur(imgray,9)
# im = cv2.bilateralFilter(imgray,9,75,75)

draw_rect(imgray)

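The code above is the best I could come up with. The solutions I can think of are:

Find some way to tell cv2.findContours that I need 4 bounding boxes of a certain size
Try different parameters (I tried all of those listed at http://docs.opencv.org/2.4/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html?highlight=findcontours#findcontours, but it still doesn't work)

Now I'm stuck and have no idea how to improve the cv2.findContours result...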

Answer 1

You can use morphological operations such as erode and dilate to modify the image and close the gaps.

See here: http://docs.opencv.org/2.4/doc/tutorials/imgproc/erosion_dilatation/erosion_dilatation.html

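Original:

Dilated:

By the way: I would add an HSV separation step on the original image and remove all the white/grey/black content (low saturation). This will reduce the number of specks. Do this before converting to grayscale.

Result of the filter, saturation > 90: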

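Final result (with a blur step added before):

Additionally, if there is always a gradient, you could detect it and filter out even more colors. But that is a bit much if you are just starting with image processing ;)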

Answer 2

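findContours works correctly, since it finds all the connected components of your image. Your area condition is probably what prevents you from getting a bounding box around the letter b, for example. Of course, if you put a bounding box around every connected component, you will not end up with one bounding box per character, because there are many holes in your letters.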

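If you want to segment the letters, I would first try an opening operation (because your letters are black on a white background; in the opposite case it would be closing) to fill the holes in your letters. Then I would project the pixels vertically and analyze the shape you get. The valley points of this projection give you the vertical limits between characters. You can do the same horizontally to get the upper and lower limits of your characters. This approach only works if the text is horizontal. If it is not, you should find the angle of the main axis of your string and rotate the image accordingly. To find the main axis angle, you can fit an ellipse to your text and take its main axis angle, or you can rotate the image step by step until the horizontal projection is maximal.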
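
The projection step can be sketched with NumPy alone. This is a simplified version: instead of searching for valleys in the profile, it treats every column whose ink count falls below min_ink as a separator, which only works when the characters do not touch. The function name, the min_ink parameter and the test image are assumptions for the example, and the foreground is assumed to be non-zero (invert a black-on-white image first).

```python
import numpy as np

def split_columns(binary, min_ink=1):
    """Return (start, end) column ranges containing foreground, end exclusive."""
    ink_per_column = (binary > 0).sum(axis=0)   # the vertical projection
    has_ink = ink_per_column >= min_ink
    # Pad with False so every run has both a rising and a falling edge
    padded = np.concatenate(([False], has_ink, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))

# Synthetic image with three separated "characters"
demo = np.zeros((10, 30), np.uint8)
demo[2:8, 3:9] = 255
demo[2:8, 12:20] = 255
demo[4:6, 23:27] = 255

print(split_columns(demo))    # [(3, 9), (12, 20), (23, 27)]
print(split_columns(demo.T))  # row limits via the transposed image: [(2, 8)]
```

Raising min_ink makes the split tolerant of a few stray noise pixels between characters, at the cost of possibly cutting through thin strokes.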
