
Bounding boxes on handwritten digits with OpenCV


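The best idea I can come up with right now is to keep only the 4 largest contours in the image, excluding the contour of the image itself.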


import sys
import numpy as np
import cv2

im = cv2.imread('marks/mark28.png')
im3 = im.copy()

gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
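# Same call as before, with the magic numbers 1, 1 spelled out as named constants
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV, 11, 2)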

#################      Now finding Contours         ###################

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

samples = np.empty((0, 100))
responses = []
keys = [i for i in range(48, 58)]

for cnt in contours:
    if cv2.contourArea(cnt) > 50:
        [x, y, w, h] = cv2.boundingRect(cnt)
        if h > 28:
            cv2.rectangle(im, (x, y), (x + w, y + h), (0, 0, 255), 2)
            roi = thresh[y:y + h, x:x + w]
            roismall = cv2.resize(roi, (10, 10))
            cv2.imshow('norm', im)
            key = cv2.waitKey(0)

            if key == 27:  # (escape to quit)
                sys.exit()
            elif key in keys:
                # Record the digit that was typed as the label for this sample
                responses.append(int(chr(key)))
                sample = roismall.reshape((1, 100))
                samples = np.append(samples, sample, 0)

responses = np.array(responses, np.float32)
responses = responses.reshape((responses.size, 1))
print("training complete")

np.savetxt('generalsamples.data', samples)
np.savetxt('generalresponses.data', responses)

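I may need to change the if condition on the height, but more importantly I need an if condition that gives me the 4 largest contours in the image. Sadly, I haven't managed to work out what I should be filtering on.

This is the kind of result I get. I am trying to avoid getting those inner contours on the digit “zero”.

Unprocessed images, as requested: example 1 example 2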


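You are almost there. You get multiple bounding rectangles on each digit because you are retrieving every contour (external and internal). You are calling cv2.findContours in RETR_LIST mode, which retrieves all contours but does not create any parent-child relationship. The parent-child relationship is what distinguishes inner (child) contours from outer (parent) contours; OpenCV calls this the "contour hierarchy". Check the docs for an overview of all the hierarchy modes. Of particular interest is RETR_EXTERNAL mode. This mode fetches only the external contours, so you don't get multiple contours and (by extension) multiple bounding boxes for each digit!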

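Also, it seems that your images have a red border. This will introduce noise when thresholding the image, and the border may be recognized as the top-level outer contour, in which case every other contour (the children of that parent contour) will not be fetched in RETR_EXTERNAL mode. Fortunately, the border position seems to be constant, so we can eliminate it with a simple flood-fill, which essentially fills a blob of a target color with a substitute color.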


# Imports:
import cv2
import numpy as np

# Set image path
path = "D://opencvImages//"
fileName = "rhWM3.png"

# Read Input image
inputImage = cv2.imread(path+fileName)

# Deep copy for results:
inputImageCopy = inputImage.copy()

# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Threshold via Otsu:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)


Note that after thresholding the border is white. We must remove it; a simple flood-fill with black, seeded at position (x=0, y=0), is enough:

# Flood-fill border, seed at (0,0) and use black (0) color:
cv2.floodFill(binaryImage, None, (0, 0), 0)


Now we can retrieve the outermost contours in RETR_EXTERNAL mode:

# Get each bounding box
# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
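
For reference, if you ever switch to a mode that keeps the inner contours (RETR_CCOMP or RETR_TREE), each row of the hierarchy array is laid out as [next, previous, first_child, parent], and a parent of -1 marks a contour with no parent. The sketch below reads such an array; the hierarchy values are hand-built for a "0" and a "1" and only mimic the shape findContours returns, and `outer_contour_indices` is a hypothetical helper name.

```python
import numpy as np

# Hand-built hierarchy in the (1, N, 4) shape returned by findContours.
# Each row is [next, previous, first_child, parent].
hierarchy = np.array([[
    [2, -1, 1, -1],    # contour 0: outer edge of the "0", child is contour 1
    [-1, -1, -1, 0],   # contour 1: hole of the "0", parent is contour 0
    [-1, 0, -1, -1],   # contour 2: outer edge of the "1"
]], dtype=np.int32)


def outer_contour_indices(hierarchy):
    # Keep the indices of contours whose parent field is -1 (no parent).
    return [i for i, row in enumerate(hierarchy[0]) if row[3] == -1]


print(outer_contour_indices(hierarchy))
```

The returned indices can then be used to pick the matching entries out of the contours list.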

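Note that you also get each contour's hierarchy as the second return value. This is useful if you want to check whether the current contour is a parent or a child. Alright, let's loop through the contours and get their bounding boxes. If you want to ignore contours below a minimum area threshold, you can also implement an area filter: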

# Look for the outer bounding boxes (no children):
for c in contours:

    # Get the bounding rectangle of the current contour:
    boundRect = cv2.boundingRect(c)

    # Get the bounding rectangle data:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]

    # Estimate the bounding rect area:
    rectArea = rectWidth * rectHeight

    # Set a min area threshold
    minArea = 10

    # Filter blobs by area:
    if rectArea > minArea:

        # Draw bounding box:
        color = (0, 255, 0)
        cv2.rectangle(inputImageCopy, (int(rectX), int(rectY)),
                      (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2)
        cv2.imshow("Bounding Boxes", inputImageCopy)

        # Crop bounding box:
        currentCrop = inputImage[rectY:rectY+rectHeight,rectX:rectX+rectWidth]
        cv2.imshow("Current Crop", currentCrop)

        # Wait for a key press so the windows are actually drawn:
        cv2.waitKey(0)
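
Since the question asks for the 4 largest contours, one possible way to do that on top of the loop above is to collect the bounding rectangles, keep the largest ones by area, and order them left to right. This is only a sketch: `largest_boxes` is a hypothetical helper and the sample boxes are made-up (x, y, w, h) tuples in the format cv2.boundingRect returns.

```python
def largest_boxes(boxes, count=4):
    # Sort (x, y, w, h) boxes by area, largest first, and keep `count` of them.
    by_area = sorted(boxes, key=lambda b: b[2] * b[3], reverse=True)
    # Re-order the survivors left to right so they read like the digits do.
    return sorted(by_area[:count], key=lambda b: b[0])


# Made-up boxes: four digit-sized rectangles and two specks of noise.
boxes = [
    (120, 10, 30, 40),
    (5, 5, 3, 3),
    (10, 12, 28, 42),
    (60, 11, 25, 38),
    (200, 50, 4, 2),
    (90, 9, 27, 41),
]

digits = largest_boxes(boxes)
print(digits)
```

With real data you would build the list with cv2.boundingRect(c) for each contour and then draw or crop only the boxes that come back.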
