How do I split up thresholds into squares in OpenCV2?

Question

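I have a lovely picture of a Rubik's cube:

I want to split it into squares and identify the color of each square. I can run a Gaussian blur on it, followed by 'Canny', and finish with 'Dilate' to get the following:

This looks good, but I can't turn it into squares. Any kind of 'findContours' I try only brings up one or two squares, nowhere near the nine I'm aiming for. Does anyone have ideas on what else I can do beyond this?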

Current best solution:

The code is below and requires numpy + opencv2. It expects a file called './sides/rubiks-side-F.png' and outputs several files to a 'steps' folder.

import numpy as np
import cv2 as cv

def save_image(name, file):
    return cv.imwrite('./steps/' + name + '.png', file)


def angle_cos(p0, p1, p2):
    d1, d2 = (p0-p1).astype('float'), (p2-p1).astype('float')
    return abs(np.dot(d1, d2) / np.sqrt(np.dot(d1, d1)*np.dot(d2, d2)))

def find_squares(img):
    img = cv.GaussianBlur(img, (5, 5), 0)
    squares = []
    for gray in cv.split(img):
        bin = cv.Canny(gray, 500, 700, apertureSize=5)
        save_image('post_canny', bin)
        bin = cv.dilate(bin, None)
        save_image('post_dilation', bin)
        for thrs in range(0, 255, 26):
            if thrs != 0:
                _retval, bin = cv.threshold(gray, thrs, 255, cv.THRESH_BINARY)
                save_image('threshold', bin)
            contours, _hierarchy = cv.findContours(
                bin, cv.RETR_LIST, cv.CHAIN_APPROX_SIMPLE)
            for cnt in contours:
                cnt_len = cv.arcLength(cnt, True)
                cnt = cv.approxPolyDP(cnt, 0.02*cnt_len, True)
                if len(cnt) == 4 and cv.contourArea(cnt) > 1000 and cv.isContourConvex(cnt):
                    cnt = cnt.reshape(-1, 2)
                    max_cos = np.max(
                        [angle_cos(cnt[i], cnt[(i+1) % 4], cnt[(i+2) % 4]) for i in range(4)])
                    if max_cos < 0.2:
                        squares.append(cnt)
    return squares

img = cv.imread("./sides/rubiks-side-F.png")
squares = find_squares(img)
cv.drawContours(img, squares, -1, (0, 255, 0), 3)
save_image('squares', img)

You can find the other sides here

Answer 1

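I know you might not accept this answer because it is written in C++. That's fine; I just want to show you one possible approach to detecting the squares. I'll try to include as much detail as possible in case you wish to port this code to Python.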

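The goal is to detect all 9 squares as accurately as possible. These are the steps:

Get an edge mask in which the outline of the complete cube is clearly visible.
Filter these edges to get a binary cube (segmentation) mask.
Use the cube mask to get the cube's bounding box/rectangle.
Use the bounding rectangle to get the dimensions and location of each square (all the squares have the same dimensions).

First, I'll try to get an edge mask by applying the steps you described. I just want to make sure my starting point is similar to where you currently are.

The pipeline is this: read the image > grayscale conversion > Gaussian Blur > Canny Edge detector: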

    //read the input image:
    std::string imageName = "C://opencvImages//cube.png";
    cv::Mat testImage =  cv::imread( imageName );

    //Convert BGR to Gray (cv::imread loads images in BGR channel order):
    cv::Mat grayImage;
    cv::cvtColor( testImage, grayImage, cv::COLOR_BGR2GRAY );

    //Apply Gaussian blur with a X-Y Sigma of 50:
    cv::GaussianBlur( grayImage, grayImage, cv::Size(3,3), 50, 50 );

    //Prepare edges matrix:
    cv::Mat testEdges;

    //Setup lower and upper thresholds for edge detection:
    float lowerThreshold = 20;
    float upperThreshold = 3 * lowerThreshold;

    //Get Edges via Canny:
    cv::Canny( grayImage, testEdges, lowerThreshold, upperThreshold );

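Alright, that's the starting point. This is the edge mask I get:

Close to your results. Now I'll apply a dilation. Here the number of iterations of the operation matters, because I want nice, thick edges. Closing open contours is also desirable, so I want a mild-aggressive dilation. I set the number of iterations to 5, using a rectangular structuring element.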

    //Prepare a rectangular, 3x3 structuring element:
    cv::Mat SE = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(3, 3) );

    //OP iterations:
    int dilateIterations = 5;

    //Prepare the dilation matrix:
    cv::Mat binDilation;

    //Perform the morph operation:
    cv::morphologyEx( testEdges, binDilation, cv::MORPH_DILATE, SE, cv::Point(-1,-1), dilateIterations );

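I get this:

This is the output so far, with nice, well-defined edges. The most important thing is to define the cube clearly, because I'll rely on its outline to compute the bounding rectangle later.

What follows is my attempt to clean up, as accurately as I can, everything that isn't the cube's edges. As you can see, there is a lot of garbage and many pixels that do not belong to the cube. I'm especially interested in flood-filling the background with a color (white) different from the cube (black) in order to get a nice segmentation.

Flood-filling has a downside, though: it can also fill the interior of a contour if the contour is not closed. I try to clean up the garbage and close the contours in one go with a "border mask", which is just white lines drawn along the sides of the dilation mask.

I implement this mask as four SUPER THICK lines that border the dilation mask. To draw the lines I need starting and ending points, which correspond to the image corners. These are defined in a vector: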

    std::vector< std::vector<cv::Point> > imageCorners;
    imageCorners.push_back( { cv::Point(0,0), cv::Point(binDilation.cols,0) } );
    imageCorners.push_back( { cv::Point(binDilation.cols,0), cv::Point(binDilation.cols, binDilation.rows) } );
    imageCorners.push_back( { cv::Point(binDilation.cols, binDilation.rows), cv::Point(0,binDilation.rows) } );
    imageCorners.push_back( { cv::Point(0,binDilation.rows), cv::Point(0, 0) } );

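The four starting/ending coordinate pairs are stored in a vector of four entries. I apply the "border mask" by looping through those coordinates and drawing the thick lines: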

    //Define the SUPER THICKNESS:
    int lineThickness = 200;

    //Loop through my line coordinates and draw four lines at the borders:
    for ( int c = 0 ; c < 4 ; c++ ){
        //Get current vector of points:
        std::vector<cv::Point> currentVect = imageCorners[c];
        //Get the starting/ending points:
        cv::Point startPoint = currentVect[0];
        cv::Point endPoint = currentVect[1];
        //Draw the line:
        cv::line( binDilation, startPoint, endPoint, cv::Scalar(255,255,255), lineThickness );
    }

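Cool. This gets me this output:

Now, let's apply the floodFill algorithm. This operation fills a closed area of same-colored pixels with a "substitute" color. It needs a seed point and the substitute color (white, in this case). Let's flood-fill at the four corners, just inside the white mask we just created.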

    //Set the offset of the image corners. Ensure the area to be filled is black:
    int fillOffsetX = 200;
    int fillOffsetY = 200;
    cv::Scalar fillTolerance = 0; //No tolerance
    int fillColor = 255; //Fill color is white
   
    //Get the dimensions of the image:
    int targetCols = binDilation.cols;
    int targetRows = binDilation.rows;

    //Flood-fill at the four corners of the image:
    cv::floodFill( binDilation, cv::Point( fillOffsetX, fillOffsetY ), fillColor, (cv::Rect*)0, fillTolerance, fillTolerance);
    cv::floodFill( binDilation, cv::Point( fillOffsetX, targetRows - fillOffsetY ), fillColor, (cv::Rect*)0, fillTolerance, fillTolerance);
    cv::floodFill( binDilation, cv::Point( targetCols - fillOffsetX, fillOffsetY ), fillColor, (cv::Rect*)0, fillTolerance, fillTolerance);
    cv::floodFill( binDilation, cv::Point( targetCols - fillOffsetX, targetRows - fillOffsetY ), fillColor, (cv::Rect*)0, fillTolerance, fillTolerance);

This can also be implemented as a loop, just like the "border mask". After this operation I get this mask:

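Getting close, right? Now, depending on your image, some garbage could survive all these "cleaning" operations. I'd suggest applying an area filter. The area filter removes every blob of pixels below a threshold area. This is useful because the cube's blobs are the biggest blobs on the mask and will surely survive the area filter.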

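Anyway, I'm only interested in the cube's outline; I don't need those lines inside the cube. I'm going to dilate the hell out of the (inverted) blob and then erode it back to its original dimensions to get rid of the lines inside the cube: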

    //Get the inverted image:
    cv::Mat cubeMask = 255 - binDilation;

    //Set some really high iterations here:
    int closeIterations = 50;

    //Dilate
    cv::morphologyEx( cubeMask, cubeMask, cv::MORPH_DILATE, SE, cv::Point(-1,-1), closeIterations );
    //Erode:
    cv::morphologyEx( cubeMask, cubeMask, cv::MORPH_ERODE, SE, cv::Point(-1,-1), closeIterations );

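This is a closing operation, and a pretty brutal one. This is the result of applying it. Remember that I inverted the image earlier:

Isn't that nice or what? Check out the cube mask, here overlaid on the original RGB image:

Excellent, now let's get the bounding box of this blob. The approach is as follows: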

Get blob contour > Convert contour to bounding box

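This is fairly straightforward to implement, and the Python equivalent should be very similar. First, get the contours via findContours. As you can see, there should be only one contour: the cube outline. Next, convert the contour to a bounding rectangle using boundingRect. In C++, this is the code: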

    //Lets get the blob contour:
    std::vector< std::vector<cv::Point> > contours;
    std::vector<cv::Vec4i> hierarchy;

    cv::findContours( cubeMask, contours, hierarchy, cv::RETR_TREE, cv::CHAIN_APPROX_SIMPLE, cv::Point(0, 0) );

    //There should be only one contour, the item number 0:
    cv::Rect boundigRect = cv::boundingRect( contours[0] );

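These are the contours found (just one):

Once you convert this contour to a bounding rectangle, you get this nice image:

Ah, we are very close to the end here. Since all the squares have the same dimensions and your image doesn't seem to be very perspective-distorted, we can use the bounding rectangle to estimate the square dimensions. All the squares have the same width and height; there are 3 squares per cube width and 3 squares per cube height.

Divide the bounding rectangle into 9 equal sub-squares (or, as I call them, "grids") and get their dimensions and location starting from the bounding box's coordinates, like this: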

    //Number of squares or "grids"
    int verticalGrids = 3;
    int horizontalGrids = 3;

    //Grid dimensions:
    float gridWidth = (float)boundigRect.width / horizontalGrids;
    float gridHeight = (float)boundigRect.height / verticalGrids;

    //Grid counter:
    int gridCounter = 1;
    
    //Loop thru vertical dimension:
    for ( int j = 0; j < verticalGrids; ++j ) {

        //Grid starting Y:
        int yo = j * gridHeight;

        //Loop thru horizontal dimension:
        for ( int i = 0; i < horizontalGrids; ++i ) {

            //Grid starting X:
            int xo = i * gridWidth;
            
            //Grid dimensions:
            cv::Rect gridBox;
            gridBox.x = boundigRect.x + xo;
            gridBox.y = boundigRect.y + yo;
            gridBox.width = gridWidth;
            gridBox.height = gridHeight;

            //Draw a rectangle using the grid dimensions:
            cv::rectangle( testImage, gridBox, cv::Scalar(0,0,255), 5 );

            //Int to string:
            std::string gridCounterString = std::to_string( gridCounter );

            //String position:
            cv::Point textPosition;
            textPosition.x = gridBox.x + 0.5 * gridBox.width;
            textPosition.y = gridBox.y + 0.5 * gridBox.height;

            //Draw string:
            cv::putText( testImage, gridCounterString, textPosition, cv::FONT_HERSHEY_SIMPLEX,
                         1, cv::Scalar(255,0,0), 3, cv::LINE_8, false );

            gridCounter++;

        }

    }

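Here, for each grid, I draw its rectangle and a nice number at its center. The rectangle-drawing function needs a rectangle definition: the top-left starting coordinates plus the rectangle's width and height, which are given by the gridBox variable of type cv::Rect.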
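The grid arithmetic itself needs nothing beyond NumPy, and the same cells can be used for the second half of the question, reading off each square's color. The sketch below is an assumption-laden illustration rather than part of the original answer: `grid_cells`, `mean_cell_colors` and the `margin` default are hypothetical, and averaging the central patch is just one simple way to estimate a sticker's color.

```python
import numpy as np


def grid_cells(rect, rows=3, cols=3):
    """Split (x, y, w, h) into rows*cols (x, y, w, h) cells, row by row."""
    x, y, w, h = rect
    cell_w, cell_h = w / cols, h / rows
    return [(int(x + c * cell_w), int(y + r * cell_h),
             int(cell_w), int(cell_h))
            for r in range(rows) for c in range(cols)]


def mean_cell_colors(bgr, rect, margin=0.25):
    """Average BGR color of the central part of each cell.

    `margin` trims that fraction off every side so the dark gaps
    between stickers do not skew the mean (an assumed default).
    """
    colors = []
    for x, y, w, h in grid_cells(rect):
        dx, dy = int(w * margin), int(h * margin)
        patch = bgr[y + dy:y + h - dy, x + dx:x + w - dx]
        colors.append(patch.reshape(-1, 3).mean(axis=0))
    return colors


cells = grid_cells((30, 60, 90, 60))
# cells[0] == (30, 60, 30, 20), cells[8] == (90, 100, 30, 20)
```

To turn the mean values into color names, you would still have to compare them against reference colors, which is easier in HSV or Lab than in BGR.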

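Here's a cool animation of how the cube gets divided into 9 grids:

Here's the final image!

Some suggestions:

Your source image is way too big; try resizing it to a smaller size, operate on it, and scale the results back.
Implement the area filter. It's very handy for getting rid of small blobs of pixels.
Depending on your images (I only tested the one you posted in your question) and the perspective distortion introduced by the camera, a simple contour-to-boundingRect conversion might not be enough. In that case, another approach is to get the four corner points of the cube outline via Hough line detection.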

python

opencv

rubiks-cube