How to infer the state of a shape from colors

Question

I have Lego cubes arranged in a 4x4 shape, and I'm trying to infer the state of each region inside the image:

empty or full, and whether the color is yellow or blue.

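To simplify my work, I added red markers that define the boundary of the shape, since the camera sometimes shakes.
Here is a clear image of the shape I'm trying to detect, taken with my phone camera:

(Edit: please note that this image is not my input image; it is only used to demonstrate the required shape clearly.)

The shape as seen from the side camera that I am supposed to use looks like this:

(Edit: now this is my input image.)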

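To focus the work on the working area, I created a mask:

What I have tried so far is to locate the red markers by color (simple thresholding, without the HSV color space), as follows: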

import numpy as np
import matplotlib.pyplot as plt
import cv2

img = cv2.imread('sample.png')
RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

mask = cv2.imread('mask.png')
masked = np.minimum(RGB, mask)

masked[masked[...,1]>25] = 0
masked[masked[...,2]>25] = 0
masked = masked[..., 0]

masked = cv2.medianBlur(masked,5)

plt.imshow(masked, cmap='gray')
plt.show()

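So far I have managed to detect the markers:

But I am still stuck:

How can I precisely detect the outer boundary of the required region, as well as the inner boundaries inside the red markers (the boundary of each yellow, blue, or green Lego cube)?

Thanks in advance for your kind advice.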

Answer 1

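I tested this approach using your undistorted image. Suppose you have the rectified camera image, so you see the Lego bricks from a "bird's eye" perspective. The idea is to use the red markers to estimate a center rectangle and crop that portion of the image. Then, since you know each brick's dimensions (and they are constant), you can trace a grid and extract each cell of the grid. You can then compute some HSV-based masks to estimate the dominant color in each cell, and that way you know whether the space is occupied by a yellow or blue brick, or is empty.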

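These are the steps:

1. Get an HSV mask of the red markers
2. Use each marker's coordinates to estimate the center rectangle
3. Crop the center rectangle
4. Divide the rectangle into cells (this is the grid)
5. Run a series of HSV-based masks on each cell and compute the dominant color
6. Label each cell with the dominant color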

Let's look at the code:

# Importing cv2 and numpy:
import numpy as np
import cv2

# image path
path = "D://opencvImages//"
fileName = "Bg9iB.jpg"

# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Store a deep copy for results:
inputCopy = inputImage.copy()

# Convert the image to HSV:
hsvImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2HSV)

# The HSV mask values (Red):
lowerValues = np.array([127, 0, 95])
upperValues = np.array([179, 255, 255])

# Create the HSV mask
mask = cv2.inRange(hsvImage, lowerValues, upperValues)

The first part is very straightforward: you set the HSV range and use cv2.inRange to get a binary mask of the target color. This is the result:

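We can further improve the binary mask using some morphology. Let's apply a closing with a somewhat large structuring element and 10 iterations. We want the markers to be as well defined as possible: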

# Set kernel (structuring element) size:
kernelSize = 5
# Set operation iterations:
opIterations = 10
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)

This yields:

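Very nice. Now, let's detect contours on this mask. We will approximate each contour to a bounding box and store its starting point and dimensions. The idea is that, while we will detect every contour, we cannot be sure of their order. We can sort this list later and get each bounding box from left to right, top to bottom, to better estimate the center rectangle. Let's detect the contours: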

# Create a deep copy, convert it to BGR for results:
maskCopy = mask.copy()
maskCopy = cv2.cvtColor(maskCopy, cv2.COLOR_GRAY2BGR)

# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)

# Bounding Rects are stored here:
boundRectsList = []

# Process each contour 1-1:
for i, c in enumerate(contours):

    # Approximate the contour to a polygon:
    contoursPoly = cv2.approxPolyDP(c, 3, True)

    # Convert the polygon to a bounding rectangle:
    boundRect = cv2.boundingRect(contoursPoly)

    # Get the bounding rect's data:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]

    # Estimate the bounding rect area:
    rectArea = rectWidth * rectHeight

    # Set a min area threshold
    minArea = 100

    # Filter blobs by area:
    if rectArea > minArea:
        #Store the rect:
        boundRectsList.append(boundRect)

I also created a deep copy of the mask image for later use, mainly to draw this image, which is the result of the contour detection and bounding box approximation:

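Notice that I have included a minimum area condition. I want to ignore noise below a certain threshold defined by minArea. All right, now we have the bounding boxes in the boundRectsList variable. Let's sort these boxes by their y coordinate: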

# Sort the list based on ascending y values:
boundRectsSorted = sorted(boundRectsList, key=lambda x: x[1])
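
Note that sorting by y alone orders the rows, but it does not by itself guarantee left-to-right order within a row, for example when the top-right marker sits a few pixels higher than the top-left one. A minimal sketch of a row-aware sort, assuming exactly two markers per row (the helper name `sort_rects_row_major` is hypothetical):

```python
def sort_rects_row_major(rects, per_row=2):
    """Order (x, y, w, h) boxes top-to-bottom, then left-to-right within each row.

    Assumes the boxes form complete rows of `per_row` markers each.
    """
    # First pass: order everything by the y coordinate.
    by_y = sorted(rects, key=lambda r: r[1])
    ordered = []
    # Second pass: take the boxes one row at a time and order each row by x.
    for start in range(0, len(by_y), per_row):
        row = by_y[start:start + per_row]
        ordered.extend(sorted(row, key=lambda r: r[0]))
    return ordered

# Made-up marker boxes; the top-right one is slightly higher than the top-left one.
markers = [(410, 12, 40, 40), (10, 15, 40, 40), (12, 420, 40, 40), (408, 418, 40, 40)]
sorted_markers = sort_rects_row_major(markers)
```

With the made-up boxes above, a plain sort by y would put the top-right marker first, while the row-aware version returns top-left, top-right, bottom-left, bottom-right.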

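Now that the list is sorted, we can enumerate the boxes from left to right, top to bottom, like this: first "row" -> 0, 1; second "row" -> 2, 3. We can use this information to define the big central rectangle. I call these the "inner points". Notice that the rectangle is defined as a function of all the bounding boxes. For example, its top-left starting point is defined by bounding box 0's bottom-right corner (both x and y). Its width is defined by the x coordinate of bounding box 1's bottom-left corner, and its height by the y coordinate of bounding box 2's top-right corner. I'm going to loop through each bounding box and extract the relevant dimensions to build the center rectangle in the form (top left x, top left y, width, height). There is more than one way of achieving this. I prefer to use a dictionary to get the relevant data. Let's see: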

# Rectangle dictionary:
# Each entry is an index of the currentRect list
# 0 - X, 1 - Y, 2 - Width, 3 - Height
# Additionally: -1 is 0 (no dimension):
pointsDictionary = {0: (2, 3),
                    1: (-1, 3),
                    2: (2, -1),
                    3: (-1, -1)}

# Store center rectangle coordinates here:
centerRectangle = [None]*4

# Process the sorted rects:
rectCounter = 0

for i in range(len(boundRectsSorted)):

    # Get sorted rect:
    currentRect = boundRectsSorted[i]

    # Get the bounding rect's data:
    rectX = currentRect[0]
    rectY = currentRect[1]
    rectWidth = currentRect[2]
    rectHeight = currentRect[3]

    # Draw sorted rect:
    cv2.rectangle(maskCopy, (int(rectX), int(rectY)), (int(rectX + rectWidth),
                             int(rectY + rectHeight)), (0, 255, 0), 5)

    # Get the inner points:
    currentInnerPoint = pointsDictionary[i]
    borderPoint = [None]*2

    # Check coordinates:
    for p in range(2):
        # Check for '0' index:
        idx = currentInnerPoint[p]
        if idx == -1:
            borderPoint[p] = 0
        else:
            borderPoint[p] = currentRect[idx]

    # Draw the border points:
    color = (0, 0, 255)
    thickness = -1
    centerX = rectX + borderPoint[0]
    centerY = rectY + borderPoint[1]
    radius = 50
    cv2.circle(maskCopy, (centerX, centerY), radius, color, thickness)

    # Mark the circle
    org = (centerX - 20, centerY + 20)
    font = cv2.FONT_HERSHEY_SIMPLEX
    cv2.putText(maskCopy, str(rectCounter), org, font,
            2, (0, 0, 0), 5, cv2.LINE_8)

    # Show the circle:
    cv2.imshow("Sorted Rects", maskCopy)
    cv2.waitKey(0)

    # Store the coordinates into list
    if rectCounter == 0:
        centerRectangle[0] = centerX
        centerRectangle[1] = centerY
    else:
        if rectCounter == 1:
            centerRectangle[2] = centerX - centerRectangle[0]
        else:
            if rectCounter == 2:
                centerRectangle[3] = centerY - centerRectangle[1]
    # Increase rectCounter:
    rectCounter += 1

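This image shows each inner point as a red circle. The circles are enumerated from left to right, top to bottom. The inner points are stored in the centerRectangle list: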
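
For reference, the same center rectangle can be computed without the lookup dictionary. This is only a compact sketch of the rule described above; it assumes the four boxes are already ordered top-left, top-right, bottom-left, bottom-right, and the helper name `center_rect_from_markers` is hypothetical:

```python
def center_rect_from_markers(rects):
    """Return (x, y, width, height) of the area enclosed by four corner markers.

    `rects` holds four (x, y, w, h) boxes ordered top-left, top-right,
    bottom-left, bottom-right.
    """
    top_left, top_right, bottom_left, _ = rects
    # Start at the bottom-right corner of the top-left marker:
    x0 = top_left[0] + top_left[2]
    y0 = top_left[1] + top_left[3]
    # The width ends where the top-right marker begins (its left edge):
    width = top_right[0] - x0
    # The height ends where the bottom-left marker begins (its top edge):
    height = bottom_left[1] - y0
    return (x0, y0, width, height)

# Made-up marker boxes in row-major order.
markers = [(10, 15, 40, 40), (410, 12, 40, 40), (12, 420, 40, 40), (408, 418, 40, 40)]
center = center_rect_from_markers(markers)
```

The fourth marker is not needed for an axis-aligned rectangle, which matches the loop above: nothing is stored for rectCounter 3.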

If you join the inner points, you get the center rectangle we have been looking for:

# Check out the big rectangle at the center:
bigRectX = centerRectangle[0]
bigRectY = centerRectangle[1]
bigRectWidth = centerRectangle[2]
bigRectHeight = centerRectangle[3]
# Draw the big rectangle:
cv2.rectangle(maskCopy, (int(bigRectX), int(bigRectY)), (int(bigRectX + bigRectWidth),
                     int(bigRectY + bigRectHeight)), (0, 0, 255), 5)
cv2.imshow("Big Rectangle", maskCopy)
cv2.waitKey(0)

Check it out:

Now, just crop this portion of the original image:

# Crop the center portion:
centerPortion = inputCopy[bigRectY:bigRectY + bigRectHeight, bigRectX:bigRectX + bigRectWidth]

# Store a deep copy for results:
centerPortionCopy = centerPortion.copy()

This is the central portion of the image:

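Cool, now let's create the grid. You know that there must be 4 bricks along the width and 4 bricks along the height. We can use this information to divide the image. I'm storing each sub-image, or cell, in a list. I'm also estimating each cell's center for additional processing; these are stored in a list as well. Let's see the procedure: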

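# Divide the image into a grid:
verticalCells = 4
horizontalCells = 4

# Cell dimensions: the width is split across the columns (horizontalCells)
# and the height across the rows (verticalCells):
cellWidth = bigRectWidth / horizontalCells
cellHeight = bigRectHeight / verticalCells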

# Store the cells here:
cellList = []

# Store cell centers here:
cellCenters = []

# Loop thru vertical dimension:
for j in range(verticalCells):

    # Cell starting y position:
    yo = j * cellHeight

    # Loop thru horizontal dimension:
    for i in range(horizontalCells):

        # Cell starting x position:
        xo = i * cellWidth

        # Cell Dimensions:
        cX = int(xo)
        cY = int(yo)
        cWidth = int(cellWidth)
        cHeight = int(cellHeight)

        # Crop current cell:
        currentCell = centerPortion[cY:cY + cHeight, cX:cX + cWidth]

        # Store the current cell in the cell list:
        cellList.append(currentCell)

        # Store cell center:
        cellCenters.append((cX + 0.5 * cWidth, cY + 0.5 * cHeight))

        # Draw Cell
        cv2.rectangle(centerPortionCopy, (cX, cY), (cX + cWidth, cY + cHeight), (255, 255, 0), 5)

    cv2.imshow("Grid", centerPortionCopy)
    cv2.waitKey(0)

This is the grid:
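
Because the cell size and offsets are truncated with int(), a few pixels may end up outside every cell when the rectangle size is not a multiple of 4. One possible alternative, sketched here with hypothetical helper names, computes rounded cell edges so that the cells tile the whole rectangle:

```python
def grid_edges(length, cells):
    """Return cells + 1 integer edge positions spanning 0..length."""
    return [round(i * length / cells) for i in range(cells + 1)]

def cell_boxes(width, height, cols=4, rows=4):
    """Return (x, y, w, h) for every cell, row by row, with no gaps or overlaps."""
    xs = grid_edges(width, cols)
    ys = grid_edges(height, rows)
    boxes = []
    for j in range(rows):
        for i in range(cols):
            # Each cell runs from one edge to the next, so neighbours share borders.
            boxes.append((xs[i], ys[j], xs[i + 1] - xs[i], ys[j + 1] - ys[j]))
    return boxes

# Made-up rectangle size that is not divisible by 4.
boxes = cell_boxes(362, 365)
```

Cells may then differ in size by one pixel, which should not matter for counting the dominant color.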

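Now let's process each cell individually. Of course, you could process each cell in the previous loop, but I'm not looking for optimization right now; clarity is my priority. We need to generate a series of HSV masks for the target colors: yellow, blue and green (empty). I prefer to implement a dictionary with the target colors. I'll generate a mask for each color and count the number of white pixels using cv2.countNonZero. Again, I set a minimum threshold, this time 10. With this information I can determine which mask produced the largest number of white pixels, which gives me the dominant color: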

# HSV dictionary - color ranges and color name:
colorDictionary = {0: ([93, 64, 21], [121, 255, 255], "blue"),
                   1: ([20, 64, 21], [30, 255, 255], "yellow"),
                   2: ([55, 64, 21], [92, 255, 255], "green")}

# Cell counter:
cellCounter = 0

for c in range(len(cellList)):

    # Get current Cell:
    currentCell = cellList[c]
    # Convert to HSV:
    hsvCell = cv2.cvtColor(currentCell, cv2.COLOR_BGR2HSV)

    # Some additional info:
    (h, w) = currentCell.shape[:2]

    # Process masks:
    maxCount = 10
    cellColor = "None"

    for m in range(len(colorDictionary)):

        # Get current lower and upper range values:
        currentLowRange = np.array(colorDictionary[m][0])
        currentUppRange = np.array(colorDictionary[m][1])

        # Create the HSV mask
        mask = cv2.inRange(hsvCell, currentLowRange, currentUppRange)

        # Get max number of target pixels
        targetPixelCount = cv2.countNonZero(mask)
        if targetPixelCount > maxCount:
            maxCount = targetPixelCount
            # Get color name from dictionary:
            cellColor = colorDictionary[m][2]

    # Get cell center, add an x offset:
    textX = int(cellCenters[cellCounter][0]) - 100
    textY = int(cellCenters[cellCounter][1])

    # Draw text on cell's center:
    font = cv2.FONT_HERSHEY_SIMPLEX
    cv2.putText(centerPortion, cellColor, (textX, textY), font,
                    2, (0, 0, 255), 5, cv2.LINE_8)

    # Increase cellCounter:
    cellCounter += 1

    cv2.imshow("centerPortion", centerPortion)
    cv2.waitKey(0)

This is the result:

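From here it is very easy to identify the empty spaces on the grid. What I didn't cover is the perspective rectification of your distorted image, but there is plenty of information on how to do that. Hope this helps you out!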
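
To turn the labels into the empty/full state the question asks about, the per-cell colors can be arranged into a 4x4 table. The sketch below assumes the labels were collected into a list in the same row-major order as cellList (for instance by appending cellColor inside the loop, which the code above does not do), and it treats both "green" and "None" as empty; the function name `labels_to_grid` is hypothetical:

```python
def labels_to_grid(labels, rows=4, cols=4, empty_labels=("green", "None")):
    """Arrange row-major cell labels into a rows x cols table of states.

    Labels listed in `empty_labels` become "empty"; any other label is kept
    as the brick color occupying that cell.
    """
    if len(labels) != rows * cols:
        raise ValueError("expected %d labels, got %d" % (rows * cols, len(labels)))
    states = ["empty" if label in empty_labels else label for label in labels]
    # Slice the flat list into one sub-list per row.
    return [states[r * cols:(r + 1) * cols] for r in range(rows)]

# Made-up labels standing in for the values of cellColor.
cell_labels = ["blue", "green", "yellow", "None"] * 4
grid = labels_to_grid(cell_labels)
```

Each entry of the resulting table is then either "empty", "yellow" or "blue".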

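Edit:

If you want to apply this approach to your distorted image, you need to undo the fish-eye and the perspective distortion. Your rectified image should look like this:

You will probably have to tweak some values, because some of the distortion still remains even after rectification.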

Tags: python, opencv, image-processing, computer-vision, scikit-image