Detection of grid points from scanned prints in Python / OpenCV

Question

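I have scanned documents printed by different inkjet printers (Epson, HP, Canon, and so on). Each scan is very high resolution (files of around 1.6 GB), so you can zoom in and see the halftone dots of the image, which is printed with frequency-modulated screening.

My task is to extract features based on the grid points (the halftone dots): the dot pattern, the distances between dots, and so on.

One relevant feature is the size of these dots. Each printer prints them at a different size, so I need to compute the mean and standard deviation.

Later I will have to train an ML model that classifies a print as coming from a specific printer (so basically "this print belongs to printer XYZ").

Right now, though, I am already struggling with feature engineering and preprocessing, because this is my first computer vision project and I am not very familiar with OpenCV.

My idea is to use OpenCV to binarize the image and then detect edges with a Sobel or Prewitt filter, or something similar. So I assume I have to blur first and then run the edge detection?

I am not sure whether this is the right approach, which is why I am asking here. What do you think? I would be glad to get some hints or steps toward a good way of doing this.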

Answer 1

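Here is one way to do that in Python/OpenCV.

Threshold on color using cv2.inRange(). In this case, I threshold on the blue dots. Then get all the external contours to find every isolated region. From the contours, compute the equivalent circular diameters. Then compute the mean and standard deviation.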
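To make the threshold bounds easier to read: cv2.imread() returns pixels in BGR order, and cv2.inRange() keeps a pixel only if every channel lies between the lower and upper bound, inclusive. The following is a minimal pure-Python sketch of that per-pixel test. The helper name in_bgr_range and the two sample pixels are made up for illustration and are not part of OpenCV or of the original script.

```python
def in_bgr_range(pixel, lower, upper):
    """Return True when every channel of a (B, G, R) pixel lies in [lower, upper]."""
    return all(lo <= value <= hi for value, lo, hi in zip(pixel, lower, upper))


# Bounds from the answer, read as (blue, green, red)
lower = (190, 150, 100)
upper = (255, 255, 170)

# Hypothetical sample pixels, chosen only to illustrate the test
light_blue = (230, 200, 120)   # B=230, G=200, R=120: all channels inside the bounds
paper_white = (255, 255, 255)  # R=255 exceeds the upper red bound of 170

print(in_bgr_range(light_blue, lower, upper))
print(in_bgr_range(paper_white, lower, upper))
```

For another ink color or another scanner, the bounds would have to be re-tuned on sample pixels from the scan itself.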

Input:

import cv2
import numpy as np
import math

img = cv2.imread("color_dots.png")

# threshold on blue color
lower = (190,150,100)
upper = (255,255,170)
thresh = cv2.inRange(img, lower, upper)

# get external contours
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
count = len(contours)

# running totals (named so they do not shadow the built-in sum)
total = 0.0
total_sq = 0.0
for cntr in contours:
    # get area from contours and then diameters of equivalent circles
    area = cv2.contourArea(cntr)
    # area = pi*radius**2 = pi*(diameter/2)**2 = (pi/4)*diameter**2
    # diameter = sqrt(4*area/pi) = 2*sqrt(area/pi)
    diameter = 2 * math.sqrt(area/math.pi)
    total = total + diameter
    total_sq = total_sq + diameter * diameter

# compute the mean diameter and the mean squared diameter
average = total/count
average2 = total_sq/count

# compute the (population) standard deviation: sqrt(E[d^2] - E[d]^2)
# clamp at 0 in case rounding makes the difference slightly negative
variance = max(average2 - average*average, 0.0)
standard_deviation = math.sqrt(variance)

# print results
print("average:", average)
print("std_dev:", standard_deviation)

# save result
cv2.imwrite("color_dots_blue_threshold.png",thresh)

# display result
cv2.imshow("thresh", thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
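The statistics part of the script can also be written with the standard library, which makes it explicit that the loop computes a population standard deviation. This is only a sketch: the function names and the min_area parameter are my own additions, and dropping zero-area contours is an assumption (cv2.contourArea() can return 0 for contours that are a single pixel or a thin line, and those would otherwise pull the mean down).

```python
import math
import statistics


def equivalent_diameter(area):
    """Diameter of the circle that has the given area."""
    return 2.0 * math.sqrt(area / math.pi)


def diameter_stats(areas, min_area=0.0):
    """Mean and population standard deviation of equivalent diameters.

    Areas not larger than min_area are skipped (assumed to be noise).
    """
    diameters = [equivalent_diameter(a) for a in areas if a > min_area]
    if not diameters:
        raise ValueError("no contours left after filtering")
    return statistics.fmean(diameters), statistics.pstdev(diameters)


# Made-up areas for illustration: circles of diameter 2 and 4, plus one
# zero-area contour that gets filtered out
areas = [math.pi, 4 * math.pi, 0.0]
mean_d, std_d = diameter_stats(areas)
print("average:", mean_d)
print("std_dev:", std_d)
```

With the real data, areas would be the list of cv2.contourArea(cntr) values collected in the loop above.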

Threshold image:

Results:

average: 3.0747726858108635
std_dev: 0.541288251281962
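The question also mentions the distance between dots as a feature. The script above does not compute it, but here is one hedged way it could be sketched. It assumes you already have one (x, y) center per dot, for example from cv2.moments() as m10/m00 and m01/m00 for each contour with non-zero m00. The function name and the sample coordinates are hypothetical.

```python
import math
import statistics


def nearest_neighbor_distances(centers):
    """For each dot center, return the distance to its closest other center.

    Brute force, O(n^2): fine for a small crop, too slow for a full scan.
    """
    if len(centers) < 2:
        raise ValueError("need at least two dot centers")
    distances = []
    for i, (xi, yi) in enumerate(centers):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(centers)
            if j != i
        )
        distances.append(nearest)
    return distances


# Made-up centers for illustration: the corners of a 3 x 4 rectangle
centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
spacing = nearest_neighbor_distances(centers)
print("mean spacing:", statistics.fmean(spacing))
print("std spacing:", statistics.pstdev(spacing))
```

Because the dots of a frequency-modulated halftone are not on a regular grid, the spread of these distances may be as informative as their mean. On a full-resolution scan with many dots, you would probably want to work on crops or use a spatial index rather than this brute-force loop.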


python

opencv

computer-vision