Pytesseract image_to_data not able to read the numbers in my image

Question

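I'm currently working on a project where I use pyautogui and pytesseract to take a screenshot of the timer in a video game emulator I'm using, then try to read the image and determine what time I got. Here is the image I get when I use pyautogui to take a screenshot of the region I want:

When I tested pytesseract to make sure it was installed correctly, pytesseract.image_to_string() worked on an image of plain text, but when I use the picture of the in-game timer it outputs nothing. Does this have to do with the quality of the image, or is it some limitation of pytesseract?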

Answer 1

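You need to preprocess the image before performing OCR with Pytesseract. Here's a simple approach using OpenCV and Pytesseract OCR. The idea is to obtain a processed image where the text to extract is black and the background is white. To do this, we can convert to grayscale, apply a slight Gaussian blur, then apply Otsu's threshold to obtain a binary image. We perform text extraction using the --psm 6 configuration option, which assumes a single uniform block of text. See the Tesseract documentation on page segmentation modes for more options.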
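To make the thresholding step less of a black box, here is a minimal pure-Python sketch of what Otsu's method does: it tries every candidate threshold and keeps the one that maximizes the variance between the two resulting pixel classes. The helper names `otsu_threshold` and `binarize_inv` are hypothetical and written only for illustration. This is not OpenCV's implementation, and in tie cases it may pick a different threshold than `cv2.threshold` would.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold that maximizes between-class variance."""
    # Histogram of gray levels 0..255
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_low = 0    # number of pixels with value <= t
    sum_low = 0  # sum of intensities of those pixels
    for t in range(256):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue          # no pixels in the lower class yet
        w_high = total - w_low
        if w_high == 0:
            break             # every pixel is already in the lower class
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        # Between-class variance (up to a constant factor of 1 / total**2)
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize_inv(pixels, t):
    """Inverted binary threshold: values above t become 0, the rest 255."""
    return [0 if p > t else 255 for p in pixels]


# Toy "image": dark background (10) with bright digits (200), flattened to a list
toy = [10] * 50 + [200] * 50
t = otsu_threshold(toy)
binary = binarize_inv(toy, t)  # bright digits become black, background white
```

With an inverted threshold, bright text on a dark background ends up as black text on a white background, which is the form the answer aims for before running OCR.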

Input image

Otsu's threshold to obtain a binary image

Result from Pytesseract OCR

0’ 12”92

Code

import cv2
import pytesseract

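# Windows only: path to the Tesseract executable (adjust or remove on other systems)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Grayscale, Gaussian blur, then inverted Otsu's threshold
# (THRESH_BINARY_INV turns light text on a dark background into black text on white)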
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Perform text extraction
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)

cv2.imshow('thresh', thresh)
cv2.waitKey()
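Since the goal in the question is to determine the time, the OCR string still has to be turned into a number. Below is a minimal sketch, assuming the timer always reads as minutes, seconds, and hundredths in the form shown in the result above (0’ 12”92). The function name `parse_timer` and the set of accepted quote characters are assumptions for illustration, not part of pytesseract.

```python
import re

# OCR may return straight, curly, or prime-style marks, so accept several variants
MIN_MARK = "[’'′]"
SEC_MARK = "[”\"″]"

TIME_PATTERN = re.compile(
    r"(\d+)\s*" + MIN_MARK + r"\s*(\d{1,2})\s*" + SEC_MARK + r"\s*(\d{2})"
)


def parse_timer(text):
    """Convert an OCR string such as 0’ 12”92 into seconds, or None if no match."""
    match = TIME_PATTERN.search(text)
    if match is None:
        return None
    minutes, seconds, hundredths = (int(group) for group in match.groups())
    return minutes * 60 + seconds + hundredths / 100


elapsed = parse_timer("0’ 12”92")
print(elapsed)
```

Returning None for unparseable input lets the caller decide whether to retry the screenshot or skip the frame instead of crashing on a bad OCR read.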

python

ocr

image

image-processing

python-tesseract