Is it possible to extract text from a specific portion of an image using pytesseract

Question

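I have a bounding box (the coordinates of a rectangle) in an image and want to extract the text inside those coordinates. How can I use pytesseract to extract the text within that region?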

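I tried copying the image region into another NumPy array with OpenCV, like this:

cropped_image = image[y1:y2][x1:x2]

and then ran pytesseract.image_to_string() on it, but the accuracy was very poor. When I passed the original image to pytesseract.image_to_string(), however, it extracted everything perfectly.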

Is there a function for extracting a specific portion of an image with pytesseract?

This image has different sections of information. Suppose I have rectangle coordinates enclosing 'Online food delivering system': how do I extract that text with pytesseract?

Please help. Thanks in advance.

Versions I am using: Tesseract 4.0.0, pytesseract 0.3.0, OpenCV 3.4.3

Answer 1

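Pytesseract has no built-in function for extracting a specific portion of an image, but we can use OpenCV to extract the ROI bounding box and then pass that ROI to Pytesseract. We convert the image to grayscale, then threshold it to obtain a binary image. Assuming you have the desired ROI coordinates, we use NumPy slicing to extract the ROI.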

From here we pass it to Pytesseract to get our result:

ONLINE FOOD DELIVERY SYSTEM
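
A side note on the slicing step: the crop in the question, image[y1:y2][x1:x2], chains two slices, and the second one is applied to rows again rather than to columns. That may be one reason the cropped result was OCR'd poorly. The sketch below uses a small synthetic array to show the difference; the array and the corner values are illustrative assumptions, not taken from the question's image.

```python
import numpy as np

# Synthetic 6x8 "image" whose pixel value encodes its position (row * 10 + col).
image = np.array([[r * 10 + c for c in range(8)] for r in range(6)])

# Illustrative box corners (hypothetical values, not from the question).
x1, y1, x2, y2 = 2, 1, 5, 4

# Chained indexing: the second slice selects rows of the first result again.
chained = image[y1:y2][x1:x2]

# Two-dimensional indexing: rows y1..y2 and columns x1..x2 in a single step.
roi = image[y1:y2, x1:x2]

print(chained.shape)  # (1, 8)
print(roi.shape)      # (3, 3)
```

With the comma form, the crop has the expected height and width, which is the same form the answer's code uses.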

Code

import cv2
import pytesseract

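# Path to the Tesseract executable (Windows install location; adjust for your system)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load as grayscale (flag 0), then binarize with Otsu's threshold.
# Inverting the THRESH_BINARY_INV result leaves pixels above the threshold white.
image = cv2.imread('1.jpg', 0)
thresh = 255 - cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# ROI as top-left corner (x, y) plus width and height; NumPy indexes rows (y) first
x, y, w, h = 37, 625, 309, 28
ROI = thresh[y:y+h, x:x+w]

# --psm 6 tells Tesseract to assume a single uniform block of text
data = pytesseract.image_to_string(ROI, lang='eng', config='--psm 6')
print(data)

# Show the thresholded image and the cropped ROI until a key is pressed
cv2.imshow('thresh', thresh)
cv2.imshow('ROI', ROI)
cv2.waitKey()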
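
If the box is given as two corners, as in the question, rather than as x, y, w, h, a small helper can do the crop. This is only a sketch: crop_box and its pad parameter are hypothetical names introduced here, and the blank array stands in for a real thresholded image. The helper clamps the box to the image bounds and can optionally widen it by a few pixels, since a crop that cuts tightly into the glyphs tends to be harder to recognize.

```python
import numpy as np

def crop_box(image, x1, y1, x2, y2, pad=0):
    """Return the region between corners (x1, y1) and (x2, y2), clamped to the image.

    Hypothetical helper: `pad` widens the box by that many pixels on each side.
    """
    height, width = image.shape[:2]
    left = max(min(x1, x2) - pad, 0)
    top = max(min(y1, y2) - pad, 0)
    right = min(max(x1, x2) + pad, width)
    bottom = min(max(y1, y2) + pad, height)
    # Rows (y) come first, then columns (x), in a single 2-D slice.
    return image[top:bottom, left:right]

# Blank stand-in for a thresholded page; the size is an assumption.
page = np.zeros((700, 400), dtype=np.uint8)

# Same box as in the answer, expressed as two corners, with 5 px of padding.
roi = crop_box(page, 37, 625, 37 + 309, 625 + 28, pad=5)
print(roi.shape)  # (38, 319)
```

The returned array can then be passed to pytesseract.image_to_string() in the same way as ROI in the code above.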

python

ocr

opencv

image

image-processing