Pytesseract - OCR on image with text in different colors

Question

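Pytesseract fails to extract the text when it appears in different colors. I tried inverting the image with OpenCV, but that does not work for the darker text color.

Image: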

import cv2
import pytesseract


def text(image):
    # Upscale 7x so the characters are large enough for Tesseract.
    image = cv2.resize(image, (0, 0), fx=7, fy=7)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cv2.imwrite("gray.png", gray)

    # Light blur to suppress noise before thresholding.
    blur = cv2.GaussianBlur(gray, (3, 3), 0)
    cv2.imwrite("gray_blur.png", blur)

    # With THRESH_OTSU the threshold is computed automatically; the 127 is ignored.
    thresh = cv2.threshold(blur, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    cv2.imwrite("thresh.png", thresh)

    # Morphological opening removes small specks.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
    cv2.imwrite("opening.png", opening)

    # Invert the binary image before passing it to Tesseract.
    invert = 255 - opening
    cv2.imwrite("invert.png", invert)

    # --psm 7 treats the image as a single line of text.
    data = pytesseract.image_to_string(invert, lang="eng", config="--psm 7")
    return data

Is there a way to extract the text from the given image: DEADLINE (in red) and WHITE HOUSE (in white)?

Answer 1

You can invert the image with ImageOps and then binarize it.

import numpy as np
import pytesseract
from PIL import Image, ImageOps

# Load as 8-bit grayscale and invert, so light text becomes dark.
img = Image.open("OCR.png").convert("L")
img = ImageOps.invert(img)
# img.show()

threshold = 240
table = []
pixel_array = img.load()
for y in range(img.size[1]):  # binarize the image row by row
    row = []
    for x in range(img.size[0]):
        # After inversion, only the near-black background is >= 240.
        if pixel_array[x, y] < threshold:
            row.append(0)
        else:
            row.append(255)
    table.append(row)

# Build the image from the array; uint8 so Pillow treats it as 8-bit grayscale.
img = Image.fromarray(np.array(table, dtype=np.uint8))
# img.show()

print(pytesseract.image_to_string(img))

Result:

In the end, img looks like this:
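
The invert-and-binarize step in this answer can also be written without the nested pixel loop. The following is only a minimal sketch of the same rule using NumPy; the helper name `binarize_inverted` is hypothetical, and the threshold of 240 is taken from the answer and may need tuning for other images.

```python
import numpy as np
from PIL import Image, ImageOps


def binarize_inverted(img, threshold=240):
    """Invert a PIL image, then map pixels below `threshold` to 0 and the rest to 255.

    `binarize_inverted` is a hypothetical helper name, not part of Pillow.
    """
    inverted = ImageOps.invert(img.convert("L"))
    pixels = np.asarray(inverted)
    # Same rule as the nested loop: < threshold -> 0 (black), otherwise 255 (white).
    binary = np.where(pixels < threshold, 0, 255).astype(np.uint8)
    return Image.fromarray(binary)
```

It would be used as `pytesseract.image_to_string(binarize_inverted(Image.open("OCR.png")))`. Because everything that is not near-black in the original ends up black, both the red and the white text should come out as dark text on a white background, assuming the background of the source image is close to black.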


Tags: python, ocr, opencv, python-imaging-library, python-tesseract