Pytesseract: improving text detection

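I would like the entries from the following vehicle registration document to be written to a text file automatically.

However, text recognition is proving very difficult. I tried opening the image with different configurations, and I also tested different color levels of the registration document. None of my attempts produced usable results.

Does anyone know how to recognize the text correctly?

This is the image I am trying to OCR:

The code I am using looks like this: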

import cv2
import numpy as np
import pytesseract
import matplotlib.pyplot as plt
from PIL import Image
import regex

pytesseract.pytesseract.tesseract_cmd=r'C:\Program Files\Tesseract-OCR\tesseract.exe'

img = cv2.imread("Fahrzeugscheinsplit1.jpg")

result = pytesseract.image_to_string(img)
print(result)

My output looks like this:

|
08.05.2006)'| 8566) ADVOOOO1X
ne r pear
a BORD 7 aoe \
‘BWY i
QUBB1 Repieee ay a f
TRAC |
| = say, |
is Mondeo ath }
FO! s 1
Fz.2.Pers, +b. 8 Spl. .
Kombilimousine
vo) EURO 4
«| BURO 4 ) Re !
» Diesel ES
ll 0002. WW 0d62. l2198 |

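First, you should be familiar with the image-processing techniques that improve Tesseract's output. According to the official documentation, you can apply simple thresholding.

If you apply simple thresholding, the result will be:

I think we should center the image so it can be recognized accurately. We can center the image by adding a border:

The image is now ready for text extraction. If we process it and keep only detections with a confidence above 30:

Almost all of the text in the given input image is detected. We can also print the values of the detected text: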

Detected Text: 08.05.2006
Detected Text: 8566!
Detected Text: M1
Detected Text: AC
Detected Text: 8
Detected Text: 6
Detected Text: FORD
Detected Text: BWY
Detected Text: SFHAP7
Detected Text: Mondeo
Detected Text: FORD
Detected Text: (D)
Detected Text: Pz.z.Pers.bef.b.
Detected Text: 8
Detected Text: Spl.
Detected Text: Kombilimousine
Detected Text: EURO
Detected Text: 4
Detected Text: EURO
Detected Text: 4
Detected Text: Diesel
Detected Text: 0002
Detected Text: 0462
Detected Text: 2198

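With simple thresholding we recover almost all of the correct values. For the missing parts, you can tune the parameters, for example by lowering the confidence cutoff or raising the threshold, or switch to another thresholding method such as adaptive thresholding or inRange thresholding.

Code: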

from cv2 import imread, cvtColor, COLOR_BGR2GRAY as GRAY
from cv2 import imshow, waitKey, rectangle, threshold, THRESH_BINARY as BINARY
from cv2 import copyMakeBorder as addBorder, BORDER_CONSTANT as CONSTANT
from pytesseract import image_to_data, Output


bgr = imread("UXvS7.jpg")
gray = cvtColor(bgr, GRAY)
border = addBorder(gray, 50, 50, 50, 50, CONSTANT, value=255)
thresh = threshold(border, 150, 255, BINARY)[1]
data = image_to_data(thresh, output_type=Output.DICT)

for i in range(0, len(data["text"])):
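    # conf may be an int, a float, or a numeric string depending on the Tesseract/pytesseract version
    confidence = int(float(data["conf"][i]))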
    if confidence > 30:
        x = data["left"][i]
        y = data["top"][i]
        w = data["width"][i]
        h = data["height"][i]
        text = data["text"][i]
        print(f"Detected Text: {text}")
        rectangle(thresh, (x, y), (x + w, y + h), (0, 255, 0), 2)

imshow("", thresh)
waitKey(0)
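The question asks for the entries to end up in a text file, whereas the loop above only prints them. A minimal sketch of that last step, assuming `data` is the dictionary returned by `image_to_data(..., output_type=Output.DICT)`: the helper names `words_to_lines` and `write_lines` are hypothetical, and grouping by `block_num`, `par_num`, and `line_num` is one possible way to rebuild lines, not the only one.

```python
from collections import defaultdict


def words_to_lines(data, min_conf=30):
    """Join confident words from an image_to_data dict into text lines."""
    lines = defaultdict(list)
    for i, text in enumerate(data["text"]):
        # conf may arrive as int, float, or numeric string; -1 marks non-word rows.
        if float(data["conf"][i]) > min_conf and text.strip():
            key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
            lines[key].append(text)
    # Sorting the keys keeps Tesseract's block/paragraph/line reading order.
    return [" ".join(words) for _, words in sorted(lines.items())]


def write_lines(lines, path):
    """Write one recognised line per row of a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
```

For example, `write_lines(words_to_lines(data), "fahrzeugschein.txt")` would store the filtered result, where the output filename is only a placeholder.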