Parse number from an image pytesseract

Question

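I am trying to parse a number from an image. Here is a sample of the image:

I first tried extracting all of the text to see what comes out, but the code fails to recognize the number I need. Here is my attempt: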

from PyPDF2 import PdfFileWriter, PdfFileReader
import fitz, pytesseract, os, re
import cv2


def readNumber(img):
    img = cv2.imread(img)
    gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    txt = pytesseract.image_to_string(gry)
    return txt

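I am trying to parse the number after the slash in the second line. The expected output here is 502630.

Here is another sample image from which Ahx's code failed to parse the number: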

Answer 1

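I think you are missing the image processing part.

You can apply adaptive thresholding.

For example:

Now, you want the number in 22 / 502630, so you need to check whether the current line contains the '/' character. If it does, take the part to the right of it.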

for line in text.split('\n'):
    if '/' in line:
        # Strip first so a space right after the slash does not yield an empty token
        line = line.split('/')[1].strip().split(' ')[0]
        print(line)

The result will be:

Code:

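import cv2
import pytesseract

# Load the image and upscale it 3x before thresholding
bgr_image = cv2.imread("C8EE6.png")
scaled_image = cv2.resize(bgr_image, (0, 0), fx=3, fy=3)
gray_image = cv2.cvtColor(scaled_image, cv2.COLOR_BGR2GRAY)
# Adaptive threshold with blockSize=61 and C=93
thresh = cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 61, 93)
# Tesseract options need the leading dashes: "--psm 6" treats the image as one uniform block of text
text = pytesseract.image_to_string(thresh, config="--psm 6")

for line in text.split('\n'):
    if '/' in line:
        # Strip first so a space right after the slash does not yield an empty token
        line = line.split('/')[1].strip().split(' ')[0]
        print(line)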
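
As a side note, the "take the part to the right of the slash" step can also be written with a regular expression, which tolerates optional spaces around the slash and ignores any trailing non-digit characters. This is only a sketch: the helper name `numbers_after_slash` and the sample text are made up for illustration and are not part of the answer above.

```python
import re

# Hypothetical helper (not from the original answer): digits that follow a '/',
# with optional whitespace between the slash and the first digit.
SLASH_NUMBER = re.compile(r"/\s*(\d+)")


def numbers_after_slash(text):
    """Return the first run of digits after a '/' on each line that has one."""
    found = []
    for line in text.splitlines():
        match = SLASH_NUMBER.search(line)
        if match:
            found.append(match.group(1))
    return found


# Made-up OCR output, used only to exercise the helper.
sample_text = "header line\n22 / 502630 abc\nno slash here"
print(numbers_after_slash(sample_text))
```

Lines whose slash is not followed by digits are skipped instead of producing an empty string.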

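For the second image, if we apply the previous solution, the result will be:

We need different parameters for this image, because the thresholded output is not clear. We need to change the constant C and the block size.

From the documentation: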

The blockSize determines the size of the neighborhood area and C is a constant that is subtracted from the mean or weighted sum of the neighborhood pixels.
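
To make the roles of blockSize and C more concrete, here is a minimal pure-Python sketch of the mean variant of adaptive thresholding on a tiny grayscale grid. It is a simplification under stated assumptions: the answer's code uses the Gaussian-weighted variant through `cv2.adaptiveThreshold`, while this sketch uses a plain average, replicates edge pixels at the border, and the function name `adaptive_mean_threshold` is made up for illustration.

```python
def adaptive_mean_threshold(img, block_size, c, max_value=255):
    """Threshold each pixel against (mean of its block_size x block_size neighborhood) - c.

    `img` is a list of rows of grayscale values. Simplified sketch: plain mean
    instead of Gaussian weights, and border pixels are replicated at the edges.
    """
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")
    height, width = len(img), len(img[0])
    radius = block_size // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), height - 1)  # clamp to the image
                    xx = min(max(x + dx, 0), width - 1)
                    total += img[yy][xx]
            local_threshold = total / (block_size * block_size) - c
            row.append(max_value if img[y][x] > local_threshold else 0)
        out.append(row)
    return out


# A dark pixel surrounded by bright ones: only the dark pixel falls below its local threshold.
tiny = [[200, 200, 200],
        [200, 50, 200],
        [200, 200, 200]]
for row in adaptive_mean_threshold(tiny, block_size=3, c=2):
    print(row)
```

A larger blockSize averages over a wider area, and a larger C lowers the local threshold, so more pixels end up white.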

If we set blockSize=13 and C=2, the output image will be:

If you compare the two images, the latter is more readable than the former. Now, if you read it:

Updated code:

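import cv2
import pytesseract

# This image is thresholded at its original size, with blockSize=13 and C=2
bgr_image = cv2.imread("UVetb.png")
gray_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
thresh = cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 13, 2)
# Tesseract options need the leading dashes: "--psm 6" treats the image as one uniform block of text
text = pytesseract.image_to_string(thresh, config="--psm 6")

for line in text.split('\n'):
    if '/' in line:
        # Strip first so a space right after the slash does not yield an empty token
        line = line.split('/')[1].strip().split(' ')[0]
        print(line)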

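Does the updated code work for every image?

There is no guarantee, because each image may need a different block size and C parameter to get the desired result.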
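
Since the parameters have to be tuned per image, one possible way to avoid editing them by hand is to try a small grid of candidates and stop at the first pair whose OCR text contains a number after a slash. This is only a sketch under assumptions: the function names `sweep_threshold_params` and `ocr_with_opencv` and the candidate lists are hypothetical, and the first hit is not guaranteed to be the correct reading, so the result should still be checked.

```python
import re


def first_number_after_slash(text):
    """Return the first run of digits that follows a '/', or None."""
    match = re.search(r"/\s*(\d+)", text)
    return match.group(1) if match else None


def sweep_threshold_params(ocr, block_sizes, constants):
    """Try each (blockSize, C) pair; return (blockSize, C, number) for the first hit.

    `ocr` is any callable taking (block_size, c) and returning recognized text.
    """
    for block_size in block_sizes:
        if block_size < 3 or block_size % 2 == 0:
            continue  # cv2.adaptiveThreshold expects an odd blockSize greater than 1
        for c in constants:
            number = first_number_after_slash(ocr(block_size, c))
            if number is not None:
                return block_size, c, number
    return None


def ocr_with_opencv(path):
    """Build an `ocr` callable backed by OpenCV and pytesseract (imported lazily)."""
    import cv2
    import pytesseract

    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    def run(block_size, c):
        thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                       cv2.THRESH_BINARY, block_size, c)
        return pytesseract.image_to_string(thresh, config="--psm 6")

    return run
```

With the second image from the question, a call could look like `sweep_threshold_params(ocr_with_opencv("UVetb.png"), [13, 31, 61], [2, 10, 93])`. The candidate values here are arbitrary apart from the two pairs used in the answer.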

python

python-tesseract