OpenCV OCR: improve data extraction from a color image with a background

Question

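I am trying to extract some information from a screenshot of a mobile phone screen. My code retrieves some of the information, but not all of it. I read the image, convert it to grayscale, remove the unwanted parts, and apply adaptive Gaussian thresholding. Still, not all of the text is read.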

import numpy as np
import cv2
from PIL import Image
import matplotlib.pyplot as plt
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Installs\Tools\Tesseract-OCR\tesseract.exe'

image = r"C:\Workspace\OCR\tesseract\rpstocks1 - Copy (2).png"
img = cv2.imread(image)
img_grey = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

height, width, channels = img.shape
print (height, width, channels)


rec_img=cv2.rectangle(img_grey,(30,100),(1040,704),(0,255,0),3).copy()

crop_img = rec_img[105:1945, 35:1035].copy()
cv2.medianBlur(img,5)
cv2.imwrite(r"C:\Workspace\OCR\tesseract\Cropped_GREY.jpg",crop_img)

img_gauss = cv2.adaptiveThreshold(crop_img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,11,12)
cv2.imwrite(r"C:\Workspace\OCR\tesseract\Cropped_Guass.jpg",img_gauss)
text = pytesseract.image_to_string(img_gauss, lang='eng')
text.encode('utf-8')
print(text)

Output

Image dimensions 704 1080 3

Investing

,712.99 
ASRT _ 0
500.46 shares  ......... ..  /0 
GNUS 
25169 Shares  """"" " ‘27.98%

Images: rpstocks1 - Copy (2).png, Cropped_GREY.jpg, Cropped_Guass.jpg

Answer 1

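Have a look at the page segmentation modes of pytesseract. For example, using config='-psm 12' will already give you all the desired text (newer Tesseract releases spell this option '--psm'). However, the graphs are also somehow interpreted as text.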
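As a rough illustration of switching the page segmentation mode, here is a minimal sketch. The helper names psm_config and ocr_with_psm are hypothetical and not part of pytesseract; only pytesseract.image_to_string and its config argument come from the library. The legacy_flag switch is an assumption to cover older builds that take the single-dash '-psm' used in this answer.

```python
def psm_config(psm: int, legacy_flag: bool = False) -> str:
    """Build the Tesseract config string for a page segmentation mode."""
    # Tesseract defines page segmentation modes 0 through 13.
    if not 0 <= psm <= 13:
        raise ValueError(f"psm must be between 0 and 13, got {psm}")
    # Older builds accept '-psm'; newer ones expect '--psm'.
    flag = "-psm" if legacy_flag else "--psm"
    return f"{flag} {psm}"


def ocr_with_psm(image, psm: int = 12, legacy_flag: bool = False) -> str:
    """Run pytesseract on an image (NumPy array or PIL image) in the given mode."""
    # Imported lazily so psm_config stays usable without pytesseract installed.
    import pytesseract
    return pytesseract.image_to_string(image, config=psm_config(psm, legacy_flag))
```

With a grayscale image loaded as in the question, ocr_with_psm(img_grey, psm=12) would correspond to the sparse-text mode mentioned above, and psm=6 to a single uniform block of text.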

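Because the graphs get picked up as text, I would preprocess the image to obtain individual boxes (actual text, graphs, the information at the top, etc.) and filter them to keep only the boxes with content of interest. The filtering can use: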

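the y coordinate of the bounding rectangle (not in the upper 5 % of the image, which is the mobile phone status bar),
the width w of the bounding rectangle (not wider than 50 % of the image width, which are horizontal lines),
the x coordinate of the bounding rectangle (not in the middle third of the image, which are graphs).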
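The three criteria above can be sketched as a small predicate. The function name keep_rect and the sample rectangles are made up for illustration (they assume a 1080 x 2000 screenshot); only the thresholds come from the list above.

```python
def keep_rect(rect, img_w, img_h):
    """Return True if an (x, y, w, h) bounding rectangle likely holds text of interest."""
    x, y, w, h = rect
    below_status_bar = y > 0.05 * img_h          # not in the upper 5 % of the image
    not_a_line = w <= 0.5 * img_w                # not wider than half the image
    outside_middle = x < img_w / 3 or x > 2 * img_w / 3  # not in the middle third
    return below_status_bar and not_a_line and outside_middle


# Hypothetical rectangles for a 1080 x 2000 screenshot
rects = [
    (40, 20, 200, 40),      # inside the status bar
    (40, 300, 300, 60),     # text on the left
    (0, 400, 1080, 3),      # horizontal separator line
    (400, 500, 250, 120),   # graph in the middle third
    (800, 500, 200, 60),    # text on the right
]
kept = [r for r in rects if keep_rect(r, img_w=1080, img_h=2000)]
print(kept)  # [(40, 300, 300, 60), (800, 500, 200, 60)]
```

Note that the check looks only at the left edge x, so a wide box starting in the left third can still reach into the middle; that matches the filter described here but may need tightening for other layouts.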

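What is left is to run pytesseract on each of the cropped images, for example with config='-psm 6' (i.e. assuming a single uniform block of text), and to strip any line breaks from the resulting texts.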
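For the cleanup step, a slightly more defensive variant than a plain newline replace might look like the sketch below. The name clean_text is hypothetical. It assumes that the raw OCR string may end with a line break and, with some Tesseract versions, a form feed character; unlike removing '\n' outright, it joins multiple lines with a single space.

```python
def clean_text(raw: str) -> str:
    """Collapse line breaks and other whitespace in an OCR result."""
    # str.split() with no arguments splits on any whitespace run,
    # including '\n' and the form feed '\x0c', and drops empty pieces.
    return " ".join(raw.split())


print(clean_text("500.46 shares\n\x0c"))  # 500.46 shares
print(clean_text("ASRT\n-27.64%\n"))      # ASRT -27.64%
```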

Here is my code:

import cv2
import pytesseract

# Read image
img = cv2.imread('cUcby.png')
hi, wi = img.shape[:2]

# Convert to grayscale for tesseract
img_grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Mask single boxes by thresholding and morphological closing in x direction
mask = cv2.threshold(img_grey, 248, 255, cv2.THRESH_BINARY_INV)[1]
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE,
                        cv2.getStructuringElement(cv2.MORPH_RECT, (51, 1)))

# Find contours w.r.t. the OpenCV version
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

# Get bounding rectangles
rects = [cv2.boundingRect(cnt) for cnt in cnts]

# Filter bounding rectangles:
# - not in the upper 5 % of the image (mobile phone status bar)
# - not wider than 50 % of the image's width (horizontal lines)
# - not being in the middle third of the image (graphs)
rects = [(x, y, w, h) for x, y, w, h in rects if
         (y > 0.05 * hi) and
         (w <= 0.5 * wi) and
         ((x < 0.3333 * wi) or (x > 0.6666 * wi))]

# Sort bounding rectangles first by y coordinate, then by x coordinate
rects = sorted(rects, key=lambda x: (x[1], x[0]))

# Get texts from bounding rectangles from pytesseract
# (max(..., 0) keeps the 1-pixel margin from wrapping to a negative index)
texts = [pytesseract.image_to_string(
    img_grey[max(y-1, 0):y+h+1, max(x-1, 0):x+w+1], config='-psm 6')
    for x, y, w, h in rects]

# Remove line breaks
texts = [text.replace('\n', '') for text in texts]

# Output
print(texts)

And this is the output:

['Investing', ',712.99', 'ASRT', '-27.64%', '500.46 shares', 'GNUS', '-27.98%', '251.69 shares']

Since you know the locations of the bounding rectangles, you could also use that information to rearrange the whole text.
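One possible way to do that rearranging is to group the boxes into rows by their vertical centers and then order each row from left to right. This is only a sketch: group_into_rows, the tolerance of 20 pixels, and the coordinates below are assumptions for illustration; the texts are taken from the output above.

```python
def group_into_rows(items, tolerance=20):
    """Group ((x, y, w, h), text) pairs into rows of texts, top to bottom.

    Boxes whose vertical centers lie within `tolerance` pixels of the first
    box in a row are treated as the same row; each row is sorted by x.
    """
    rows = []  # each entry: [row_center_y, [(x, text), ...]]
    by_center = sorted(items, key=lambda it: it[0][1] + it[0][3] / 2)
    for (x, y, w, h), text in by_center:
        center = y + h / 2
        if rows and abs(center - rows[-1][0]) <= tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append([center, [(x, text)]])
    return [[text for _, text in sorted(cells)] for _, cells in rows]


# Hypothetical rectangles paired with the texts from the output above
items = [
    ((800, 305, 200, 40), "-27.64%"),
    ((40, 300, 200, 50), "ASRT"),
    ((40, 360, 300, 40), "500.46 shares"),
    ((800, 605, 200, 40), "-27.98%"),
    ((40, 600, 200, 50), "GNUS"),
    ((40, 660, 300, 40), "251.69 shares"),
]
for row in group_into_rows(items):
    print(" | ".join(row))
```

With these made-up coordinates, each ticker ends up on the same row as its percentage, followed by a row with the share count.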

----------------------------------------
System information
----------------------------------------
Platform:      Windows-10-10.0.16299-SP0
Python:        3.9.1
PyCharm:       2021.1.1
OpenCV:        4.5.1
pytesseract:   4.00.00alpha
----------------------------------------


Tags: python, opencv, machine-learning, computer-vision, python-tesseract