Slicing an image into strips for OCR

Question

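I am trying to split a PNG into a series of strips so that I can use tesseract to accurately read each line of a bank statement (if something that reads bank statements already exists, please tell me <3).

I started with the image_slicer library, but it only slices an image into tiles.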

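import image_slicer
from pdf2image import convert_from_path

# Render each PDF page to a PIL image at 500 DPI
pages = convert_from_path('OCRBeebun/Bacon.pdf', 500)
firstRun = False  # set to True on the first run to write the page PNGs

if firstRun:
    for i, page in enumerate(pages):
        page.save("img_" + str(i) + '.png', 'PNG')

# Splits the page into 14 tiles arranged in a grid, not into horizontal strips
image_slicer.slice("img_0.png", 14)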

Any ideas?

Answer 1

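If you open the image with cv2.imread(), you can slice it much like you slice a string (cv2 returns the image as a NumPy array, so NumPy must be installed), for example: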

import numpy  # cv2 represents images as NumPy arrays
import cv2

# The grayscale flag is optional; drop it to load the image in color
img = cv2.imread('path-to-image.png', cv2.IMREAD_GRAYSCALE)

# If you have the coordinates to slice the image, then
cropped_img = img[height_start:height_end, width_start:width_end]
# You can also run the above line in a loop to get more than one sliced image

# To save the sliced image
cv2.imwrite('cropped_image.png', cropped_img)

Substitute your own values for height_start, height_end, width_start and width_end.
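
To illustrate the "run it in a loop" suggestion, here is a minimal sketch that cuts an image array into full-width horizontal strips. The helper name slice_into_strips and the strip count of 14 are assumptions made for illustration, and the page is a synthetic array so the sketch runs without an image file.

```python
import numpy as np

def slice_into_strips(img, n_strips):
    """Split an image array into n_strips full-width horizontal strips."""
    height = img.shape[0]
    # Integer row boundaries from 0 to height; strip heights differ by at most 1
    bounds = [i * height // n_strips for i in range(n_strips + 1)]
    return [img[top:bottom, :] for top, bottom in zip(bounds[:-1], bounds[1:])]

# Synthetic 100x40 grayscale "page" standing in for a real scan
page = np.zeros((100, 40), dtype=np.uint8)
strips = slice_into_strips(page, 14)
```

With a real scan, the array returned by cv2.imread() could be passed in place of page, and each strip written out with cv2.imwrite(). Equal-height strips only line up with the text rows if the rows are evenly spaced, so the strip count will likely need tuning per statement layout.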


Tags: python, ocr, tesseract, image