Processing an image with block letters for OCR

Question

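I want to run OCR on a battle log from a game:

Battle log image

The original image uses a block-letter font, so I inverted the colors after thresholding. Now I want to remove the black background (but not the black text), and I'm not sure how to do that in OpenCV. After that, I think I'd like to make the text strokes thicker to get better OCR results.

How should I go about this?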

Answer 1

Try this.

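import cv2
import numpy as np

img = cv2.imread("1.png")
if img is None:
    raise FileNotFoundError("Could not read 1.png")

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Invert, binarize at mid-gray, then invert back.
invert0 = cv2.bitwise_not(gray)
_, thresh = cv2.threshold(invert0, 128, 255, cv2.THRESH_BINARY)
invert1 = cv2.bitwise_not(thresh)

# findContours returns 3 values in OpenCV 3.x and 2 values in 4.x;
# taking the last two works with both.
contours, hierarchy = cv2.findContours(invert1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]

# Fill every outermost contour to mark the regions it encloses.
mask = np.zeros(gray.shape, dtype="uint8")
for i, contour in enumerate(contours):
    if hierarchy[0][i][3] == -1:  # contour has no parent (outermost contour)
        cv2.fillPoly(mask, pts=[contour], color=255)

# White everywhere outside the mask, the binarized pixels inside it.
# cv2.add saturates at 255; plain `+` on uint8 arrays wraps around (255 + 255 -> 254).
invert2 = cv2.bitwise_not(mask)
res = cv2.add(invert2, invert1)

cv2.imshow("img", img)
cv2.imshow("gray", gray)
cv2.imshow("invert0", invert0)
cv2.imshow("thresh", thresh)
cv2.imshow("invert1", invert1)
cv2.imshow("invert2", invert2)
cv2.imshow("mask", mask)
cv2.imshow("res", res)

cv2.waitKey()
cv2.destroyAllWindows()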

opencv

tesseract