Unrelated output from the pytesseract image_to_string function

Question

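I am trying to extract text from an image, but pytesseract returns completely unrelated output: for the image attached below it returns "Werle" (entirely different words and characters). I have tried many preprocessing approaches, such as image enhancement, rgb2gray, and rgb2binary, but none of them helped. What confuses me is that the text in the image is very clear. I also moved the notebook from Google Colab to my local machine and checked the library versions, but the result was just as wrong.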

Output >> "Werle"

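Here is my code:

    import pytesseract as pt

    # cap is a cv2.VideoCapture that was opened earlier
    ret, frame = cap.read()
    crop_img = frame[320:400, 430:840]  # rows 320-399, columns 430-839
    text = pt.image_to_string(crop_img)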

Note: the same problem occurs on other images that have the same style but different text.

Answer 1

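It turns out that Tesseract, the OCR engine behind pytesseract, is trained on data with black text on a white background, so what I did was turn the dark pixels white and the bright pixels black: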

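    import cv2
    import numpy as np
    import pytesseract as pt

    crop_img = frame[320:400, 430:840]

    # inRange returns 255 where every channel lies in [0, 200] and 0 elsewhere,
    # so the dark background turns white and the bright text turns black.
    lower_black = np.array([0, 0, 0], dtype=np.uint8)
    upper_black = np.array([200, 200, 200], dtype=np.uint8)
    crop_img = cv2.inRange(crop_img, lower_black, upper_black)

    text = pt.image_to_string(image=crop_img)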

With this preprocessing, it works correctly.
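To see what the range mask does to pixel values, here is a minimal sketch that reproduces the same rule with NumPy alone on a tiny synthetic crop. The function name `dark_pixels_to_white` and the 2x3 test image are made up for illustration, and the snippet assumes the input is a `uint8` BGR array, as frames read by OpenCV are; it is not a replacement for the `cv2.inRange` call above.

```python
import numpy as np


def dark_pixels_to_white(bgr_img, upper=200):
    """Return a uint8 mask: 255 where every channel is <= upper, else 0.

    Hypothetical helper that mirrors cv2.inRange(img, (0, 0, 0),
    (upper, upper, upper)) for a uint8 image; the lower bound of 0 is
    always satisfied for unsigned 8-bit data.
    """
    in_range = np.all(bgr_img <= upper, axis=-1)
    return in_range.astype(np.uint8) * 255


# Synthetic 2x3 BGR crop: dark background with a single bright "text" pixel.
crop = np.full((2, 3, 3), 30, dtype=np.uint8)
crop[0, 1] = (255, 255, 255)

mask = dark_pixels_to_white(crop)
# The dark background pixels are now 255 (white) and the bright pixel is 0 (black).
print(mask)
```

The result is a single-channel binary image with dark text on a white background, which is the polarity the answer found to work. The threshold of 200 comes from the answer's code and may need tuning for images with a different contrast.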

opencv

tesseract

computer-vision

deep-learning

cv2