tesseract-ocr: reading text from an image with character coordinates

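I need to read text from an image using Tesseract OCR, and I also need the position of each character in the image. Is there a way to do both? Please help.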

I found the answer: I am using Tesseract with hOCR output.

hOCR is an open standard for representing formatted text obtained from optical character recognition. It encodes text, style, layout information, recognition confidence metrics, and other information using Extensible Markup Language (XML) in the form of Hypertext Markup Language (HTML) or XHTML.

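The command-line syntax is something like this (the second argument is the output base name, and the trailing hocr selects hOCR output):

tesseract someimage.jpg someimage hocr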
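
Once you have the hOCR file, the coordinates are in the title attribute of each element as "bbox x0 y0 x1 y1". Here is a rough sketch of pulling them out with Python's standard library. The class name HocrWordParser, the helper parse_hocr_words, and the SAMPLE markup are my own illustration, not part of Tesseract, and the sample is a hand-written fragment in the usual hOCR shape rather than real output. Note that this gives word-level boxes; as far as I know, newer Tesseract versions can also emit per-character boxes through the hocr_char_boxes config variable, but I have not shown that here.

```python
from html.parser import HTMLParser
import re

# hOCR stores coordinates in the title attribute as "bbox x0 y0 x1 y1".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collect (text, (x0, y0, x1, y1)) for every span with class ocrx_word."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bounding box of the word span currently open
        self._depth = 0     # spans nested inside the current word span
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._bbox is not None:
            # A span nested inside a word; remember it so its end tag
            # does not close the word early.
            self._depth += 1
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        match = BBOX_RE.search(attrs.get("title") or "")
        if "ocrx_word" in classes and match:
            self._bbox = tuple(int(v) for v in match.groups())
            self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "span" or self._bbox is None:
            return
        if self._depth:
            self._depth -= 1
            return
        self.words.append(("".join(self._text).strip(), self._bbox))
        self._bbox = None


def parse_hocr_words(hocr_text):
    """Return a list of (word, bbox) pairs found in an hOCR document."""
    parser = HocrWordParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.words


# Hand-written fragment for illustration only (not real Tesseract output).
SAMPLE = """<div class='ocr_page' title='bbox 0 0 640 480'>
<span class='ocr_line' title='bbox 36 92 230 116'>
<span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
<span class='ocrx_word' id='word_1_2' title='bbox 110 92 230 116; x_wconf 91'>world</span>
</span></div>"""

if __name__ == "__main__":
    for word, bbox in parse_hocr_words(SAMPLE):
        print(word, bbox)
```

To use it on real output, read the file Tesseract wrote (for example open("output.hocr", encoding="utf-8").read(), where the file name is hypothetical) and pass the text to parse_hocr_words.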