Tesseract error in image_to_string() conversion: pytesseract.pytesseract.TesseractError: (2, 'Usage: pytesseract [-l lang] input_file')

Question

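Please note: I know there are many posts about Tesseract. I have not yet found a working solution that does not produce an error.

I am trying to run simple OCR on an image with Tesseract. I have tried many solutions from various forums without success. I converted a PDF to an image and saved it, then loaded that image with cv2. I was also going to display the image. Now I am trying to apply pytesseract's image_to_string() to it.

I have already tried adjusting pytesseract.pytesseract.tesseract_cmd and made sure that both the wrapper and the actual Tesseract package are installed. Here is the code: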

from wand.image import Image
import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:/Users/Afton/anaconda3/Scripts/pytesseract.exe'


# Convert from pdf and save as image
pdf = 'C:/path/example.pdf'
outputFilename = 'C:/path/example.jpg'

with Image(filename=pdf) as img:
    img.save(filename=outputFilename)

# Read image
imagePath = outputFilename
image = cv2.imread(imagePath)    

# Configure OCR with pytesseract
config = r'-l deu --oem 1 --psm 3'
text = pytesseract.image_to_string(image, config=config)

# Print text output
text = text.split('\n')
print(text)

This is the current error:

pytesseract.pytesseract.TesseractError: (2, 'Usage: pytesseract [-l lang] input_file')

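Previously, the error was related to the pytesseract.pytesseract.tesseract_cmd value.

Any help is appreciated.

Update: The image is in German. I tried to specify this in the config.

Update 2: I tried the alternative path from this resource (with my file location):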

pytesseract.pytesseract.tesseract_cmd = r'C:/Program Files/Tesseract-OCR/tesseract.exe'

I now get this error:

pytesseract.pytesseract.TesseractError: (1, 'Error opening data file C:\Program Files\Tesseract-OCR/tessdata/deu.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'deu\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')

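Note to others with this issue: I downloaded the language pack from https://github.com/tesseract-ocr/tessdata because I am reading a German document. All language files can be found there. The problem was the language data: deu.traineddata was not in the tessdata directory.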
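For anyone who wants to check this before calling image_to_string(), here is a minimal sketch. It assumes the tessdata directory sits next to tesseract.exe, as the error message above suggests. The helper name missing_languages is made up for illustration and is not part of pytesseract.

```python
import os

def missing_languages(tessdata_dir, langs):
    """Return the codes from a 'deu+eng' style string that have no
    <code>.traineddata file in tessdata_dir (hypothetical helper)."""
    return [
        code
        for code in langs.split("+")
        if not os.path.isfile(os.path.join(tessdata_dir, code + ".traineddata"))
    ]
```

Calling missing_languages(r'C:/Program Files/Tesseract-OCR/tessdata', 'deu') should return an empty list once deu.traineddata has been copied into that folder. If the language files live somewhere else, the error message itself says to point the TESSDATA_PREFIX environment variable at that directory.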

Answer 1

This line is wrong:

pytesseract.pytesseract.tesseract_cmd = r'C:/Users/Afton/anaconda3/Scripts/pytesseract.exe'

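Please read the pytesseract documentation. tesseract_cmd has to point to the Tesseract executable itself (tesseract.exe), not to the pytesseract wrapper script. The 'Usage: pytesseract [-l lang] input_file' text in the error is the wrapper's own command-line usage message.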
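As a rough sketch of how to avoid pointing at the wrapper by accident, the helpers below skip any candidate path whose file name starts with "pytesseract" and fall back to whatever tesseract is on PATH. The function names are hypothetical, and the candidate paths you pass in are your own assumptions about where Tesseract was installed.

```python
import os
import shutil

def looks_like_wrapper(path):
    """True if the file name is the pytesseract console script, not Tesseract."""
    # Normalise backslashes so Windows-style paths are split correctly everywhere.
    name = os.path.basename(path.replace("\\", "/")).lower()
    return name.startswith("pytesseract")

def pick_tesseract_cmd(candidates):
    """Return the first existing candidate that is not the wrapper,
    otherwise the tesseract found on PATH (or None if there is none)."""
    for path in candidates:
        if not looks_like_wrapper(path) and os.path.isfile(path):
            return path
    return shutil.which("tesseract")
```

The result would then be assigned to pytesseract.pytesseract.tesseract_cmd before calling image_to_string(). If it comes back as None, Tesseract itself is probably not installed or not on PATH.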

python

tesseract