Pytesseract fails to load because it cannot find tesseract
I get an error when trying to install and use Tesseract from Python with pytesseract on Windows 10:
File "C:\ProgramData\Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 194, in run_tesseract
raise TesseractError(status_code, get_errors(error_string))
TesseractError: (1, 'Error opening data file \Program Files (x86)\Tesseract-OCR\eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'eng\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
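I have tried reinstalling tesseract.
I have added C:\Program Files (x86)\Tesseract-OCR to the PATH environment variable.
I have set TESSDATA_PREFIX to C:\Program Files (x86)\Tesseract-OCR\tessdata.
I have confirmed that typing 'tesseract' in CMD brings up the tesseract usage screen.
The code I am using: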
import cv2
import pytesseract
# Provide the path to the tesseract executable manually
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
# Define config parameters.
# '-l eng' for using the English language
# '--oem 1' for using LSTM OCR Engine
# '--psm 3' for fully automatic page segmentation without OSD
config = '-l eng --oem 1 --psm 3'
# Read image from disk
im = cv2.imread("Serie1/NL83LHL9.JPG", cv2.IMREAD_COLOR)
# Run tesseract OCR on image
text = pytesseract.image_to_string(im, config=config)
# Print recognized text
print(text)
Result:
CMD > tesseract : shows the tesseract usage screen
If the tesseract executable is not on your PATH, include the following:
pytesseract.pytesseract.tesseract_cmd = r'C:/Program Files (x86)/Tesseract-OCR/tesseract'
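A minimal sketch of how that fallback could be automated: look for tesseract on PATH first and only fall back to a hard-coded location if it is missing. The helper name find_tesseract is hypothetical, and the default fallback path is simply the install location mentioned in the question; adjust it to your own installation.

```python
import os
import shutil


def find_tesseract(fallback=r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"):
    """Return a path to the tesseract executable, or None if none is found.

    `fallback` is an assumed install location taken from the question;
    it is only used when tesseract is not on PATH.
    """
    on_path = shutil.which("tesseract")  # searches PATH (and PATHEXT on Windows)
    if on_path:
        return on_path
    return fallback if os.path.isfile(fallback) else None


# Usage (requires pytesseract to be installed):
# import pytesseract
# exe = find_tesseract()
# if exe is not None:
#     pytesseract.pytesseract.tesseract_cmd = exe
```

If the helper returns None, neither PATH nor the fallback location contains the executable, so the problem is with the installation rather than with pytesseract.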
Solved by Dmitrii Z.
Indeed it looks a bit odd. One thing you can try is to add the tessdata path to your config:
config = r'--tessdata-dir "C:\Program Files (x86)\Tesseract-OCR\tessdata" -l eng --oem 1 --psm 3'
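As a rough sketch of the suggestion above, the config string can be built from the tessdata directory, and the presence of the language file can be checked before calling tesseract. The helper names build_tesseract_config and has_language_data are hypothetical, not part of pytesseract; only the resulting config string is passed to the library.

```python
import os


def build_tesseract_config(tessdata_dir, lang="eng", oem=1, psm=3):
    """Build a config string that points tesseract at an explicit tessdata directory."""
    # The directory is quoted because the path may contain spaces.
    return f'--tessdata-dir "{tessdata_dir}" -l {lang} --oem {oem} --psm {psm}'


def has_language_data(tessdata_dir, lang="eng"):
    """Return True if <lang>.traineddata exists directly inside tessdata_dir."""
    return os.path.isfile(os.path.join(tessdata_dir, f"{lang}.traineddata"))


# Usage (requires pytesseract; `im` is the image loaded in the question):
# tessdata = r"C:\Program Files (x86)\Tesseract-OCR\tessdata"
# if has_language_data(tessdata):
#     text = pytesseract.image_to_string(im, config=build_tesseract_config(tessdata))
```

If has_language_data returns False, eng.traineddata is not in the directory being passed, which would match the 'Error opening data file' message in the traceback.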