Error while installing tesseract-ocr
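I want to use pytesseract for OCR, so I installed it. But before that I need to install tesseract-ocr. I am using Windows 8.1. I opened the command line and ran the command pip install tesseract-ocr. The following lines are the output of that command.
I don't understand what is happening here. How can I make sense of this output, and how can I get tesseract installed successfully on my PC?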
C:\Users\HarshLaptop>pip install tesseract-ocr
Collecting tesseract-ocr
Using cached https://files.pythonhosted.org/packages/e2/0d/dcee3dd0fc4c7bcd181
25a98f8ba6d9db7aecaa40770595203e312649587/tesseract-ocr-0.0.1.tar.gz
Requirement already satisfied: cython in c:\users\harshlaptop\anaconda3\lib\site
-packages (from tesseract-ocr) (0.25.2)
Building wheels for collected packages: tesseract-ocr
Running setup.py bdist_wheel for tesseract-ocr ... error
Complete output from command c:\users\harshlaptop\anaconda3\python.exe -u -c "
import setuptools, tokenize;__file__='C:\Users\HARSHL~1\AppData\Local\Temp\
\pip-install-x8nz3uhm\tesseract-ocr\setup.py';f=getattr(tokenize, 'open', open
)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __f
ile__, 'exec'))" bdist_wheel -d C:\Users\HARSHL~1\AppData\Local\Temp\pip-wheel-s
j29zfyo --python-tag cp36:
running bdist_wheel
running build
running build_py
file tesseract_ocr.py (for module tesseract_ocr) not found
file tesseract_ocr.py (for module tesseract_ocr) not found
running build_ext
building 'tesseract_ocr' extension
creating build
creating build\temp.win-amd64-3.6
creating build\temp.win-amd64-3.6\Release
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c
/nologo /Ox /W3 /GL /DNDEBUG /MD -Ic:\users\harshlaptop\anaconda3\include -Ic:\
users\harshlaptop\anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual S
tudio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\include.0.10
240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits.1\include\shared" "-IC:\Pro
gram Files (x86)\Windows Kits.1\include\um" "-IC:\Program Files (x86)\Windows
Kits.1\include\winrt" /EHsc /Tptesseract_ocr.cpp /Fobuild\temp.win-amd64-3.6\R
elease\tesseract_ocr.obj
tesseract_ocr.cpp
tesseract_ocr.cpp(463): fatal error C1083: Cannot open include file: 'leptonic
a/allheaders.h': No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN
\x86_amd64\cl.exe' failed with exit status 2
----------------------------------------
Failed building wheel for tesseract-ocr
Running setup.py clean for tesseract-ocr
Failed to build tesseract-ocr
Installing collected packages: tesseract-ocr
Running setup.py install for tesseract-ocr ... error
Complete output from command c:\users\harshlaptop\anaconda3\python.exe -u -c
"import setuptools, tokenize;__file__='C:\Users\HARSHL~1\AppData\Local\Tem
p\pip-install-x8nz3uhm\tesseract-ocr\setup.py';f=getattr(tokenize, 'open', op
en)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, _
_file__, 'exec'))" install --record C:\Users\HARSHL~1\AppData\Local\Temp\pip-rec
ord-vnlr99lk\install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
file tesseract_ocr.py (for module tesseract_ocr) not found
file tesseract_ocr.py (for module tesseract_ocr) not found
running build_ext
building 'tesseract_ocr' extension
creating build
creating build\temp.win-amd64-3.6
creating build\temp.win-amd64-3.6\Release
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe
/c /nologo /Ox /W3 /GL /DNDEBUG /MD -Ic:\users\harshlaptop\anaconda3\include -Ic
:\users\harshlaptop\anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual
Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\include.0.
10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits.1\include\shared" "-IC:\P
rogram Files (x86)\Windows Kits.1\include\um" "-IC:\Program Files (x86)\Window
s Kits.1\include\winrt" /EHsc /Tptesseract_ocr.cpp /Fobuild\temp.win-amd64-3.6
\Release\tesseract_ocr.obj
tesseract_ocr.cpp
tesseract_ocr.cpp(463): fatal error C1083: Cannot open include file: 'lepton
ica/allheaders.h': No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\B
IN\x86_amd64\cl.exe' failed with exit status 2
----------------------------------------
Command "c:\users\harshlaptop\anaconda3\python.exe -u -c "import setuptools, tok
enize;__file__='C:\Users\HARSHL~1\AppData\Local\Temp\pip-install-x8nz3uhm\
\tesseract-ocr\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.rea
d().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" insta
ll --record C:\Users\HARSHL~1\AppData\Local\Temp\pip-record-vnlr99lk\install-rec
ord.txt --single-version-externally-managed --compile" failed with error code 1
in C:\Users\HARSHL~1\AppData\Local\Temp\pip-install-x8nz3uhm\tesseract-ocr\
You need to install leptonica. Tesseract requires it.
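I had exactly the same problem, using Visual Studio 2017 with Python 3.6 installed on a Windows 10 machine. What worked for me was:
- Download and install the tesseract-ocr executable from https://github.com/UB-Mannheim/tesseract/wiki (the script below assumes you are running on a Windows system and have installed tesseract to the suggested default location, i.e. C:\Program Files (x86)\Tesseract-OCR). See https://github.com/tesseract-ocr/tesseract/wiki for more information on installing on different OS types (including Windows) using pre-built binary packages.
- Make sure you have the Python Imaging Library ('PIL') or the 'pillow' package installed to open images. (Installing PIL did not work on my system setup but pillow did, i.e. pip install pillow.) You need this because pytesseract requires it. See https://pypi.org/project/pytesseract/0.2.5/ for more information.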
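Before going further, it can help to confirm that both Python packages are actually importable in the interpreter you are running. The following is only a minimal sketch: `missing_packages` is a hypothetical helper name, not part of pytesseract or pillow. Note that the pillow package is imported under the name `PIL`.

```python
import importlib.util


def missing_packages(names=('PIL', 'pytesseract')):
    """Return the import names from `names` that cannot be found.

    'PIL' is the import name provided by the pillow package.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == '__main__':
    missing = missing_packages()
    if missing:
        print('Missing packages:', ', '.join(missing))
    else:
        print('pillow and pytesseract are both importable')
```

If `PIL` is reported as missing, `pip install pillow` is the command mentioned above; if `pytesseract` is missing, use `pip install pytesseract`.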
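Then, to use it successfully in your code, just set the tesseract_cmd path in the code, as follows:
from PIL import Image
import pytesseract

# Point pytesseract at the installed Tesseract executable (adjust the path if you installed elsewhere)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract'

try:
    # Open the image with pillow and run OCR on it
    img = Image.open('path/to/image.png')
    text = pytesseract.image_to_string(img)
    print(text)
except IOError as exc:
    # Raised when the image file is missing or cannot be read
    print('Could not open the image:', exc)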
Hope this helps.
To install leptonica, you need to follow this link:
conda install -c conda-forge leptonica
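However, this is by no means a complete solution for getting rid of the error when installing tesseract-ocr.
You need to install tesseract using the installer available for Windows here. Then install the Python wrapper with: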
pip install pytesseract
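Last but not least, you should also set the tesseract path in your script after importing the pytesseract library, as follows (please don't forget that the installation path may be different in your case!):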
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
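Because the installation path can differ from machine to machine, one option is to look the executable up instead of hard-coding it. The following is a minimal sketch, not something pytesseract provides: `find_tesseract` and `DEFAULT_CANDIDATES` are hypothetical names, and the candidate paths are assumptions based on the default locations mentioned above (the second one assumes a 64-bit install under C:\Program Files).

```python
import os
import shutil

# Assumed default install locations on Windows; adjust these to your system.
DEFAULT_CANDIDATES = [
    r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe',
    r'C:\Program Files\Tesseract-OCR\tesseract.exe',
]


def find_tesseract(candidates=DEFAULT_CANDIDATES, name='tesseract'):
    """Return a path to the tesseract executable, or None if none is found."""
    # Prefer an executable that is already on PATH.
    on_path = shutil.which(name)
    if on_path:
        return on_path
    # Otherwise fall back to the first candidate file that exists.
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == '__main__':
    path = find_tesseract()
    if path is None:
        print('tesseract executable not found; run the Windows installer first')
    else:
        print('Found tesseract at', path)
        # With pytesseract installed, the result can then be assigned:
        # pytesseract.pytesseract.tesseract_cmd = path
```

If the function returns None, the Python wrapper has nothing to call, which usually means the Windows installer step was skipped or tesseract was installed to a non-default folder.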