TesseractNotFoundError: tesseract is not installed or it's not in your path
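I am trying to print text from an image using Tesseract OCR, but I get the error above. I installed Tesseract OCR from https://github.com/UB-Mannheim/tesseract/wiki and installed pytesseract by running pip install pytesseract in the Anaconda Prompt, but it still does not work. If anyone has run into a similar problem, please help.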
Collecting pytesseract
Downloading https://files.pythonhosted.org/packages/13/56/befaafbabb36c03e4fdbb3fea854e0aea294039308a93daf6876bf7a8d6b/pytesseract-0.2.4.tar.gz (169kB)
100% |████████████████████████████████| 174kB 288kB/s
Requirement already satisfied: Pillow in c:\users\500066016\appdata\local\continuum\anaconda3\lib\site-packages (from pytesseract) (5.1.0)
Building wheels for collected packages: pytesseract
Running setup.py bdist_wheel for pytesseract ... done
Stored in directory: C:\Users\500066016\AppData\Local\pip\Cache\wheels\a8[=10=]c[=10=]e4957a46128bea34fda60b8b01a8755986415cbab3ed8e38
Successfully built pytesseract
The code is as follows:
import pytesseract
import cv2
import numpy as np

def get_string(img_path):
    img = cv2.imread(img_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    kernel = np.ones((1, 1), np.uint8)
    dilate = cv2.dilate(img, kernel, iterations=1)
    erosion = cv2.erode(img, kernel, iterations=1)
    cv2.imwrite('removed_noise.jpg', img)
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
    cv2.imwrite('thresh.jpg', img)
    # Note: the name below is 'thesh.jpg', not the 'thresh.jpg' written above.
    res = pytesseract.image_to_string('thesh.jpg')
    return res

print('Getting string from the image')
print(get_string('quotes.jpg'))
The error is as follows:
Traceback (most recent call last):
  File "<ipython-input-2-cf6e0fca14b4>", line 1, in <module>
    runfile('C:/Users/500066016/.spyder-py3/project1.py', wdir='C:/Users/500066016/.spyder-py3')
  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)
  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/500066016/.spyder-py3/project1.py", line 23, in <module>
    print(get_string('quotes.jpg'))
  File "C:/Users/500066016/.spyder-py3/project1.py", line 20, in get_string
    res = pytesseract.image_to_string('thesh.jpg')
  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 294, in image_to_string
    return run_and_get_output(*args)
  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 202, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 172, in run_tesseract
    raise TesseractNotFoundError()
TesseractNotFoundError: tesseract is not installed or it's not in your path
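Step 1: Download and install Tesseract OCR from this link.
Step 2: After installation, find the "Tesseract-OCR" folder, open it, and locate tesseract.exe.
Step 3: Copy the full path of tesseract.exe.
Step 4: Pass that path to pytesseract in your code, like this:
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
Note: replace C:\Program Files\Tesseract-OCR\tesseract.exe with the path you copied.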
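If you would rather not hard-code the path, the steps above can be sketched roughly like this. It is only a sketch: find_tesseract and configure_pytesseract are names made up for this example, and the candidate paths you pass in are whatever you copied in step 3. It first asks the OS whether tesseract is already on PATH and only then falls back to the explicit locations.

```python
import os
import shutil


def find_tesseract(candidates=()):
    """Return a path to the tesseract executable, or None if none is found."""
    # First ask the OS: this succeeds when the install folder is on PATH.
    found = shutil.which("tesseract")
    if found:
        return found
    # Otherwise try explicit locations, e.g. the path copied in step 3.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(candidates=()):
    """Point pytesseract at the executable, or raise a clear error."""
    import pytesseract  # imported here so find_tesseract works without it

    exe = find_tesseract(candidates)
    if exe is None:
        raise FileNotFoundError(
            "tesseract executable not found; install it or pass its full path"
        )
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe
```

You would call configure_pytesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"]) once, before the first image_to_string call.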
You should install:
! apt install tesseract-ocr
! apt install libtesseract-dev
and
! pip install Pillow
! pip install pytesseract
import pytesseract
from PIL import ImageEnhance, ImageFilter, Image
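After installing, it may help to confirm that the tesseract binary is actually reachable before calling pytesseract. A minimal sketch using only the standard library follows; tesseract_version is a name invented for this example, not part of pytesseract.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of `tesseract --version`, or None if the binary is missing."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    proc = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some Tesseract releases print the version to stderr rather than stdout.
    lines = (proc.stdout or proc.stderr).strip().splitlines()
    return lines[0] if lines else None


print(tesseract_version())
```

If this prints None, the apt install step did not put tesseract on PATH and pytesseract will raise TesseractNotFoundError.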
I ran this code on Colab from Google Drive. Below is my sample code:
I used a sample picture of some text taken from a website.
Step 1: Import some packages
import pytesseract
import cv2
import matplotlib.pyplot as plt
from PIL import Image
Step 2: Upload the file text.png to Colab
from google.colab import files
uploaded = files.upload()
current browser session. Please rerun this cell to enable.
---------------------------------------------------------------------------
MessageError                              Traceback (most recent call last)
<ipython-input-31-21dc3c638f66> in <module>()
      1 from google.colab import files
----> 2 uploaded = files.upload()

2 frames
/usr/local/lib/python3.6/dist-packages/google/colab/_message.py in read_reply_from_input(message_id, timeout_sec)
    104         reply.get('colab_msg_id') == message_id):
    105       if 'error' in reply:
--> 106         raise MessageError(reply['error'])
    107       return reply.get('data', None)
    108

MessageError: TypeError: Cannot read property '_uploadFiles' of undefined
-> Don't worry: run the cell again and it will work. You can then choose which file to upload.
Step 3:
Read the image with OpenCV
image = cv2.imread("text.png")
Or you can use Pillow
image = Image.open("text.png")
Check that the picture of the text is displayed.
image
Get the string
string = pytesseract.image_to_string(image)
Print it
print(string)
Done.
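Putting step 3 together, here is a rough sketch of a single helper. read_text is a hypothetical name, and it assumes opencv-python and pytesseract are installed as described above. Two details are worth noting: cv2.imread returns None for a missing file instead of raising, and OpenCV loads images as BGR while pytesseract assumes RGB.

```python
import os


def read_text(image_path, lang="eng"):
    """OCR an image file, failing early with a clear error if it is missing."""
    if not os.path.isfile(image_path):
        raise FileNotFoundError(f"image not found: {image_path}")

    # Imported lazily so the missing-file check above runs even when the
    # OCR stack is not installed yet.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        # imread signals an unreadable or unsupported file by returning None.
        raise ValueError(f"could not decode image: {image_path}")

    # OpenCV loads images as BGR; convert to RGB before handing to pytesseract.
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return pytesseract.image_to_string(rgb, lang=lang)
```

Usage would be print(read_text("text.png")) once the upload in step 2 has succeeded.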