How can I get pytesseract to work properly in PyCharm when it throws errors?
I am defining a function that converts an image to grayscale (black and white) and then passes the result to:
text = pytesseract.image_to_string(Image.open(gray_scale_image))
I then print the text I get back, but it throws these errors:
Traceback (most recent call last):
File "C:\Users\HP\PycharmProjects\nayaproject\venv\lib\site-packages\PIL\Image.py", line 2613, in open
fp.seek(0)
AttributeError: 'numpy.ndarray' object has no attribute 'seek'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/HP/PycharmProjects/nayaproject/new.py", line 17, in <module>
text = pytesseract.image_to_string(Image.open(g))
File "C:\Users\HP\PycharmProjects\nayaproject\venv\lib\site-packages\PIL\Image.py", line 2615, in open
fp = io.BytesIO(fp.read())
AttributeError: 'numpy.ndarray' object has no attribute 'read'
When I use Image.fromarray(gray_scale_image) instead of Image.open(gray_scale_image), I get these errors:
Traceback (most recent call last):
File "C:\Users\HP\PycharmProjects\nayaproject\venv\lib\site-packages\pytesseract\pytesseract.py", line 170, in run_tesseract
proc = subprocess.Popen(cmd_args, **subprocess_args())
File "C:\Users\HP\AppData\Local\Programs\Python\Python36\lib\subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "C:\Users\HP\AppData\Local\Programs\Python\Python36\lib\subprocess.py", line 997, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/HP/PycharmProjects/nayaproject/new.py", line 17, in <module>
text = pytesseract.image_to_string(Image.fromarray(g))
File "C:\Users\HP\PycharmProjects\nayaproject\venv\lib\site-packages\pytesseract\pytesseract.py", line 294, in image_to_string
return run_and_get_output(*args)
File "C:\Users\HP\PycharmProjects\nayaproject\venv\lib\site-packages\pytesseract\pytesseract.py", line 202, in run_and_get_output
run_tesseract(**kwargs)
File "C:\Users\HP\PycharmProjects\nayaproject\venv\lib\site-packages\pytesseract\pytesseract.py", line 172, in run_tesseract
raise TesseractNotFoundError()
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path
I am working in PyCharm, and I have installed Pillow, numpy, opencv-python, pip, and pytesseract for this project.
My guess is that gray_scale_image is the output of an OpenCV call and is therefore a numpy array, which is what the error is telling you:
AttributeError: 'numpy.ndarray' object has no attribute 'read'
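You need to convert the array to a PIL object. From my own experience, I recommend always casting the numpy array to np.uint8, because PIL works with 8-bit data and you usually do not know what dtype an OpenCV algorithm returns.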
text = pytesseract.image_to_string(Image.fromarray(gray_scale_image.astype(np.uint8)))
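One caveat with the cast above: a plain astype(np.uint8) truncates, so a float image with values in [0, 1] comes out almost entirely black. Below is a minimal sketch of a safer conversion. The helper name to_uint8 is hypothetical, and the rule "floats within [0, 1] are scaled by 255" is an assumption about how your image is normalized, not something pytesseract or PIL requires.

```python
import numpy as np


def to_uint8(img):
    """Return an 8-bit version of img (hypothetical helper, not a library API)."""
    arr = np.asarray(img)
    if arr.dtype == np.uint8:
        # Already 8-bit, nothing to do.
        return arr
    if arr.dtype == np.bool_:
        # Binary masks: map True/False to 255/0.
        return arr.astype(np.uint8) * 255
    arr = arr.astype(np.float64)
    if arr.size and arr.min() >= 0.0 and arr.max() <= 1.0:
        # Assumption: values inside [0, 1] mean a normalized image.
        arr = arr * 255.0
    # Round, then clip so out-of-range values saturate instead of wrapping.
    return np.clip(np.rint(arr), 0, 255).astype(np.uint8)
```

The result can then be passed on as Image.fromarray(to_uint8(gray_scale_image)) in place of the astype call.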
If that does not work, then you are definitely not passing an image array of any kind. Print these to inspect the argument:
print(type(gray_scale_image))
print(gray_scale_image.shape)
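Fixing that solves your first problem, but it exposes a second one. pytesseract cannot find the Tesseract executable, so you need to give it the path: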
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path
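The fix is to set the path at the top of your script:
import os
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
# Tesseract reads TESSDATA_PREFIX from the environment, not from a Python variable.
os.environ['TESSDATA_PREFIX'] = 'C:/Program Files (x86)/Tesseract-OCR'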
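If you are not sure where Tesseract is installed, something like the following may help. It is only a sketch: find_tesseract is a hypothetical helper, and the two candidate paths are guesses at common Windows install locations, so adjust them to your machine. It checks PATH first and then falls back to the candidates.

```python
import os
import shutil

# Hypothetical candidate locations; change these to match your installation.
CANDIDATES = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]


def find_tesseract(candidates=CANDIDATES):
    """Return a path to the tesseract executable, or None if none is found."""
    # Prefer whatever is already on PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise use the first candidate that exists on disk.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

If it returns a path, assign that to pytesseract.pytesseract.tesseract_cmd. If it returns None, Tesseract itself is most likely not installed. Installing the pytesseract package with pip does not install the Tesseract program.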