Image to Text Pytesseract Error
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter
pytesseract.pytesseract.tesseract_cmd="C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
im = Image.open("d:\ss.png") # the second one
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('temp2.jpg')
text = pytesseract.image_to_string(Image.open('temp2.jpg'))
print(text)
Above is my code for converting an image to text, but it raises the following error:
Traceback (most recent call last):
  File "D:\txt14.py", line 10, in <module>
    text = pytesseract.image_to_string(Image.open('temp2.jpg'))
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 193, in image_to_string
    return run_and_get_output(image, 'txt', lang, config, nice)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 140, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 111, in run_tesseract
    proc = subprocess.Popen(command, stderr=subprocess.PIPE)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python36\lib\subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python36\lib\subprocess.py", line 997, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
Can you help me figure out why this error occurs?
pytesseract.pytesseract.tesseract_cmd="C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
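The \t in that path (the start of \tesseract.exe) is not a backslash followed by a t; it is a tab character. The resulting string does not name any file on disk, so subprocess.Popen fails with FileNotFoundError.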
For Windows pathnames in source code, always use a raw string literal if you want to use backslashes rather than forward slashes. Like this:
pytesseract.pytesseract.tesseract_cmd=r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
In a raw string literal, \t is a backslash followed by a t, not a tab character.
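To see the difference directly, here is a minimal sketch that compares a regular string, a raw string, and a forward-slash string. The path C:\tools\tesseract.exe is a made-up example chosen so that every backslash is followed by a t; it is not the path from the question.

```python
# Hypothetical path used only for illustration; it is not the asker's path.
plain = "C:\tools\tesseract.exe"    # regular literal: each \t becomes a tab
raw = r"C:\tools\tesseract.exe"     # raw literal: backslashes are kept as typed
forward = "C:/tools/tesseract.exe"  # forward slashes need no escaping at all

# repr() makes invisible characters such as tabs visible.
print(repr(plain))
print(repr(raw))
print(repr(forward))

# The regular literal contains tab characters and no backslashes.
print("tab in plain:", "\t" in plain)
print("tab in raw:  ", "\t" in raw)
print("backslashes in plain:", plain.count("\\"))
print("backslashes in raw:  ", raw.count("\\"))
```

Printing repr() of tesseract_cmd in the original script is a quick way to check whether the path was silently altered by an escape sequence.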
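You should do the same for 'd:\ss.png'. You got lucky there, because \s happens not to be an escape sequence for anything (at least not as of Python 3.6), but better safe than sorry.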
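As an extra safeguard, you could verify that a path points to a real file before handing it to pytesseract or PIL. The sketch below is one way to do that with the standard library; check_file is a hypothetical helper name, not part of pytesseract, and the path passed to it at the end is a made-up example.

```python
from pathlib import Path


def check_file(path_text):
    """Return path_text unchanged if it names an existing file.

    Hypothetical helper for illustration. Raises FileNotFoundError otherwise,
    showing repr() of the path so stray tabs or newlines are visible.
    """
    if not Path(path_text).is_file():
        raise FileNotFoundError(f"No file found at {path_text!r}")
    return path_text


# Possible usage in the script from the question (not executed here):
#   pytesseract.pytesseract.tesseract_cmd = check_file(
#       r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe")

try:
    # A regular (non-raw) literal with a made-up path: both \t become tabs.
    check_file("C:\tools\tesseract.exe")
except FileNotFoundError as exc:
    print(exc)
```

Because the message uses repr(), an accidental tab appears as a visible \t next to the drive letter instead of as blank space, which makes this kind of mistake much easier to spot than the bare WinError 2.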