二值图像 OCR
OCR on binary image
我有一个像这样的二进制文本图像black on white text - cat
我想对这些图像进行 OCR。它们不超过一个词。
我试过 tesseract 和 Google cloud vision 但它们都 return 没有结果。
我正在使用 python 3.6 和 Windows 10.
# export GOOGLE_APPLICATION_CREDENTIALS=kyourcredentials.json
import io
import cv2
from PIL import Image
# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types
# Instantiates a client
client = vision.ImageAnnotatorClient()
with io.open("test.png", 'rb') as image_file:
content = image_file.read()
image = types.Image(content=content)
response = client.text_detection(image=image)
texts = response.text_annotations
resp = ''
for text in texts:
resp+=' ' + text.description
print(resp)
from PIL import Image as im
import pytesseract as ts
print(ts.image_to_string(im.fromarray(canvas.reshape((480,640)),'L'))) # canvas contains the Mat object from which the image is saved to png
这张图片对于两者中的任何一个来说都应该是一个简单的任务,我觉得我的代码中遗漏了一些东西。请帮帮我!
编辑:
感谢 F10 为我指明了正确的方向。这就是我使用本地图像的方式。
# export GOOGLE_APPLICATION_CREDENTIALS=kyourcredentials.json
import io
import cv2
from PIL import Image
# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types
from google.cloud.vision import enums
# Instantiates a client
client = vision.ImageAnnotatorClient()
with io.open("test.png", 'rb') as image_file:
content = image_file.read()
features = [
types.Feature(type=enums.Feature.Type.DOCUMENT_TEXT_DETECTION)
]
image = types.Image(content=content)
request = types.image_annotator_pb2.AnnotateImageRequest(image=image, features=features)
response = client.annotate_image(request)
print(response)
基于this document,我使用了下面的代码,我能够得到text: "cat\n"
作为输出:
from pprint import pprint
# Imports the Google Cloud client library
from google.cloud import vision
# Instantiates a client
client = vision.ImageAnnotatorClient()
# The name of the image file to annotate
response = client.annotate_image({
'image': {'source': {'image_uri': 'gs://<your_bucket>/ORW90.png'}},
'features': [{'type': vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION}],
})
pprint(response)
希望对您有所帮助。
我有一个像这样的二进制文本图像black on white text - cat
我想对这些图像进行 OCR。它们不超过一个词。 我试过 tesseract 和 Google cloud vision 但它们都 return 没有结果。 我正在使用 python 3.6 和 Windows 10.
# export GOOGLE_APPLICATION_CREDENTIALS=kyourcredentials.json
import io
import cv2
from PIL import Image
# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types
# Instantiates a client
client = vision.ImageAnnotatorClient()
with io.open("test.png", 'rb') as image_file:
content = image_file.read()
image = types.Image(content=content)
response = client.text_detection(image=image)
texts = response.text_annotations
resp = ''
for text in texts:
resp+=' ' + text.description
print(resp)
from PIL import Image as im
import pytesseract as ts
print(ts.image_to_string(im.fromarray(canvas.reshape((480,640)),'L'))) # canvas contains the Mat object from which the image is saved to png
这张图片对于两者中的任何一个来说都应该是一个简单的任务,我觉得我的代码中遗漏了一些东西。请帮帮我!
编辑:
感谢 F10 为我指明了正确的方向。这就是我使用本地图像的方式。
# export GOOGLE_APPLICATION_CREDENTIALS=kyourcredentials.json
import io
import cv2
from PIL import Image
# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types
from google.cloud.vision import enums
# Instantiates a client
client = vision.ImageAnnotatorClient()
with io.open("test.png", 'rb') as image_file:
content = image_file.read()
features = [
types.Feature(type=enums.Feature.Type.DOCUMENT_TEXT_DETECTION)
]
image = types.Image(content=content)
request = types.image_annotator_pb2.AnnotateImageRequest(image=image, features=features)
response = client.annotate_image(request)
print(response)
基于this document,我使用了下面的代码,我能够得到text: "cat\n"
作为输出:
from pprint import pprint
# Imports the Google Cloud client library
from google.cloud import vision
# Instantiates a client
client = vision.ImageAnnotatorClient()
# The name of the image file to annotate
response = client.annotate_image({
'image': {'source': {'image_uri': 'gs://<your_bucket>/ORW90.png'}},
'features': [{'type': vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION}],
})
pprint(response)
希望对您有所帮助。