Format OCR text annotation from Cloud Vision API in Python

Question

I'm using the Google Cloud Vision API for Python in a small program. The function works and I get the OCR results back, but I need to format them before I can use them.

This is the function:

# Call to OCR API
def detect_text_uri(uri):
    """Detects text in the file located in Google Cloud Storage or on the Web.
    """
    client = vision.ImageAnnotatorClient()
    image = types.Image()
    image.source.image_uri = uri

    response = client.text_detection(image=image)
    texts = response.text_annotations

    for text in texts:
        textdescription = ("    "+ text.description )
        return textdescription

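Specifically, I need to split the text line by line, adding four spaces at the start of each line and a newline at the end. At the moment this only works for the first line; the rest comes back as a single-line blob.

I have been reading the official documentation, but I haven't really understood the format of the API response.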

Answer 1

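You're almost there. Since you want to split the text line by line, don't loop over the text annotations; instead, read the 'description' directly from the Google Vision response, as shown below.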

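from google.cloud import vision


def parse_image(image_path=None):
    """
    Parse the image using the Google Cloud Vision API (text detection).
    :param image_path: path of the image
    :return: text content
    :rtype: str
    """

    client = vision.ImageAnnotatorClient()
    # Passing an open file object relies on the helper in older client releases
    # (the ones that also expose `types`); newer releases may need
    # vision.Image(content=...) instead.
    with open(image_path, 'rb') as image_file:
        response = client.text_detection(image=image_file)
    text = response.text_annotations
    del response     # to clean-up the system memory

    # The first annotation holds the full detected text.
    return text[0].description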

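The function above returns the content of the image as a single string, with lines separated by "\n".

Now you can add a prefix and a suffix to each line as needed.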
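For context on why the loop in the question produced only one value: the first entry of text_annotations carries the whole detected text, the later entries carry individual words, and a return inside a for loop exits on the first iteration. The sketch below imitates that shape with a stand-in class so it runs without credentials. FakeAnnotation, first_only and the sample strings are hypothetical and not part of the Vision client.

```python
from dataclasses import dataclass


@dataclass
class FakeAnnotation:
    """Hypothetical stand-in for one entry of response.text_annotations."""
    description: str


# Assumed shape: entry 0 is the full text, later entries are single words.
texts = [
    FakeAnnotation("Hello world\nSecond line"),
    FakeAnnotation("Hello"),
    FakeAnnotation("world"),
    FakeAnnotation("Second"),
    FakeAnnotation("line"),
]


def first_only(annotations):
    # Mirrors the loop in the question: return fires on the first iteration,
    # so only entry 0 is ever prefixed and the other entries are never seen.
    for text in annotations:
        return "    " + text.description


result = first_only(texts)

# Reading entry 0 directly gives the same full text without a loop.
full_text = texts[0].description
```

Here result is the full text with the four spaces in front of the first line only, which matches the behaviour described in the question.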

image_content = parse_image(image_path=r"path\to\image")

my_formatted_text = ""
for line in image_content.split("\n"):
    my_formatted_text += "    " + line + "\n"

my_formatted_text is the text you need.
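As a variation, the same formatting can be wrapped in a small helper. This is only a sketch: format_lines and its prefix and suffix parameters are hypothetical names, and the sample string stands in for whatever the Vision call returns. It uses str.splitlines() rather than split("\n"), so a description that ends with a newline does not produce an extra line containing only the prefix.

```python
def format_lines(text, prefix="    ", suffix="\n"):
    """Wrap every line of text with prefix and suffix (hypothetical helper)."""
    # splitlines() handles "\n" and "\r\n" and ignores a trailing newline.
    return "".join(prefix + line + suffix for line in text.splitlines())


# Sample input standing in for the OCR result; note the trailing newline.
sample = "first line\nsecond line\n"
formatted = format_lines(sample)
print(formatted, end="")
```

Building the result with "".join also avoids repeated string concatenation in a loop, which matters little for a single image but keeps the helper tidy.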

python

google-cloud-platform

google-cloud-vision