How to read single-column text with Google Cloud Vision API

I have the following document image.

When I try to convert the image to text, the result is the following:

Top text

Ref: Rad: Dte: Ddo:

Ejecutivo 76520400300 Banco de Bogotá Luz Adriana

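Bottom text

The problem is that the Google API reads it as if it were two columns. How can I configure the Google API to get the text as a single column?

My goal is:

Top text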

Ref: Ejecutivo Rad: 76520400300 Dte: Banco de Bogotá Ddo: Luz Adriana

Bottom text

The Cloud Vision API has no request property for specifying the format or order in which a document's data is read. As a workaround, I think you can use the BoundingPoly and Vertex response properties, which give the coordinates of each word found in the image. You can process that vertex data in your own code to decide which text should be grouped into columns and rows. You can take a look at this link, which includes some response examples containing these properties.
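The workaround above could be sketched roughly like this. It is only an illustration, not an official method: `Word`, `box`, `group_into_rows`, and the `tolerance` value are hypothetical names and choices, and the sample coordinates are made up. With the Python client, the words would typically come from `response.text_annotations[1:]`, where each entry has a `description` and `bounding_poly.vertices` with `x` and `y`. The sketch also assumes the page is not rotated or skewed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One detected word and the (x, y) corners of its bounding polygon."""
    text: str
    vertices: List[Tuple[int, int]]

    @property
    def left(self) -> float:
        return min(x for x, _ in self.vertices)

    @property
    def center_y(self) -> float:
        ys = [y for _, y in self.vertices]
        return (min(ys) + max(ys)) / 2

    @property
    def height(self) -> float:
        ys = [y for _, y in self.vertices]
        return max(ys) - min(ys)


def group_into_rows(words: List[Word], tolerance: float = 0.5) -> List[str]:
    """Group words with similar vertical centers, then read each row left to right."""
    rows: List[List[Word]] = []
    for word in sorted(words, key=lambda w: w.center_y):
        if rows:
            last = rows[-1]
            row_y = sum(w.center_y for w in last) / len(last)
            row_h = max(w.height for w in last)
            # Same row if the centers differ by less than a fraction of the text height.
            if abs(word.center_y - row_y) <= tolerance * max(row_h, word.height):
                last.append(word)
                continue
        rows.append([word])
    return [" ".join(w.text for w in sorted(row, key=lambda w: w.left)) for row in rows]


def box(x: int, y: int, w: int = 80, h: int = 20) -> List[Tuple[int, int]]:
    """Build a rectangular polygon (made-up coordinates for the demo)."""
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]


# Labels sit in a left column, values in a right column, as in the question.
sample = [
    Word("Ref:", box(10, 100)),
    Word("Rad:", box(10, 130)),
    Word("Ejecutivo", box(300, 102)),
    Word("76520400300", box(300, 131)),
]

for line in group_into_rows(sample):
    print(line)
```

Grouping by vertical center first and sorting by the left edge afterwards pairs each label with the value on the same line, whichever column the API reported first. The tolerance would likely need tuning for real scans.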

If this does not cover your current needs, you can use the Send feedback button in the service's public documentation, or use the Issue Tracker tool to raise a Vision API feature request and let Google know about the feature you need.

A Google team member replied that Document AI works better than Cloud Vision for this, as per the update on the issue.

ocr

text-recognition

google-cloud-vision