Google Cloud Vision, lumping lines together

Question

I'm testing Google Cloud Vision. I want it to read the whole page line by line, in order. Here is the code.

url = 'https://www.sec.gov/Archives/edgar/data/1633917/000163391720000091/q120paypalearningsreleas013.jpg'

def detect_text_uri(uri):
    """Detects text in the file located in Google Cloud Storage or on the Web.
    """
    from google.cloud import vision
    client = vision.ImageAnnotatorClient()
    image = vision.types.Image()
    image.source.image_uri = uri

    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')

    for text in texts:
        print('\n"{}"'.format(text.description))

        vertices = (['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices])

        print('bounds: {}'.format(','.join(vertices)))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

if __name__ == '__main__': detect_text_uri(url)

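You can see that it works well until it reaches "Payment transactions per active account", and then it lumps that line together with the next one. From that point on it no longer goes line by line.

How can I fix this? The trouble is that, as far as I can tell from the documentation, I'm already using the text detection feature. I'm not sure how to improve the results any further.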

Answer 1

Google Vision can't be configured at this level.

You have two options for reading text in a document:

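TEXT_DETECTION: Run text detection / optical character recognition (OCR). Text detection is optimized for areas of text within a larger image; if the image is a document, use DOCUMENT_TEXT_DETECTION instead.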

DOCUMENT_TEXT_DETECTION: Run dense text document OCR. Takes precedence when both DOCUMENT_TEXT_DETECTION and TEXT_DETECTION are present.
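
Switching the question's code from TEXT_DETECTION to DOCUMENT_TEXT_DETECTION is a small change in the Python client. The following is a minimal sketch rather than a verified fix for this particular image. The function name `detect_document_text` and the optional `client` parameter are illustrative, and passing the image as a plain dict assumes a client library version that accepts dicts in place of `Image` messages.

```python
def detect_document_text(uri, client=None):
    """Run DOCUMENT_TEXT_DETECTION on an image URI and return the full text."""
    if client is None:
        # Imported lazily so the module loads even without the library installed.
        from google.cloud import vision
        client = vision.ImageAnnotatorClient()

    # Same request as in the question, but using the dense-document OCR feature.
    # Assumption: the client accepts a dict describing the image source.
    response = client.document_text_detection(
        image={"source": {"image_uri": uri}}
    )

    # Check for an API-level error before reading any results.
    if response.error.message:
        raise RuntimeError(response.error.message)

    # full_text_annotation.text holds the whole page as one string,
    # with line breaks where the service detected them.
    return response.full_text_annotation.text
```

Printing the returned string for both features side by side is a quick way to see whether the row in question is split differently.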

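If TEXT_DETECTION and DOCUMENT_TEXT_DETECTION return the same answer and it still doesn't satisfy you, you'll have to modify the image yourself.

For example, with the Cloud demo API you can see the result right away.

I changed the image slightly and got a better result for that particular line.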

Img (cropped and with extra contrast), result

Keep in mind this is only an example; you'll need to find a suitable way to modify your images.
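
The cropping and contrast change described above can be scripted. This is only a sketch, assuming the Pillow library is installed; the function name `crop_and_boost_contrast`, the crop box, and the contrast factor are illustrative choices, not the values used for the image in this answer.

```python
from PIL import Image, ImageEnhance


def crop_and_boost_contrast(img, box, factor=2.0):
    """Crop `img` to `box` and return a grayscale copy with stronger contrast.

    box    -- (left, upper, right, lower) in pixels, as used by Image.crop
    factor -- 1.0 leaves contrast unchanged, larger values increase it
    """
    region = img.crop(box)       # keep only the part of the page of interest
    gray = region.convert("L")   # grayscale, so only brightness is adjusted
    return ImageEnhance.Contrast(gray).enhance(factor)
```

A call such as `crop_and_boost_contrast(Image.open("page.jpg"), (0, 400, 1200, 700)).save("rows.png")` would write the adjusted region to a new file, which can then be sent to the API; the file names and coordinates here are hypothetical.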

Edit: it may be worth exploring Document AI.

Answer 2

The other answer is correct, but I'd like to point out that Document AI Table Parsing [beta] is the solution you want.

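Table parsing will read your table and give you the proper line breaks. I used the demo on your image and it read the table correctly, with no errors. The demo requires a PDF, but the API will take a JPG directly.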

For best results, also provide the table's bounding polygon, although in most cases it will work it out anyway:

bounding-poly (optional): A bounding box hint for a table on the page. This field is intended for complex cases when the model may have difficulty locating the table. The values must be normalized [0,1]. Object format:
{"x": X_MIN,"y": Y_MIN}, {"x": X_MAX,"y": Y_MIN},{"x": X_MAX,"y": Y_MAX},{"x": X_MIN,"y": Y_MAX}
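
Since the hint has to be normalized to [0,1], a pixel-space box needs to be divided by the image dimensions first. Below is a small sketch of that conversion; the helper name `normalized_bounding_poly` and its parameters are hypothetical, and it only builds the vertex list in the order quoted above, without calling Document AI.

```python
def normalized_bounding_poly(x_min, y_min, x_max, y_max, width, height):
    """Convert a pixel-space box into four normalized vertices.

    The vertices are returned in the order of the quoted format:
    top-left, top-right, bottom-right, bottom-left.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not (0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height):
        raise ValueError("box must lie inside the image")

    # Divide by the image size so every coordinate falls in [0, 1].
    left, right = x_min / width, x_max / width
    top, bottom = y_min / height, y_max / height

    return [
        {"x": left, "y": top},
        {"x": right, "y": top},
        {"x": right, "y": bottom},
        {"x": left, "y": bottom},
    ]
```

The pixel coordinates can be read off in any image viewer; the image width and height are needed because the API expects fractions of the page rather than pixels.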

Note: the other answer also mentions Document AI, in its edit.

python

google-cloud-vision