Async Batch Google Vision Text Detection

Question

I'm trying to detect text from a batch of 15 documents at once using Google Vision, and I'd like it to run asynchronously.

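Unfortunately, client.async_batch_annotate_images() takes just as long as client.batch_annotate_images(), and just as long as calling client.document_text_detection() on each image.

I'm not sure why the response takes so long. Maybe I'm doing something wrong, and I'd love to hear your expert opinion.

For example, this is how I extract text using batch annotate images: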

        import io

        from google.cloud import vision


        def batch_ocr_images(images_files):
            # One client is reused for the whole batch.
            client = vision.ImageAnnotatorClient()

            requests = []
            features = [vision.Feature(type_=vision.Feature.Type.DOCUMENT_TEXT_DETECTION)]
            for image_file in images_files:
                # Read the raw bytes of each image from disk.
                with io.open(image_file, 'rb') as image_data:
                    content = image_data.read()

                image = vision.Image(content=content)
                request = vision.AnnotateImageRequest(image=image, features=features)
                requests.append(request)

            # A single synchronous call that annotates every image in the list.
            return client.batch_annotate_images(requests=requests)

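Is this how it is supposed to work asynchronously? It runs at the same speed as when I iterate over the images and scan them one at a time.

Thanks in advance, Yaniv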

Answer 1

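The speed is expected to be the same, because it is determined by the configuration on Google's back end. There is no option on the user/developer's end to speed it up. The only difference between asynchronous and synchronous is that with asynchronous you can run the request offline, and you can specify a larger batch of files than with synchronous. To read more about batch image annotation, you can refer to this GCP Documentation.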
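To make the "offline" part concrete, here is a minimal sketch of how the asynchronous call might be wrapped. The helper name `submit_async_batch`, the output prefix, and the default `batch_size` and `timeout` values are assumptions for illustration, not part of the original post. The `client` is passed in, so the same function works with a real `vision.ImageAnnotatorClient()` or with a stand-in object. The official sample for this method reads images from Cloud Storage URIs, so inline image bytes as in the question may or may not be accepted.

```python
def submit_async_batch(client, requests, output_uri, batch_size=2, timeout=300):
    """Submit an offline batch and wait until Google's back end finishes it.

    client     -- a vision.ImageAnnotatorClient (or an object with the same method)
    requests   -- list of AnnotateImageRequest objects
    output_uri -- hypothetical Cloud Storage prefix, e.g. "gs://my-bucket/ocr/"
    """
    output_config = {
        "gcs_destination": {"uri": output_uri},
        # Number of image responses written into each output JSON file.
        "batch_size": batch_size,
    }
    # Returns a long-running operation right away; the OCR runs server-side.
    operation = client.async_batch_annotate_images(
        requests=requests, output_config=output_config
    )
    # Blocking here takes as long as the processing itself does:
    # the asynchronous method does not make the OCR any faster.
    response = operation.result(timeout=timeout)
    # Results are written as JSON files under this prefix, not returned inline.
    return response.output_config.gcs_destination.uri
```

With the real library, `client` would be `vision.ImageAnnotatorClient()`, and the annotations would have to be read back from the files under the returned prefix. The benefit is that the caller can skip `operation.result()` and collect the output later, not that the total processing time goes down.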

python

python-3.x

google-cloud-vision