Google Cloud Vision API error reading pdf

Question

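I am currently trying to process a large PDF document with the Google Cloud Vision API. While reading the document, I get an error message that says "json_format.Parse( error". I have attached my code below. How can I resolve this?

Code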

Answer 1

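You get the error on that line of code because you are passing json_string (the value returned by download_as_string(), which is a bytes object) together with a nonexistent object, vision.types.AnnotateFilesResponse(), to json_format.Parse(), which expects: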

google.protobuf.json_format.Parse(text, message, ignore_unknown_fields=False, descriptor_pool=None)

Parses a JSON representation of a protocol message into a message.

Parameters:

text – Message JSON representation.

message – A protocol buffer message to merge into.

ignore_unknown_fields – If True, do not raise errors for unknown fields.

descriptor_pool – A Descriptor Pool for resolving types. If None use the default.

Returns: The same message passed as argument.

Raises: ParseError: On JSON parsing problems.

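Since your goal is to read the response from async_batch_annotate_files(), note that the JSON response from this method is saved to the Cloud Storage bucket output location you defined. You can read and parse the data in json_string by converting it to a dictionary. You can then work your way through the dictionary by following the AnnotateFileResponse reference. Use the following code: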

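import json

# blob_list holds the output blobs listed from the Cloud Storage destination.
output = blob_list[0]
json_string = output.download_as_string()
# Parse the JSON into a plain dictionary instead of a protobuf message.
response = json.loads(json_string)
# 'responses' holds one entry per page; take the first page.
first_page_response = response['responses'][0]
annotation = first_page_response['fullTextAnnotation']

print('Full text:\n')
print(annotation['text'])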
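
The snippet above reads only the first page of one output file. As a minimal sketch, assuming the output JSON has the shape used above (a top-level 'responses' list whose entries may carry fullTextAnnotation.text), the following collects the text of every page. The helper name extract_page_texts and the sample payload are made up for illustration, and pages without detected text are assumed to lack the fullTextAnnotation key.

```python
import json


def extract_page_texts(json_bytes):
    """Return the text of every page in one output file (hypothetical helper)."""
    # json.loads accepts bytes as well as str, so the downloaded blob
    # content can be passed in directly.
    response = json.loads(json_bytes)
    texts = []
    for page_response in response.get("responses", []):
        # Assumption: a page with no detected text has no fullTextAnnotation.
        annotation = page_response.get("fullTextAnnotation", {})
        texts.append(annotation.get("text", ""))
    return texts


# Made-up payload that mimics the structure used above; a real payload
# would come from output.download_as_string().
sample = json.dumps({
    "responses": [
        {"fullTextAnnotation": {"text": "Page one"}},
        {},
        {"fullTextAnnotation": {"text": "Page three"}},
    ]
}).encode("utf-8")

for number, text in enumerate(extract_page_texts(sample), start=1):
    print(f"Page {number}: {text!r}")
```

Using .get() with defaults keeps one empty page from aborting the whole document with a KeyError.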

Note: Just make sure you are picking the correct JSON response file (output = blob_list[0]); otherwise parsing the result will raise an error.
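
One way to guard against picking the wrong file is to filter the listing by name before downloading anything. This is a minimal sketch, assuming the result files end in .json and that the prefix listing may also contain other objects; the function name select_json_outputs and the sample object names are hypothetical.

```python
def select_json_outputs(blob_names):
    """Keep only names that look like JSON output files (hypothetical helper)."""
    # Assumption: result files end in ".json"; anything else under the
    # prefix (folder placeholders, the source PDF) is skipped.
    # Note that sorted() orders names lexicographically, not by page number.
    return sorted(name for name in blob_names if name.endswith(".json"))


# Made-up object names standing in for [blob.name for blob in blob_list].
names = [
    "results/",
    "results/output-1-to-2.json",
    "results/input.pdf",
    "results/output-3-to-4.json",
]

outputs = select_json_outputs(names)
if not outputs:
    raise SystemExit("No JSON output found under the prefix")
print(outputs[0])
```

With real blobs, the same check can be applied to blob.name before calling download_as_string().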


google-cloud-vision