Annotating image file to text using googleapiclient

Question

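I am trying to use Google Cloud services to annotate a local image file. I followed the instructions given here https://cloud.google.com/natural-language/docs/reference/libraries and set up the Google API. The test example given on that page runs without any problem. However, when I try to actually annotate a file I get an error. This is the code I am using: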

import base64
import logging

import googleapiclient.discovery
import googleapiclient.errors

files = []
files.append("/opt/lampp/htdocs/test.jpg")

def get_text_from_files(fileNames):
    texts = detect_text(fileNames)
    return texts

def detect_text(fileNames):
    max_results = 6
    num_retries = 3
    service = googleapiclient.discovery.build('language', 'v1')
    batch_request = []
    for filename in fileNames:
        request = {
            'image': {},
            'features': [{
                'type': 'TEXT_DETECTION',
                'maxResults': max_results,
            }]
        }

        # The API expects the image bytes as a base64-encoded string
        with open(filename, 'rb') as image_file:
            request['image']['content'] = base64.b64encode(image_file.read()).decode('UTF-8')
        batch_request.append(request)

    request = service.images().annotate(body={'requests': batch_request})

    try:
        responses = request.execute(num_retries=num_retries)
        if 'responses' not in responses:
            return {}

        # Responses come back in the same order as the requests
        text_response = {}
        for filename, response in zip(
                fileNames, responses['responses']):

            if 'error' in response:
                logging.error('API Error for {}: {}'.format(
                    filename,
                    response['error'].get('message', '')))
                continue

            text_response[filename] = response.get('textAnnotations', [])

        return text_response

    except googleapiclient.errors.HttpError as e:
        print('Http Error: {}'.format(e))
    except KeyError as e2:
        print('Key error: {}'.format(e2))


get_text_from_files(files)

But I get an error. The stack trace is given below:

Traceback (most recent call last):
  File "test.py", line 68, in <module>
    get_text_from_files(pdf);
  File "test.py", line 21, in get_text_from_files
    texts = detect_text(fileNames);
  File "test.py", line 41, in detect_text
    request = service.images().annotate(body={'requests': batch_request});
AttributeError: 'Resource' object has no attribute 'images'

Thanks in advance.

Answer 1

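Notice that you are building the client for the wrong Google API. You are using the Natural Language API, while the one that you want to use is the Vision API. The error message AttributeError: 'Resource' object has no attribute 'images' means that the resource associated with the Language API does not have any images attribute. To solve the issue, making the following change should be enough: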

# Wrong API being used
service = googleapiclient.discovery.build('language', 'v1')
# Correct API being used
service = googleapiclient.discovery.build('vision', 'v1')
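
For context, here is a minimal sketch of how the corrected call might fit together. The helper names build_text_detection_body and annotate_files are hypothetical (they are not from the original post), and the sketch assumes google-api-python-client is installed and application default credentials are configured. Building the request body is kept separate from the network call so that it can be checked offline.

```python
import base64


def build_text_detection_body(file_names, max_results=6):
    """Build the JSON body for images().annotate() (hypothetical helper)."""
    requests = []
    for file_name in file_names:
        # The REST API expects the image bytes as a base64-encoded string.
        with open(file_name, 'rb') as image_file:
            content = base64.b64encode(image_file.read()).decode('UTF-8')
        requests.append({
            'image': {'content': content},
            'features': [{'type': 'TEXT_DETECTION', 'maxResults': max_results}],
        })
    return {'requests': requests}


def annotate_files(file_names, num_retries=3):
    """Send the batch to the Vision API; needs network access and credentials."""
    # Imported here so the body builder above works without the library.
    import googleapiclient.discovery

    # 'vision' (not 'language') is the service that exposes images().annotate().
    service = googleapiclient.discovery.build('vision', 'v1')
    request = service.images().annotate(body=build_text_detection_body(file_names))
    return request.execute(num_retries=num_retries)
```

Calling annotate_files(["/opt/lampp/htdocs/test.jpg"]) should then return a dictionary with a 'responses' list, one entry per image, which is the shape the code in the question already expects.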

In this Google API Client Libraries page, you will find the whole list of available APIs along with their names and versions. And here, there's the complete documentation for the Vision API legacy API Client Library.

Finally, let me recommend the usage of the idiomatic Client Libraries instead of the legacy API Client Libraries. They are much more intuitive to use, and there are some good documentation references in their GitHub page.
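
As a rough illustration of that recommendation, text detection with the google-cloud-vision client library could look something like the sketch below. It assumes google-cloud-vision version 2 or later is installed and credentials are configured. The names detect_text_in_file and annotation_texts are hypothetical helpers introduced only for this example.

```python
def detect_text_in_file(path):
    """Return the text annotations for one local image (needs credentials)."""
    # Imported lazily so annotation_texts() below can be used without the library.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    # The client library takes raw bytes; no manual base64 encoding is needed.
    with open(path, 'rb') as image_file:
        image = vision.Image(content=image_file.read())
    response = client.text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.text_annotations


def annotation_texts(annotations):
    """Collect the 'description' string of each annotation."""
    return [annotation.description for annotation in annotations]
```

With this approach there is no service name to get wrong: the Vision client is selected by the import itself, so the AttributeError from the question cannot occur in the same way.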
