Google Cloud Vision accuracy for each text returns 0.0

Question

I am using Google Cloud Vision OCR to detect text in an image. I tried reading .confidence on the text annotations that Google returns, but it is always 0.0.

response = client.document_text_detection(image=image_googlecloud)
texts = response.text_annotations

texts[0].confidence == 0.0

### This is part of the output of the response variable (the last few lines) ###
                y: 2657
              }
            }
            text: "E"
            confidence: 1.0
          }
          confidence: 0.9900000095367432
        }
        confidence: 0.9900000095367432
      }
      block_type: TEXT
      confidence: 0.9900000095367432
    }
  }

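When I print the response variable, it contains all the confidence values (all greater than 0.0), but when I try to get the confidence of a word (using the approach above), it returns 0.0. Is there a way to get the confidence of each word?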

Answer 1

The text extracted by DOCUMENT_TEXT_DETECTION follows this hierarchy:

TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol.

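So, to get the confidence of each word, you have to iterate over the structural components. The confidence values shown in your printed response live under response.full_text_annotation, not under response.text_annotations.

You can refer to the code below to get the confidence of each word.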
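The traversal described above can be sketched as a small generator that works on anything shaped like response.full_text_annotation. The names iter_word_confidences and fake_annotation are hypothetical helpers introduced here for illustration; the stand-in object only mimics the attribute names (pages, blocks, paragraphs, words, symbols, confidence) so that the walk can be tried without credentials.

```python
from types import SimpleNamespace
from typing import Any, Iterator, Tuple


def iter_word_confidences(annotation: Any) -> Iterator[Tuple[str, float]]:
    """Yield (word_text, confidence) for each word under `annotation`.

    `annotation` is assumed to look like `response.full_text_annotation`:
    pages -> blocks -> paragraphs -> words -> symbols.
    """
    for page in annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    # Rebuild the word text by joining its symbols.
                    text = "".join(symbol.text for symbol in word.symbols)
                    yield text, word.confidence


def fake_annotation() -> SimpleNamespace:
    """Hypothetical stand-in for a real annotation, for offline testing only."""
    word = SimpleNamespace(
        symbols=[SimpleNamespace(text="H"), SimpleNamespace(text="i")],
        confidence=0.99,
    )
    paragraph = SimpleNamespace(words=[word])
    block = SimpleNamespace(paragraphs=[paragraph])
    page = SimpleNamespace(blocks=[block])
    return SimpleNamespace(pages=[page])


if __name__ == "__main__":
    for text, confidence in iter_word_confidences(fake_annotation()):
        print(f"{text}: {confidence}")
```

With a real response, the same generator would be called as iter_word_confidences(response.full_text_annotation).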

Text in my image: "Good morning. A journey of a thousand miles begins with a single step."

Code:

def detect_document_uri(uri):
    """Print each detected word and its confidence for an image
    stored in Google Cloud Storage."""
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    image = vision.Image()
    image.source.image_uri = uri

    response = client.document_text_detection(image=image)

    # Check for an API error before reading the annotations.
    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

    # Walk the hierarchy: page -> block -> paragraph -> word -> symbol.
    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    # Rebuild the word text from its symbols.
                    word_text = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Word text: {} (confidence: {})'.format(
                        word_text, word.confidence))


detect_document_uri("gs://your_bucket_name/image.jpg")

Output: (screenshot not preserved)

Code for a local machine:

def detect_document(path):
    """Print each detected word and its confidence for a local image."""
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    # Read the image bytes from disk.
    with open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.document_text_detection(image=image)

    # Check for an API error before reading the annotations.
    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

    # Walk the hierarchy: page -> block -> paragraph -> word -> symbol.
    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    # Rebuild the word text from its symbols.
                    word_text = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Word text: {} (confidence: {})'.format(
                        word_text, word.confidence))


detect_document("path of image from local machine")

Output: (screenshot not preserved)
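The response dump in the question also shows a confidence on each symbol (for example "E" with 1.0), one level below the word. If per-character values are wanted, the same walk can go one step deeper. This is only a sketch: symbol_confidences and fake_annotation are hypothetical names, and the stand-in object merely copies the attribute names from the hierarchy so the function can be run offline.

```python
from types import SimpleNamespace
from typing import Any, List, Tuple


def symbol_confidences(annotation: Any) -> List[Tuple[str, float]]:
    """Return (symbol_text, confidence) for every symbol under `annotation`.

    `annotation` is assumed to look like `response.full_text_annotation`.
    """
    return [
        (symbol.text, symbol.confidence)
        for page in annotation.pages
        for block in page.blocks
        for paragraph in block.paragraphs
        for word in paragraph.words
        for symbol in word.symbols
    ]


def fake_annotation() -> SimpleNamespace:
    """Hypothetical stand-in mirroring the tail of the dump in the question."""
    symbol = SimpleNamespace(text="E", confidence=1.0)
    word = SimpleNamespace(symbols=[symbol], confidence=0.99)
    paragraph = SimpleNamespace(words=[word], confidence=0.99)
    block = SimpleNamespace(paragraphs=[paragraph], confidence=0.99)
    page = SimpleNamespace(blocks=[block])
    return SimpleNamespace(pages=[page])


if __name__ == "__main__":
    for text, confidence in symbol_confidences(fake_annotation()):
        print(f"{text}: {confidence}")
```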


google-cloud-vision