Too many textAnnotations returned by Google Vision

I tried requesting TEXT_DETECTION with maxResults set to 1. Here is an example of the request body JSON:

{
  "requests": [
    {

      "image": {
          "content": "",
          "source": {
              "gcsImageUri": "",
              "imageUri": "https://www.optumhealthfinancial.com/content/dam/optumhealthfinancial/Images/receipts.gif"
            }
        },
      "features": [
        {
          "type": "TEXT_DETECTION",
          "maxResults": 1
        }
      ]
    }
  ]
}

However, textAnnotations in the response contains more than one record, and the response is over 1 MB in size.

From the description of Text detection responses:

A TEXT_DETECTION response includes the detected phrase, its bounding box, and individual words and their bounding boxes

So every word in your sample image has its own bounding box. Also, from the definition of TextAnnotation:

TextAnnotation contains a structured representation of OCR extracted text. The hierarchy of an OCR extracted text structure is like this: TextAnnotation -> Page -> Block -> Paragraph -> Word ->

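The size of the result depends on how much information the sample image contains. maxResults is meant for cases where multiple results may exist (faceAnnotation, textAnnotations), as described here. You are not getting multiple results; you are getting one result for each word in the detected paragraph.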
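To make that structure concrete, here is a minimal Python sketch. It assumes the response has already been parsed into a dict and that the first entry of textAnnotations holds the whole detected text while the remaining entries are individual words. The helper name `summarize_text_annotations` and the sample response are made up for illustration and are not part of the API.

```python
from typing import Any, Dict, Tuple


def summarize_text_annotations(response: Dict[str, Any]) -> Tuple[str, int]:
    """Return (full_text, word_entry_count) for one item of the `responses` list.

    Assumption: entry 0 of textAnnotations is the whole detected text,
    and every later entry is a single word with its own bounding box.
    """
    annotations = response.get("textAnnotations", [])
    if not annotations:
        return "", 0
    full_text = annotations[0].get("description", "")
    return full_text, len(annotations) - 1


# Hypothetical, heavily shortened response for illustration only.
sample_response = {
    "textAnnotations": [
        {"description": "TOTAL 12.50"},  # whole detected text
        {"description": "TOTAL"},        # per-word entry
        {"description": "12.50"},        # per-word entry
    ]
}

full_text, word_count = summarize_text_annotations(sample_response)
print(full_text)
print(word_count)
```

If only the text itself is needed, keeping just the first entry and discarding the per-word entries on the client side is one way to work with a much smaller payload.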

If you want a smaller result, run the request with DOCUMENT_TEXT_DETECTION:

{
  "requests": 
  [
    {
      "image": 
      {
        "content": "",
        "source": 
        {
          "gcsImageUri": "",
          "imageUri": "https://www.optumhealthfinancial.com/content/dam/optumhealthfinancial/Images/receipts.gif"
        }
      },
      "features": 
      [
        {
          "type": "DOCUMENT_TEXT_DETECTION",
          "maxResults": 1
        }
      ]
    }
  ]
}

maxResults does not apply to TEXT_DETECTION:

Maximum number of results of this type. Does not apply to TEXT_DETECTION, DOCUMENT_TEXT_DETECTION, or CROP_HINTS.
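Since maxResults has no effect for these feature types, the field can simply be left out of the request. The following Python sketch builds such a body. The helper name `build_text_detection_request` is hypothetical, and dropping the empty `content` and `gcsImageUri` fields is an assumption made for readability, not something the answer above requires.

```python
import json
from typing import Any, Dict


def build_text_detection_request(
    image_uri: str, feature_type: str = "TEXT_DETECTION"
) -> Dict[str, Any]:
    """Build a request body for one image, without maxResults."""
    return {
        "requests": [
            {
                # Only imageUri is set; empty content/gcsImageUri are omitted.
                "image": {"source": {"imageUri": image_uri}},
                # maxResults is omitted because it does not apply here.
                "features": [{"type": feature_type}],
            }
        ]
    }


body = build_text_detection_request(
    "https://www.optumhealthfinancial.com/content/dam/"
    "optumhealthfinancial/Images/receipts.gif"
)
print(json.dumps(body, indent=2))
```

Any size reduction then has to happen after the response arrives, for example by keeping only the fields you actually use.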