Object Localization - Google Vision API

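I'm trying to recreate this:

The problem is that the data returned by the `objectLocalization` API call only contains normalized vertices (`normalizedVertices`); the `vertices` array is empty.

So far my code is exactly the same as the example: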

const vision = require('@google-cloud/vision');

const client = new vision.ImageAnnotatorClient();

// Path to a local image file (despite the variable name, this is not a URL).
const imageURL = './Laptop_2D00_and_2D00_Tablet_2D00_Table_5F00_69BC66ED.jpg';

client
    .objectLocalization(imageURL)
    .then(results => {
        const objects = results[0].localizedObjectAnnotations;
        objects.forEach(object => {
            console.log(`Name: ${object.name}`);
            console.log(`Confidence: ${object.score}`);
            // Each vertex is normalized to the range [0, 1] relative to the image size.
            const vertices = object.boundingPoly.normalizedVertices;
            vertices.forEach(v => console.log(`x: ${v.x}, y:${v.y}`));
        });
    })
    .catch(err => {
        console.error('ERROR: ', err);
    });

The result is:

Name: Laptop
Confidence: 0.8650877475738525
x: 0.004973180592060089, y:0.27008256316185
x: 0.18256860971450806, y:0.27008256316185
x: 0.18256860971450806, y:0.5250381827354431
x: 0.004973180592060089, y:0.5250381827354431
Name: Computer keyboard
Confidence: 0.732001006603241
x: 0.20447060465812683, y:0.6251764893531799
x: 0.5232940912246704, y:0.6251764893531799
x: 0.5232940912246704, y:0.9054117202758789
x: 0.20447060465812683, y:0.9054117202758789
Name: Person
Confidence: 0.6957111954689026
x: 0.9150910377502441, y:0.03288845717906952
x: 0.9932186007499695, y:0.03288845717906952
x: 0.9932186007499695, y:0.31247377395629883
x: 0.9150910377502441, y:0.31247377395629883
Name: Laptop
Confidence: 0.6388971209526062
x: 0.20340178906917572, y:0.3301794230937958
x: 0.4965982437133789, y:0.3301794230937958
x: 0.4965982437133789, y:0.9114677906036377
x: 0.20340178906917572, y:0.9114677906036377
Name: Table
Confidence: 0.5609536170959473
x: 0, y:0.11000002175569534
x: 0.998235285282135, y:0.11000002175569534
x: 0.998235285282135, y:0.9940000176429749
x: 0, y:0.9940000176429749
Name: Computer keyboard
Confidence: 0.5245768427848816
x: 0.012653245590627193, y:0.4093095660209656
x: 0.16077089309692383, y:0.4093095660209656
x: 0.16077089309692383, y:0.5089566707611084
x: 0.012653245590627193, y:0.5089566707611084

I want to map these objects onto the image, but I don't know how to do that with the data provided.

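I've solved this.

If anyone is interested: multiply each normalized vertex by the dimensions of the image, that is, x by the image width and y by the image height.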
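
As a minimal sketch of that scaling step, the helper below converts normalized vertices into pixel coordinates. The name `toPixelVertices` and its parameters are hypothetical and not part of the Vision client library, and defaulting a missing coordinate to 0 is an assumption (the output above always includes both x and y).

```javascript
// Hypothetical helper: scale normalized vertices (range 0..1) to pixel
// coordinates for an image of the given width and height.
function toPixelVertices(normalizedVertices, imageWidth, imageHeight) {
  return normalizedVertices.map(v => ({
    // Assumption: treat a missing coordinate as 0.
    x: (v.x ?? 0) * imageWidth,
    y: (v.y ?? 0) * imageHeight,
  }));
}

// Usage with made-up values on a 512 x 340 image.
const pixels = toPixelVertices([{ x: 0.5, y: 0.25 }], 512, 340);
console.log(pixels[0]); // { x: 256, y: 85 }
```

The same function works for any display size, so the boxes stay aligned as long as the image is drawn at the width and height passed in.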

Example of drawing the bounding boxes back onto the image using HTML5 canvas:

    img.onload = () => {
      // Display size of the canvas; the image is drawn to fill it.
      canvas.width = 512;
      canvas.height = 340.5;

      ctx.drawImage(img, 0, 0, 512, 340.5);
      ctx.strokeStyle = '#ff0000';
      // this.state.imageData is presumably the localizedObjectAnnotations array from the API.
      for (const annotation of this.state.imageData) {
        const vertices = annotation.boundingPoly.normalizedVertices;
        ctx.beginPath();
        // Scale each normalized vertex: x by the canvas width, y by the canvas height.
        ctx.moveTo(vertices[0].x * canvas.width, vertices[0].y * canvas.height);
        for (let j = 1; j < vertices.length; j++) {
          ctx.lineTo(vertices[j].x * canvas.width, vertices[j].y * canvas.height);
        }
        // Draw the final edge back to the first vertex.
        ctx.closePath();
        ctx.stroke();
      }
    }
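
If a plain rectangle is enough (the boxes in the output above are all axis-aligned), a variant is to reduce each polygon to `x`, `y`, `width` and `height` and pass those to the canvas `strokeRect` method. This is only a sketch: `toPixelRect` is a hypothetical helper name, and it assumes the vertices are normalized to the drawn image size.

```javascript
// Hypothetical helper: reduce a normalized bounding polygon to an
// axis-aligned rectangle in pixel coordinates.
function toPixelRect(normalizedVertices, width, height) {
  // Assumption: treat a missing coordinate as 0.
  const xs = normalizedVertices.map(v => (v.x ?? 0) * width);
  const ys = normalizedVertices.map(v => (v.y ?? 0) * height);
  const left = Math.min(...xs);
  const top = Math.min(...ys);
  return {
    x: left,
    y: top,
    width: Math.max(...xs) - left,
    height: Math.max(...ys) - top,
  };
}

// Usage with made-up values on a 400 x 200 canvas.
const rect = toPixelRect(
  [{ x: 0.25, y: 0.5 }, { x: 0.75, y: 0.5 }, { x: 0.75, y: 1 }, { x: 0.25, y: 1 }],
  400,
  200
);
console.log(rect); // { x: 100, y: 100, width: 200, height: 100 }
// In the browser: ctx.strokeRect(rect.x, rect.y, rect.width, rect.height);
```

Having the top-left corner available also makes it straightforward to place a label next to each box, for example with `ctx.fillText`.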