How can I get continuous landmarks from video with MediaPipe Handpose?

I am new to JavaScript. I am trying to get output from MediaPipe Handpose. When I feed an image into the model, I get the output easily, but when I try it on a video it does not work. Here is the head:

<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-core"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-converter"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-webgl"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/handpose"></script>

The details of my video element:

<a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/video">Source</a><br>
<video id="video" src="o_new.mp4" width="640" height="480" controls> 
</video>
<canvas id="canvas" style="overflow:auto"></canvas>

Inside the script:

<script>
  const video = document.getElementById("video");

  async function main() {
    // Load the MediaPipe handpose model (the config must be an object).
    const model = await handpose.load({ maxContinuousChecks: 60 });
    console.log('Model Loaded');

    // Pass in a video stream (or an image, canvas, or 3D tensor) to obtain a
    // hand prediction from the MediaPipe graph.
    const predictions = await model.estimateHands(video);
    console.log('Estimated Hand');
    console.log(predictions);

    if (predictions.length > 0) {
      /*
      `predictions` is an array of objects describing each detected hand, for example:
      [
        {
          handInViewConfidence: 1, // The probability of a hand being present.
          boundingBox: { // The bounding box surrounding the hand.
            topLeft: [162.91, -17.42],
            bottomRight: [548.56, 368.23],
          },
          landmarks: [ // The 3D coordinates of each hand landmark.
            [472.52, 298.59, 0.00],
            [412.80, 315.64, -6.18],
            ...
          ],
          annotations: { // Semantic groupings of the `landmarks` coordinates.
            thumb: [
              [412.80, 315.64, -6.18],
              [350.02, 298.38, -7.14],
              ...
            ],
            ...
          }
        }
      ]
      */

      for (let i = 0; i < predictions.length; i++) {
        const keypoints = predictions[i].landmarks;
        console.log('keypoints Loop');

        // Log hand keypoints.
        for (let j = 0; j < keypoints.length; j++) {
          const [x, y, z] = keypoints[j];
          console.log(`Keypoint ${j}: [${x}, ${y}, ${z}]`);
        }
      }
    }
  }

  main();
</script>

How can I get continuous landmarks in the prediction output for the video?

Here is the error (screenshot omitted).

I am updating my previous answer. You do not want to use setInterval as I did in my previous solution: when I ran it for more than a few minutes, it filled up my GPU memory and crashed WebGL. I was able to comb through the developers' demo.js file and found a solution. In your js file, replace your main() function with the following code:



const state = {
  backend: 'webgl'
};

let model;
let rafID;

async function main() {
  await tf.setBackend(state.backend);
  model = await handpose.load();

  landmarksRealTime(video);
}

const landmarksRealTime = async (video) => {
  async function frameLandmarks() {
    const predictions = await model.estimateHands(video);

    if (predictions.length > 0) {
      const result = predictions[0].landmarks;
      console.log(result, predictions[0].annotations);
    }
    // Schedule the next estimate for the next rendered frame.
    rafID = requestAnimationFrame(frameLandmarks);
  }

  frameLandmarks();
};

video.addEventListener("loadeddata", main);
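
If you want to draw the points instead of only logging them, a minimal sketch that paints each landmark onto your existing <canvas id="canvas"> could look like this (the drawKeypoints name and the 640x480 size are my assumptions, matching your video tag):

const canvas = document.getElementById("canvas");
canvas.width = 640;   // match the video dimensions
canvas.height = 480;
const ctx = canvas.getContext("2d");

// Hypothetical helper: clear the frame and paint one dot per landmark.
function drawKeypoints(keypoints) {
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  for (const [x, y] of keypoints) { // z is ignored for 2D drawing
    ctx.beginPath();
    ctx.arc(x, y, 3, 0, 2 * Math.PI);
    ctx.fill();
  }
}

Call drawKeypoints(result) next to the console.log inside frameLandmarks, and call cancelAnimationFrame(rafID) whenever you want to stop the loop.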

This logs continuous landmarks to the console. If no hand is detected, it logs nothing. Also, the developers appear to have updated the recommended script tags a few days ago, so I suggest updating your index.html file. They should be:

<!-- Require the peer dependencies of handpose. -->
<script src="https://unpkg.com/@tensorflow/tfjs-core@2.1.0/dist/tf-core.js"></script>
<script src="https://unpkg.com/@tensorflow/tfjs-converter@2.1.0/dist/tf-converter.js"></script>

<!-- You must explicitly require a TF.js backend if you're not using the tf.js union bundle. -->
<script src="https://unpkg.com/@tensorflow/tfjs-backend-webgl@2.1.0/dist/tf-backend-webgl.js"></script>
<!-- Alternatively you can use the WASM backend: <script src="https://unpkg.com/@tensorflow/tfjs-backend-wasm@2.1.0/dist/tf-backend-wasm.js"></script> -->

<script src="https://unpkg.com/@tensorflow-models/handpose@0.0.6/dist/handpose.js"></script></head>