Different frame rate calculation methods produce very different results

Question

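As part of a hand gesture recognition system based on OpenCV and MediaPipe (from Google), I investigated the achievable frame rate. I first used the code in Method 1 (partly from a YouTube video, partly from the MediaPipe sample code), which uses a frame counter and the total elapsed time to calculate the frame rate. The code in Method 2 takes a slightly different approach: start and end times are used to determine the time taken to process one frame, which gives an instantaneous frame rate. That instantaneous frame rate is then appended to a list, which is averaged to give the average frame rate (over 10 frames).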

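On my system, Method 1 reaches a frame rate of 30 fps (after some settling time, say 5-10 seconds). Method 2 reaches 100-110 fps when the if results.multi_hand_landmarks line is false. When if results.multi_hand_landmarks is true, the frame rate drops to about 60 fps (double that of Method 1).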

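Method 1's frame rate does not change depending on if results.multi_hand_landmarks, but it does drop when the value in cv2.waitKey(5) is increased. Method 2 behaves differently when the same value is increased: as the value grows, the frame rate rises until it reaches a maximum, and then falls as the wait time is increased further.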

Given the camera's specification (see below), I suspect the correct frame rate is 30 fps, but that does not explain the values from Method 2.

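I have eliminated other possible influences on the frame rate for both methods, and it quickly became clear (to me at least) that the difference comes from the method used to calculate the frame rate. So I am asking whether anyone can shed light on why the two methods produce different results. The logic of each method seems valid to me, but I must be missing something.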

Windows 10; Python 3.7.9; OpenCV 4.5.4; MediaPipe 0.8.9; CPU only (no GPU); webcam: Logitech C920 (30 fps @ 720p/1080p)

Method 1

import cv2
import mediapipe as mp
import numpy as np
import sys
import time

def main():
    cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
    
    # start processing the video feed
    capture(cap)

def capture(cap):
    mpHands = mp.solutions.hands
    hands = mpHands.Hands(min_detection_confidence=0.91, min_tracking_confidence=0.91)

    # used for displaying hand and other information in separate window
    mpDraw = mp.solutions.drawing_utils

    # initialize time and frame count variables
    last_time = time.time()
    frames = 0

    while True:
        # blocks until the entire frame is read
        success, img = cap.read()

        # mirror the frame and convert BGR to RGB, which MediaPipe expects
        img = cv2.cvtColor(cv2.flip(img,1), cv2.COLOR_BGR2RGB)
        
        # process the image
        results = hands.process(img)        
        
        img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
        
        # if results and landmarks exist process as needed
        if results.multi_hand_landmarks:
            for handLms in results.multi_hand_landmarks:
                # used for displaying hand and landmarks in separate window
                mpDraw.draw_landmarks(img, handLms, mpHands.HAND_CONNECTIONS)
                # Other code goes here to process landmarks

        # compute average fps since start: frames / (current_time - last_time)
        frames += 1
        delta_time = time.time() - last_time
        cur_fps = np.around(frames / delta_time, 1)

        # used for displaying hand and other information in separate window
        cv2.putText(img, 'FPS: ' + str(cur_fps), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2, cv2.LINE_AA)
        cv2.imshow("Image", img)
        
        if cv2.waitKey(5) & 0xFF == 27:
            break

    cap.release()

if __name__ == "__main__":
    main()

Method 2

import cv2
import mediapipe as mp
import numpy as np
import sys
import time

def main():
    cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
    
    # start processing the video feed
    capture(cap)

def capture(cap):
    mpHands = mp.solutions.hands
    hands = mpHands.Hands(min_detection_confidence=0.91, min_tracking_confidence=0.91)

    # used for displaying hand and other information in separate window
    mpDraw = mp.solutions.drawing_utils
    
    # Initialise list to hold instantaneous frame rates
    fps = [0]

    while True:        
        # start time
        start = time.time()
        
        # blocks until the entire frame is read
        success, image = cap.read()

        # convert BGR to RGB, which MediaPipe expects
        imageRGB = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        
        results = hands.process(imageRGB)
        
        # if results and landmarks exist process as needed
        if results.multi_hand_landmarks:
            for handLms in results.multi_hand_landmarks:
                # used for displaying hand and landmarks in separate window
                mpDraw.draw_landmarks(image, handLms, mpHands.HAND_CONNECTIONS)
                # Other code goes here to process landmarks
        
        # Define end time
        end = time.time()
        
        # Elapsed time between frames
        elapsed_time = (end - start)
        
        if elapsed_time != 0:
            # Calculate the current instantaneous frame rate
            cur_fps = 1/elapsed_time
            # Append to end of list
            fps.append(cur_fps)
            # Maintain length of list
            if len(fps) == 10:
                del fps[0]
        
        # Calculate the average frame rate
        ave_fps = np.around(sum(fps)/len(fps))

        # used for displaying hand and other information in separate window
        cv2.putText(image, 'FPS: ' + str(ave_fps), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2, cv2.LINE_AA)
        cv2.imshow("Image", image)
        
        if cv2.waitKey(35) & 0xFF == 27:
            break
            
    cap.release()
        
if __name__ == "__main__":
    main()

Answer 1

Of course #1 and #2 give different results.

In #1, these lines are also counted in delta_time:

        cur_fps = np.around(frames / delta_time, 1)

        # used for displaying hand and other information in separate window
        cv2.putText(img, 'FPS: ' + str(cur_fps), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2, cv2.LINE_AA)
        cv2.imshow("Image", img)
        
        if cv2.waitKey(5) & 0xFF == 27:
            break

This makes your FPS figure more valid, because it covers the entire loop time.
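One way to measure the entire loop while still getting a short rolling average is to take a single timestamp per iteration and divide the number of intervals by the time they span. The following is only a minimal sketch: the class name LoopFpsMeter, its window parameter, and the injectable clock are hypothetical and not part of the original code.

```python
import time
from collections import deque


class LoopFpsMeter:
    """Rolling FPS estimate from one timestamp per loop iteration (hypothetical helper)."""

    def __init__(self, window=10, clock=time.perf_counter):
        # Keep window + 1 timestamps so that `window` full intervals are covered.
        self._stamps = deque(maxlen=window + 1)
        self._clock = clock

    def tick(self):
        """Call once per iteration; returns frames per second, or 0.0 if not yet known."""
        self._stamps.append(self._clock())
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        if elapsed <= 0:
            return 0.0
        # Intervals divided by the time they span: everything in the loop is included.
        return (len(self._stamps) - 1) / elapsed
```

In a capture loop this would be used by creating meter = LoopFpsMeter() before the while loop and calling meter.tick() once at the same point in every iteration. Because consecutive timestamps bracket the whole iteration, the time spent in cv2.imshow and cv2.waitKey is counted as well.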

In #2, you are not measuring the full FPS, because the lines above are not included in the timed section.
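The size of the effect can be illustrated with plain arithmetic and no camera. In the sketch below the per-frame durations are made-up values chosen only for illustration, and the function name simulated_fps is hypothetical; it compares the rate over the whole loop with the rate implied by the timed section alone.

```python
def simulated_fps(process_s, display_wait_s, n_frames=100):
    """Return (whole_loop_fps, partial_fps) for a loop with fixed stage durations.

    process_s: assumed time per frame for cap.read() plus hands.process()
    display_wait_s: assumed time per frame for putText, imshow and waitKey
    """
    total_time = n_frames * (process_s + display_wait_s)
    whole_loop_fps = n_frames / total_time  # Method 1 style: frames / total elapsed time
    partial_fps = 1.0 / process_s           # Method 2 style: only the timed section
    return whole_loop_fps, partial_fps


# Illustrative numbers only: 10 ms inside the timed section, 23.3 ms outside it.
whole, partial = simulated_fps(process_s=0.010, display_wait_s=0.0233)
print(f"whole loop: {whole:.1f} fps, timed section only: {partial:.1f} fps")
# prints: whole loop: 30.0 fps, timed section only: 100.0 fps
```

With these assumed durations the loop really runs at about 30 iterations per second, yet timing only the first part reports 100 fps, which is the same kind of gap described in the question.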

python

opencv

frame-rate

mediapipe