Different frame rate calculation methods produce very different results

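As part of a gesture recognition system based on OpenCV and MediaPipe (from Google), I investigated the frame rate that could be achieved. I first used the code in Method 1 (partly from a YouTube video, partly from the MediaPipe sample code). It uses a frame counter and the total elapsed time to calculate the frame rate. The code in Method 2 takes a slightly different approach: start and end times are used to determine the time taken to process a frame, which gives the frame rate. The instantaneous frame rate is then appended to a list, which is averaged to give the mean frame rate (over 10 frames).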
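
The two calculations described above can be sketched on a plain list of per-frame durations, independent of the camera. The function names and the sample durations below are illustrative assumptions, not part of the original code or measured data.

```python
def cumulative_fps(durations):
    """Method 1 style: frames counted so far divided by total elapsed time."""
    return len(durations) / sum(durations)


def mean_instantaneous_fps(durations, window=10):
    """Method 2 style: average of 1/duration over the most recent frames."""
    recent = durations[-window:]
    return sum(1.0 / d for d in recent) / len(recent)


# Hypothetical per-frame durations in seconds (not measured data).
steady = [1 / 30] * 20          # every frame takes the same time
uneven = [0.02, 0.05] * 10      # frame times alternate between fast and slow

print(cumulative_fps(steady), mean_instantaneous_fps(steady))
print(cumulative_fps(uneven), mean_instantaneous_fps(uneven))
```

With steady frame times both formulas give about 30. With uneven frame times the mean of reciprocals reads higher (about 35 versus about 28.6 here), but not by a factor of two or three, so the formula alone is unlikely to account for the gap reported below. What each method includes in its timed interval matters more.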

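On my system, Method 1 reaches a frame rate of 30fps (after some settling time, say 5-10 seconds). Method 2 reaches 100-110fps when the if results.multi_hand_landmarks line evaluates to false. When if results.multi_hand_landmarks is true, the frame rate drops to about 60fps (double that of Method 1).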

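The frame rate in Method 1 does not change depending on if results.multi_hand_landmarks, but it does decrease when the value in cv2.waitKey(5) is increased. Method 2 behaves differently when the same value is increased: as the value goes up, the frame rate rises until it reaches a maximum, and then falls as the wait time is increased further.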

Based on the camera's specification (see below), I suspect the correct frame rate is 30fps, but that does not explain the values from Method 2.

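I have eliminated other possible influences on the frame rate in both methods, and it quickly became clear (to me at least) that the difference comes from the method used to calculate the frame rate. So I am asking whether anyone can shed light on why the two methods give different results. The logic of each method seems valid to me, but I must be missing something.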

Windows 10; Python 3.7.9; OpenCV 4.5.4; MediaPipe 0.8.9; CPU only (no GPU); webcam: Logitech C920 (30fps @ 720p/1080p)

Method 1

import cv2
import mediapipe as mp
import numpy as np
import sys
import time

def main():
    cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
    
    # start processing the video feed
    capture(cap)

def capture(cap):
    mpHands = mp.solutions.hands
    hands = mpHands.Hands(min_detection_confidence=0.91, min_tracking_confidence=0.91)

    # used for displaying hand and other information in separate window
    mpDraw = mp.solutions.drawing_utils

    # initialize time and frame count variables
    last_time = time.time()
    frames = 0

    while True:
        # blocks until the entire frame is read
        success, img = cap.read()

        # mirror the frame and convert BGR to RGB for MediaPipe
        img = cv2.cvtColor(cv2.flip(img,1), cv2.COLOR_BGR2RGB)
        
        # process the image
        results = hands.process(img)        
        
        img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
        
        # if results and landmarks exist process as needed
        if results.multi_hand_landmarks:
            for handLms in results.multi_hand_landmarks:
                # used for displaying hand and landmarks in separate window
                mpDraw.draw_landmarks(img, handLms, mpHands.HAND_CONNECTIONS)
                # Other code goes here to process landmarks

        # compute average fps: frames processed / time elapsed since the start
        frames += 1
        delta_time = time.time() - last_time
        cur_fps = np.around(frames / delta_time, 1)

        # used for displaying hand and other information in separate window
        cv2.putText(img, 'FPS: ' + str(cur_fps), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2, cv2.LINE_AA)
        cv2.imshow("Image", img)
        
        if cv2.waitKey(5) & 0xFF == 27:
            break

if __name__ == "__main__":
    main()

Method 2

import cv2
import mediapipe as mp
import numpy as np
import sys
import time

def main():
    cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
    
    # start processing the video feed
    capture(cap)

def capture(cap):
    mpHands = mp.solutions.hands
    hands = mpHands.Hands(min_detection_confidence=0.91, min_tracking_confidence=0.91)

    # used for displaying hand and other information in separate window
    mpDraw = mp.solutions.drawing_utils
    
    # Initialise list to hold instantaneous frame rates
    fps = [0]

    while True:        
        # start time
        start = time.time()
        
        # blocks until the entire frame is read
        success, image = cap.read()

        # convert BGR to RGB for MediaPipe
        imageRGB = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        
        results = hands.process(imageRGB)
        
        # if results and landmarks exist process as needed
        if results.multi_hand_landmarks:
            for handLms in results.multi_hand_landmarks:
                # used for displaying hand and landmarks in separate window
                mpDraw.draw_landmarks(image, handLms, mpHands.HAND_CONNECTIONS)
                # Other code goes here to process landmarks
        
        # Define end time
        end = time.time()
        
        # Elapsed time between frames
        elapsed_time = (end - start)
        
        if elapsed_time != 0:
            # Calculate the current instantaneous frame rate
            cur_fps = 1/elapsed_time
            # Append to end of list
            fps.append(cur_fps)
            # Maintain length of list
            if len(fps) == 10:
                del fps[0]
        
        # Calculate the average frame rate
        ave_fps = np.around(sum(fps)/len(fps))

        # used for displaying hand and other information in separate window
        cv2.putText(image, 'FPS: ' + str(ave_fps), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2, cv2.LINE_AA)
        cv2.imshow("Image", image)
        
        if cv2.waitKey(35) & 0xFF == 27:
            break
            
    cap.release()
        
if __name__ == "__main__":
    main()

#1 and #2 give different results, as you would expect.

In #1, these lines are also counted in delta_time:

        cur_fps = np.around(frames / delta_time, 1)

        # used for displaying hand and other information in separate window
        cv2.putText(img, 'FPS: ' + str(cur_fps), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2, cv2.LINE_AA)
        cv2.imshow("Image", img)
        
        if cv2.waitKey(5) & 0xFF == 27:
            break

This makes your FPS figure more valid, because it covers the whole loop time.
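
If a short-term average is still wanted, one option is to time from one loop iteration to the next so that display and waitKey are included. This is a minimal sketch, not the asker's code: LoopFpsMeter, its tick method and the now parameter are hypothetical names introduced here, and the timestamps in the demo are synthetic.

```python
import time
from collections import deque


class LoopFpsMeter:
    """Hypothetical helper: frame rate over the last `window` full loop iterations."""

    def __init__(self, window=10):
        # Keep window + 1 timestamps so that they span `window` intervals.
        self.stamps = deque(maxlen=window + 1)

    def tick(self, now=None):
        """Call once per loop iteration; returns the current FPS estimate."""
        self.stamps.append(time.perf_counter() if now is None else now)
        if len(self.stamps) < 2:
            return 0.0
        elapsed = self.stamps[-1] - self.stamps[0]
        return (len(self.stamps) - 1) / elapsed if elapsed > 0 else 0.0


# Demo with synthetic timestamps spaced 1/30 s apart (no camera needed).
meter = LoopFpsMeter(window=10)
for i in range(30):
    fps = meter.tick(now=i / 30)
print(round(fps, 1))
```

In a capture loop, tick() would be called once per iteration with no argument, for example right after cap.read(). Because it measures timestamp to timestamp, everything in the loop body is counted.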

In #2, you are not calculating the full FPS, because the lines above are not included in the timing.
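
A small simulation can illustrate why leaving those lines out inflates the number. It rests on assumptions that are not confirmed by the post: that cap.read() blocks until the camera delivers its next frame, that the camera delivers one frame every 1/30 s, and that processing takes a fixed 8 ms. The simulate function and all timings are hypothetical.

```python
def simulate(wait_s, process_s=0.008, camera_fps=30.0, frames=100):
    """Return (partial_fps, full_loop_fps) for a simulated capture loop."""
    period = 1.0 / camera_fps
    t = 0.0                      # simulated clock, in seconds
    next_frame = period          # time at which the next frame becomes available
    timed = 0.0                  # total of the read + process intervals (as in #2)
    for _ in range(frames):
        start = t
        t = max(t, next_frame)   # assumed: read blocks until a frame is ready
        next_frame = t + period
        t += process_s           # hand detection and drawing
        timed += t - start       # interval between start and end in #2
        t += wait_s              # imshow + waitKey, outside the timer in #2
    return frames / timed, frames / t


for wait_ms in (5, 20):
    partial, full = simulate(wait_ms / 1000)
    print(wait_ms, round(partial), round(full))
```

Under these assumptions the full-loop rate stays close to the camera's 30fps, while the partial figure rises as the wait grows, because time spent in waitKey is time the loop no longer spends blocked inside the timed read. The sketch reproduces only that rise; it does not model the drop seen at still longer waits.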