Using python mss to draw bounding box on top of screen record

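I have code for screen recording, and for each frame I have a set of bounding boxes that I want to display on that frame. I could do this with matplotlib or something similar, but mss runs at about 30 fps and I need to be able to display the bounding boxes quickly.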

I noticed this example in the docs, but when I tried running it I couldn't get it to display anything. I'm also not sure it would even work for my case.

import cv2
import time
import numpy as np
from mss import mss

with mss() as sct:
        # Part of the screen to capture
        monitor = {"top": 79, "left": 265, "width": 905, "height": 586}

        while "Screen capturing":
            last_time = time.time()

            # Get raw pixels from the screen, save it to a Numpy array
            screen = np.array(sct.grab(monitor))

            # print("fps: {}".format(1 / (time.time() - last_time)))

            print('loop took {} seconds'.format(time.time()-last_time))
            last_time = time.time()
            screen = cv2.cvtColor(screen, cv2.COLOR_BGR2RGB)
            screen = cv2.resize(screen, (224,224)).astype(np.float32)/255

            # Display the picture
            cv2.imshow("OpenCV/Numpy normal", screen)

            # Press "q" to quit
            if cv2.waitKey(25) & 0xFF == ord("q"):
                cv2.destroyAllWindows()
                break

Now say I have a set of bounding boxes to display on each frame, for example,

bboxes = [np.array([12, 16, 29, 25]), np.array([5,  5, 38, 35])]

Can I alter the pixels in some way to display these? I think I could also do it through opencv, since that is what ultimately displays the screen.

EDIT: In reference to the comment about the bounding boxes, they are x1, y1, width, height, and they are in the resized (224,224) image.

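Some details are missing:

  • What is the format of the bounding boxes: [x1, y1, x2, y2], [x1, y1, width, height], or something else?
  • Are the bounding box values in the resized (224, 224) coordinates or in the original frame's coordinates?

Either way, you can use the function below to draw the rectangles (pick the call that matches your format):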

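def draw_bboxes(img, bboxes, color=(0, 0, 255), thickness=1):
    # Draws on img in place. Keep only the cv2.rectangle call that matches
    # your bbox format; leaving both active would draw two boxes per bbox.
    for bbox in bboxes:
        # if [x1, y1, x2, y2]
        # cv2.rectangle(img, tuple(bbox[:2]), tuple(bbox[-2:]), color, thickness)
        # if [x1, y1, width, height]
        cv2.rectangle(img, tuple(bbox[:2]), tuple(bbox[:2]+bbox[-2:]), color, thickness)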
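
Since the question asks whether the pixels can be changed directly, here is a minimal sketch of that idea using plain NumPy slicing instead of `cv2.rectangle`. The helper name `draw_bbox_numpy` is made up for illustration, and it assumes `[x1, y1, width, height]` boxes that lie fully inside an H x W x 3 image. It colors columns `x1` through `x1 + width - 1`, so its edges may differ from OpenCV's by a pixel.

```python
import numpy as np

def draw_bbox_numpy(img, bbox, color=(0, 0, 255), thickness=1):
    """Draw the outline of an [x1, y1, width, height] box on img in place.

    Hypothetical helper, not part of OpenCV or mss. Assumes img is an
    H x W x 3 array and the box lies inside the image.
    """
    x, y, w, h = (int(v) for v in bbox)
    t = thickness
    img[y:y + t, x:x + w] = color          # top edge
    img[y + h - t:y + h, x:x + w] = color  # bottom edge
    img[y:y + h, x:x + t] = color          # left edge
    img[y:y + h, x + w - t:x + w] = color  # right edge
    return img

# Blank stand-in for a resized 224x224 frame (BGR channel order, as OpenCV expects)
frame = np.zeros((224, 224, 3), dtype=np.uint8)
for bbox in [np.array([12, 16, 29, 25]), np.array([5, 5, 38, 35])]:
    draw_bbox_numpy(frame, bbox)
```

In practice `cv2.rectangle` is likely the simpler choice, since it also clips boxes that extend past the image border.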

Assuming you have defined bboxes, you can call the function:

  • If you want to draw on the original frame:
# [...]
screen = np.array(sct.grab(monitor))
draw_bboxes(screen, bboxes)
# [...]
  • If you want to draw on the resized frame:
# [...]
screen = cv2.resize(screen, (224,224)).astype(np.float32)/255
draw_bboxes(screen, bboxes)
# [...]
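
Note that the boxes must be in the same coordinate space as the frame they are drawn on. If the boxes were computed on the (224, 224) image but you want them on the original capture, they need rescaling first. A rough sketch, where `scale_bboxes` is a hypothetical helper and sizes are given as (width, height):

```python
import numpy as np

def scale_bboxes(bboxes, from_size, to_size):
    """Rescale boxes from one image size to another.

    Hypothetical helper. Sizes are (width, height) tuples. Because the
    scaling is linear, it works for both [x1, y1, x2, y2] and
    [x1, y1, width, height] boxes.
    """
    sx = to_size[0] / from_size[0]
    sy = to_size[1] / from_size[1]
    factors = np.array([sx, sy, sx, sy])
    # Round to the nearest pixel and return integer arrays
    return [np.rint(np.asarray(b) * factors).astype(int) for b in bboxes]

bboxes = [np.array([12, 16, 29, 25]), np.array([5, 5, 38, 35])]
# From the resized 224x224 image back to the 905x586 capture region
full_res = scale_bboxes(bboxes, from_size=(224, 224), to_size=(905, 586))
```

The rescaled boxes can then be passed to `draw_bboxes` together with the unresized frame.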

With a few changes, the full code looks like this:

import cv2
import time
import numpy as np
from mss import mss

def draw_bboxes(img, bboxes, color=(0, 0, 255), thickness=1):
    for bbox in bboxes:
        cv2.rectangle(img, tuple(bbox[:2]), tuple(bbox[:2]+bbox[-2:]), color, thickness)

# bounding boxes
bboxes = [np.array([12, 16, 29, 25]), np.array([5,  5, 38, 35])]

with mss() as sct:
    # part of the screen to capture
    monitor = {"top": 79, "left": 265, "width": 905, "height": 586}
    while "Screen capturing":
        # get screen
        last_time = time.time()
        screen = np.asarray(sct.grab(monitor))
        print('loop took {} seconds'.format(time.time()-last_time))

        # convert from BGRA --> BGR
        screen = cv2.cvtColor(screen, cv2.COLOR_BGRA2BGR)
        # resize and draw bboxes
        screen = cv2.resize(screen, (224,224))
        draw_bboxes(screen, bboxes)

        # display
        cv2.imshow("OpenCV/Numpy normal", screen)

        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break

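The output will look like this: the captured region resized to 224x224, with the two bounding boxes drawn in red.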
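
The `print` in the loop times only the `sct.grab` call. To check whether the whole loop (grab, convert, resize, draw, display) keeps up with the roughly 30 fps mentioned in the question, one option is a small rolling FPS counter. This is only a sketch: `FpsMeter` is a made-up helper, and the optional `now` argument exists just so it can be fed explicit timestamps.

```python
import time
from collections import deque

class FpsMeter:
    """Average frames per second over the last `window` frames (hypothetical helper)."""

    def __init__(self, window=30):
        self.stamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one frame and return the FPS estimate (0.0 until two frames are seen)."""
        self.stamps.append(time.perf_counter() if now is None else now)
        if len(self.stamps) < 2:
            return 0.0
        elapsed = self.stamps[-1] - self.stamps[0]
        # N timestamps span N - 1 frame intervals
        return (len(self.stamps) - 1) / elapsed if elapsed > 0 else 0.0
```

Calling `meter.tick()` once per iteration, for example right after `cv2.imshow`, gives a smoothed estimate. Keep in mind that `cv2.waitKey(25)` waits up to 25 ms for a key press on every iteration, so it counts toward the measured loop time.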