Using python mss to draw bounding boxes on top of a screen recording

Question

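I have code for screen recording, and for each frame I have a set of bounding boxes that I want to display on that frame. I could do this with matplotlib or something similar, but mss runs at about 30 fps and I need to be able to display the bounding boxes quickly.

I noticed this example in the documentation, but when I tried running it I could not get it to display anything. I am also not sure whether it even applies to my case.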

import cv2
import time
import numpy as np
from mss import mss

with mss() as sct:
    # Part of the screen to capture
    monitor = {"top": 79, "left": 265, "width": 905, "height": 586}

    while "Screen capturing":
        last_time = time.time()

        # Get raw pixels from the screen, save it to a Numpy array
        screen = np.array(sct.grab(monitor))

        # print("fps: {}".format(1 / (time.time() - last_time)))

        print('loop took {} seconds'.format(time.time()-last_time))
        last_time = time.time()
        screen = cv2.cvtColor(screen, cv2.COLOR_BGR2RGB)
        screen = cv2.resize(screen, (224,224)).astype(np.float32)/255

        # Display the picture
        cv2.imshow("OpenCV/Numpy normal", screen)

        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break

Now say I have a set of bounding boxes to display on each frame, for example,

bboxes = [np.array([12, 16, 29, 25]), np.array([5,  5, 38, 35])]

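Can I somehow alter the pixels to display them? I suppose I could also do it through OpenCV, since that is what ultimately displays the screen.

EDIT: In reference to the comment about the bounding boxes, they are x1, y1, width, height, and they are in the resized (224, 224) image.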

Answer 1

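Some details are missing:

What is the format of the bounding boxes: [x1, y1, x2, y2], [x1, y1, width, height], or something else?
Are the bounding box values in the resized (224, 224) dimensions or in the original dimensions?

Anyway, you can use the function below to draw the rectangles (keep only the call that matches your format):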

def draw_bboxes(img, bboxes, color=(0, 0, 255), thickness=1):
    for bbox in bboxes:
        # cast to plain Python ints so cv2.rectangle receives integer point tuples
        a, b, c, d = (int(v) for v in bbox)
        # if [x1, y1, x2, y2], use this call instead:
        # cv2.rectangle(img, (a, b), (c, d), color, thickness)
        # if [x1, y1, width, height]
        cv2.rectangle(img, (a, b), (a + c, b + d), color, thickness)
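
If the format is not fixed in advance, one option is to normalize every box to a single format before drawing. The following is a minimal sketch using only NumPy; the helper names `xywh_to_xyxy` and `xyxy_to_xywh` are made up for illustration and are not part of OpenCV or mss.

```python
import numpy as np

def xywh_to_xyxy(bboxes):
    """Convert [x1, y1, width, height] boxes to [x1, y1, x2, y2]."""
    boxes = np.asarray(bboxes, dtype=np.int64).reshape(-1, 4).copy()
    boxes[:, 2:] += boxes[:, :2]  # x2 = x1 + width, y2 = y1 + height
    return boxes

def xyxy_to_xywh(bboxes):
    """Convert [x1, y1, x2, y2] boxes to [x1, y1, width, height]."""
    boxes = np.asarray(bboxes, dtype=np.int64).reshape(-1, 4).copy()
    boxes[:, 2:] -= boxes[:, :2]  # width = x2 - x1, height = y2 - y1
    return boxes

# The boxes from the question, given as [x1, y1, width, height]
bboxes = [np.array([12, 16, 29, 25]), np.array([5, 5, 38, 35])]
corners = xywh_to_xyxy(bboxes)
```

Each row of `corners` can then be passed to the `[x1, y1, x2, y2]` variant of the drawing call.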

Assuming you have bboxes defined, you can call the function:

If you want to draw on the original frame:

# [...]
screen = np.array(sct.grab(monitor))
draw_bboxes(screen, bboxes)
# [...]
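
Since the boxes in the question are expressed in the resized (224, 224) image, drawing them on the original frame would require rescaling them first. Here is a rough sketch of that step; `scale_bboxes` is a hypothetical helper, and sizes are assumed to be given as (width, height).

```python
import numpy as np

def scale_bboxes(bboxes, from_size, to_size):
    """Rescale boxes between two image sizes given as (width, height).

    Works for both [x1, y1, x2, y2] and [x1, y1, width, height], because
    positions 0 and 2 are x-like and positions 1 and 3 are y-like in both.
    """
    sx = to_size[0] / from_size[0]
    sy = to_size[1] / from_size[1]
    boxes = np.asarray(bboxes, dtype=np.float64).reshape(-1, 4)
    scaled = boxes * np.array([sx, sy, sx, sy])
    return np.rint(scaled).astype(int)  # round to the nearest pixel

bboxes = [np.array([12, 16, 29, 25]), np.array([5, 5, 38, 35])]
# 905 x 586 is the capture region used in the question
full_size_bboxes = scale_bboxes(bboxes, (224, 224), (905, 586))
```

In the capture loop it is probably safer to take the target size from the frame itself (`h, w = screen.shape[:2]`) than from the `monitor` dictionary, in case the grabbed image does not have exactly the requested dimensions.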

If you want to draw on the resized frame:

# [...]
screen = cv2.resize(screen, (224,224)).astype(np.float32)/255
draw_bboxes(screen, bboxes)
# [...]
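
The question also asks whether the pixels can be changed directly. As an alternative to cv2.rectangle, the outlines can be written with NumPy slicing. This is only a sketch under some assumptions: `draw_bboxes_numpy` is a made-up name, the frame has 3 channels, the boxes are [x1, y1, width, height], and because the resized frame here is float32 in the range [0, 1], the colour is given in that range too.

```python
import numpy as np

def draw_bboxes_numpy(img, bboxes, color=(0.0, 0.0, 1.0), thickness=1):
    """Draw [x1, y1, width, height] outlines in place by assigning pixels."""
    img_h, img_w = img.shape[:2]
    t = max(int(thickness), 1)  # edges grow inward from the box border
    for x, y, w, h in bboxes:
        # Clip both corners to the image so slices never use negative indices
        x1, y1 = max(int(x), 0), max(int(y), 0)
        x2, y2 = min(int(x + w), img_w - 1), min(int(y + h), img_h - 1)
        if x1 > x2 or y1 > y2:
            continue  # box lies entirely outside the image
        img[y1:min(y1 + t, y2 + 1), x1:x2 + 1] = color      # top edge
        img[max(y2 - t + 1, y1):y2 + 1, x1:x2 + 1] = color  # bottom edge
        img[y1:y2 + 1, x1:min(x1 + t, x2 + 1)] = color      # left edge
        img[y1:y2 + 1, max(x2 - t + 1, x1):x2 + 1] = color  # right edge
    return img

# Stand-in for a resized, normalized frame
frame = np.zeros((224, 224, 3), dtype=np.float32)
bboxes = [np.array([12, 16, 29, 25]), np.array([5, 5, 38, 35])]
draw_bboxes_numpy(frame, bboxes)
```

For a handful of boxes the difference in speed compared with cv2.rectangle is unlikely to matter; the main appeal is that it has no dependency beyond NumPy.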

After a few changes, the full code looks like this:

import cv2
import time
import numpy as np
from mss import mss

def draw_bboxes(img, bboxes, color=(0, 0, 255), thickness=1):
    for bbox in bboxes:
        # boxes are [x1, y1, width, height]; cast to plain ints for cv2.rectangle
        x, y, w, h = (int(v) for v in bbox)
        cv2.rectangle(img, (x, y), (x + w, y + h), color, thickness)

# bounding boxes
bboxes = [np.array([12, 16, 29, 25]), np.array([5,  5, 38, 35])]

with mss() as sct:
    # part of the screen to capture
    monitor = {"top": 79, "left": 265, "width": 905, "height": 586}
    while "Screen capturing":
        # get screen
        last_time = time.time()
        screen = np.asarray(sct.grab(monitor))
        print('loop took {} seconds'.format(time.time()-last_time))

        # convert from BGRA --> BGR
        screen = cv2.cvtColor(screen, cv2.COLOR_BGRA2BGR)
        # resize and draw bboxes
        screen = cv2.resize(screen, (224,224))
        draw_bboxes(screen, bboxes)

        # display
        cv2.imshow("OpenCV/Numpy normal", screen)

        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break
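
The timing print in the loop above only covers the sct.grab call. To check whether the whole loop, drawing and display included, keeps up with roughly 30 fps, a rolling average over recent iterations may be more informative. The class below is a small sketch using only the standard library; `FPSCounter` and its `tick` method are illustrative names, not part of mss or OpenCV.

```python
import time
from collections import deque

class FPSCounter:
    """Rolling frames-per-second estimate over the last `window` ticks."""

    def __init__(self, window=30):
        self.stamps = deque(maxlen=window)

    def tick(self, now=None):
        # `now` can be injected for testing; otherwise use a monotonic clock
        self.stamps.append(time.perf_counter() if now is None else now)
        if len(self.stamps) < 2:
            return 0.0  # not enough samples yet
        elapsed = self.stamps[-1] - self.stamps[0]
        # N timestamps span N - 1 frame intervals
        return (len(self.stamps) - 1) / elapsed if elapsed > 0 else 0.0

counter = FPSCounter(window=30)
```

Calling `fps = counter.tick()` once per iteration, for example right after cv2.imshow, gives the current estimate. Note that cv2.waitKey(25) can itself wait up to 25 ms per iteration, so a smaller delay such as cv2.waitKey(1) may be worth trying if the measured rate is too low.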

The output will look like this:

[Image: bounding boxes drawn on top of the screen capture]

Tags: python, opencv, python-mss