Python OpenCV - while saving video based on a specific object condition not all such frames are getting saved

Question

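I am using OpenCV in Python and trying to record/save only those frames from a video in which a specific type of object/label is present, e.g. 'umbrella'.

Issue:

It correctly starts saving frames from the first instance where it finds the mentioned object/label in the frame, but if the object/label is absent for the next few frames and reappears only a few frames later, those later frames are not saved to the mp4 file I am writing.

It saves only the first run of consecutive frames containing the mentioned object and does not save the later ones.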

After reading suggestions from this link I edited code by putting frame writing steps within a for-loop as shown below:

The piece of code I tried to improvise:

# saving video frame by frame             
for frame_numb in range(total_frames):                
    if i == '':
        pass
    else:
        if "umbrella" in label:
            print("umbrella in labels")

            # Issue causing part where I may need some change
            out_vid.write(frame[frame_numb])

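Result of the above code modification:

It creates only a 256 KB file, and the file cannot be opened; nothing is written to it.

If I make the following change to the code instead, it saves only the first frame of the video that satisfies the condition and shows that same frame for the entire duration: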

    # saving video frame by frame             
    for frame_numb in range(total_frames):                
        if i == '':
            pass
        else:
            if "umbrella" in label:
                print("umbrella in labels")

                # Issue causing part where I may need some change
                out_vid.write(frame)

Sharing a larger chunk of the code below for reference:

def vid_objects_detection(type=0, confidence_threshold=0.5, image_quality=416):

    classes = []

    # reading category names from coco text file and inserting in classes list
    with open("coco.names", "r") as f:
        classes = [line.strip() for line in f.readlines()]

    net = cv2.dnn.readNet("yolov3-tiny.weights", "yolov3-tiny.cfg") # using tiny versions of weights & config file

    layer_names = net.getLayerNames()    
    output_layers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]

    # Loading video
    cap = cv2.VideoCapture(type)  # use 0 for webcam   

    _, frame = cap.read()
    height, width, channels = frame.shape

    # providing codec for writing frames to video 
    fourcc = cv2.VideoWriter_fourcc(*'MP4V')

    # Write video with name & size. Should be of same size(width, height) as original video
    out_vid = cv2.VideoWriter('obj_detect4_'+str(type), fourcc, 20.0, (width,height))

    font = cv2.FONT_HERSHEY_COMPLEX_SMALL 
    starting_time = time.time()
    frame_id = 0

    while True:
        _, frame = cap.read()

        frame_id +=1

        total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        height, width, channels = frame.shape       

        blob = cv2.dnn.blobFromImage(frame, 0.00392, (image_quality, image_quality), (0, 0, 0), True, crop=False)
        net.setInput(blob)

        outs = net.forward(output_layers)

        # For showing informations on screen
        class_ids = []
        confidences = []
        boxes = []
        for out in outs:
            for detection in out:
                # calculated scores, class_id, confidence

                if confidence > confidence_threshold:                      
                    # calculated center_x, center_y, w, h, x, y
                    boxes.append([x, y, w, h])
                    confidences.append(float(confidence))
                    class_ids.append(class_id)
                    print("confidences:", confidences)
                    print(class_ids)
                    print("boxes", boxes)

        indexes = cv2.dnn.NMSBoxes(boxes, confidences, confidence_threshold, 0.4)

        for i in range(len(boxes)):
            if i in indexes:
                x, y, w, h = boxes[i]
                label = str(classes[class_ids[i]])

        elapsed_time = time.time() - starting_time
        fps = frame_id / elapsed_time
        time_display = time.strftime("%a, %d%b%Y %H:%M:%S", time.localtime())
        cv2.putText(frame,"|FPS: " + str(round(fps,3)), (10, 40), font, 1, (0,255,0), 1)
        print(fps)

        # saving video frame by frame
        if i == '':
            pass
        else:
            if 'umbrella' in label:
                out_vid.write(frame)

        key = cv2.waitKey(5)
        if key == 27: 
            break

    cap.release()
    out_vid.release()
    cv2.destroyAllWindows()

# calling function
vid_objects_detection("walking.mp4")

I have trimmed some small calculations from the code and inserted comments in their place to reduce its length.

Answer 1

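Sometimes video codecs perform what is called keyframe compression. This means that one frame is stored in full, say every 10 frames, while all the other frames in between are stored as changes, or deltas, relative to it. In these cases, when you try to save only those in-between frames, they may fail to be saved. Saving frames does work in these cases, however, if you iterate over every frame sequentially.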
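To make the keyframe/delta idea concrete, here is a toy model in plain Python. It is only a sketch: the frames are short lists of integers, and the names `encode`, `decode_sequentially`, and `keyframe_interval` are made up for illustration. It does not reproduce how any real codec or OpenCV works internally.

```python
def encode(frames, keyframe_interval=10):
    """Toy encoder: store every Nth frame in full, the rest as
    element-wise differences from the previous frame."""
    encoded = []
    for n, frame in enumerate(frames):
        if n % keyframe_interval == 0:
            encoded.append(("key", list(frame)))
        else:
            prev = frames[n - 1]
            encoded.append(("delta", [a - b for a, b in zip(frame, prev)]))
    return encoded


def decode_sequentially(encoded):
    """Rebuild every frame by walking the stream in order,
    applying each delta to the most recently rebuilt frame."""
    frames, current = [], None
    for kind, data in encoded:
        if kind == "key":
            current = list(data)
        else:
            current = [c + d for c, d in zip(current, data)]
        frames.append(list(current))
    return frames
```

In this model a "delta" entry is meaningless on its own; it can only be turned back into a picture by starting at a keyframe and applying every delta in order, which is why walking the frames sequentially works.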

Perhaps you could comment out the line out_vid = cv2.VideoWriter('obj_detect4_'+str(type), fourcc, 20.0, (width,height)) and then try saving frames from the webcam stream according to your condition.
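One way to express "save frames according to your condition" is to rebuild the set of detected labels for every frame and write the frame only when the target label is in that set. In the question's code, `label` keeps only the last detection that survived NMS and is not reset between frames, which may be part of why some matching frames are skipped. The sketch below is an assumption-laden illustration, not a drop-in fix: `labels_in_frame` and `write_matching_frames` are hypothetical helper names, `kept_indexes` is assumed to be a flat iterable of integers (the result of `cv2.dnn.NMSBoxes` may need flattening first), and `writer` is any object with a `write(frame)` method, such as the question's `out_vid`.

```python
def labels_in_frame(class_ids, kept_indexes, classes):
    """Return the set of class names for the detections kept after NMS.

    Assumes kept_indexes is a flat iterable of ints.
    """
    kept = {int(i) for i in kept_indexes}
    return {classes[class_ids[i]] for i in range(len(class_ids)) if i in kept}


def write_matching_frames(frames_with_detections, writer, classes, target="umbrella"):
    """Write every frame whose own detections include `target`.

    frames_with_detections yields (frame, class_ids, kept_indexes) tuples.
    Returns the number of frames written.
    """
    written = 0
    for frame, class_ids, kept_indexes in frames_with_detections:
        # Recompute the labels per frame so nothing carries over
        # from the previous iteration.
        labels = labels_in_frame(class_ids, kept_indexes, classes)
        if target in labels:
            writer.write(frame)
            written += 1
    return written
```

Inside the question's `while True:` loop, the same idea amounts to collecting all labels of the current frame into a fresh set before the `out_vid.write(frame)` check, instead of testing the single `label` variable.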

