
Apply custom function on each frame of a video with variable framerate

I'm trying to apply a custom Python function to every frame of a video and produce a video built from the modified frames as output. My input is an MKV file with a variable frame rate, and I want the output to match it exactly: one frame in the input should correspond to exactly one frame in the output.

I tried using ffmpeg-python, following this example. However, the timestamp information seems to get lost in the pipe: the output video has 689 frames when the input only has 300 (the durations don't match either: 27 seconds vs the input's 11 seconds).

I also tried first processing every frame of the video and saving the transformed versions as PNGs, then "masking" the input video with the processed frames. This seemed better, since the output video has the same 11-second duration as the input, but the frame counts still don't match (313 vs 300).

Code for the python-ffmpeg solution:

import ffmpeg
import numpy as np

width = 1920
height = 1080
process1 = (
    ffmpeg
    .input(in_filename)
    .output('pipe:', format='rawvideo', pix_fmt='rgb24')
    .run_async(pipe_stdout=True)
)

process2 = (
    ffmpeg
    .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height))
    .output(out_filename, pix_fmt='yuv420p')
    .overwrite_output()
    .run_async(pipe_stdin=True)
)

while True:
    in_bytes = process1.stdout.read(width * height * 3)
    if not in_bytes:
        break
    in_frame = (
        np
        .frombuffer(in_bytes, np.uint8)
        .reshape([height, width, 3])
    )

    # Just add 1 to the pixels for the example
    out_frame = in_frame + 1

    process2.stdin.write(
        out_frame
        .astype(np.uint8)
        .tobytes()
    )

process2.stdin.close()
process1.wait()
process2.wait()
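One thing to watch with the `in_frame + 1` example: NumPy uint8 arithmetic wraps around, so a pixel at 255 becomes 0 rather than staying white. A small sketch of the difference, with `np.clip` as the saturating alternative (illustrative only, not part of the original code):

```python
import numpy as np

frame = np.array([[0, 128, 255]], dtype=np.uint8)

# uint8 addition wraps modulo 256: 255 + 1 -> 0
wrapped = frame + np.uint8(1)

# Saturating alternative: widen, add, clip, narrow back to uint8
saturated = np.clip(frame.astype(np.int16) + 1, 0, 255).astype(np.uint8)

print(wrapped.tolist())    # [[1, 129, 0]]
print(saturated.tolist())  # [[1, 129, 255]]
```

For a toy example the wraparound is harmless, but for a real transformation the saturating version is usually what you want.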

Code for the overlay solution:

ffmpeg -i in.mkv -i test/%d.png -filter_complex "[0][1]overlay=0:0" -copyts out.mkv

Is there any other approach I haven't considered for doing what I'm trying to do? It doesn't seem that complicated, but I can't find a way.

Thanks for your help!

Update:

Here are the logs of the input and output pipes for the python-ffmpeg solution.

Input

Input #0, matroska,webm, from 'in.mkv':
  Metadata:
    ENCODER         : Lavf59.17.100
  Duration: 00:00:11.48, start: 0.000000, bitrate: 45702 kb/s
  Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuvj420p(pc, gbr/unknown/unknown, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 60 fps, 60 tbr, 1k tbn (default)
    Metadata:
      ENCODER         : Lavc58.134.100 h264_nvenc
      DURATION        : 00:00:11.483000000
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> rawvideo (native))
Output #0, rawvideo, to 'pipe:':
  Metadata:
    encoder         : Lavf59.17.100
  Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24(pc, gbr/unknown/unknown, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 2985984 kb/s, 60 fps, 60 tbn (default)
    Metadata:
      DURATION        : 00:00:11.483000000
      encoder         : Lavc59.20.100 rawvideo
frame=  689 fps=154 q=-0.0 Lsize= 4185675kB time=00:00:11.48 bitrate=2985984.1kbits/s dup=389 drop=0 speed=2.57x

Output

Input #0, rawvideo, from 'pipe:':
  Duration: N/A, start: 0.000000, bitrate: 1244160 kb/s
  Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 1920x1080, 1244160 kb/s, 25 tbr, 25 tbn
Stream mapping:
  Stream #0:0 -> #0:0 (rawvideo (native) -> h264 (libx264))
[libx264 @ 0000025afaf11140] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
[libx264 @ 0000025afaf11140] profile High, level 4.0, 4:2:0, 8-bit
[libx264 @ 0000025afaf11140] 264 - core 164 r3081 19856cc - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=18 lookahead_threads=3 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, matroska, to 'images/videos/out.mkv':
  Metadata:
    encoder         : Lavf59.17.100
  Stream #0:0: Video: h264 (H264 / 0x34363248), yuv420p(tv, progressive), 1920x1080, q=2-31, 25 fps, 1k tbn
    Metadata:
      encoder         : Lavc59.20.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame=  689 fps= 11 q=-0.0 Lsize= 4185675kB time=00:00:11.48 bitrate=2985984.1kbits/s dup=389 drop=0 speed=0.181x    
video:4185675kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
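The numbers in these logs are self-consistent with ffmpeg resampling the variable-frame-rate input to a constant frame rate: `dup=389` means 389 frames were duplicated, and 300 original frames plus those duplicates gives the 689 frames reported. A quick sanity check of that arithmetic:

```python
# Frame counts taken from the ffmpeg logs above
input_frames = 300   # frames actually in the VFR input
duplicated = 389     # "dup=389" in the log
output_frames = input_frames + duplicated
print(output_frames)       # 689, matching "frame=  689"

# Also consistent with 11.483 s of video resampled to a constant 60 fps
print(round(60 * 11.483))  # 689
```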

I'll answer my own question, since I solved the problem with the help of kesh in the comments.

Basically, two things were needed:

  • the input video needs vsync passthrough, to keep the frame count
  • another external tool (MKVToolNix) has to be used twice: once to extract the timestamps from the initial video, and once to apply them to the output

Below is the relevant code that performs the whole operation with Python and subprocess. You can run the following line on both the input and output videos to check that the timestamps of every frame are indeed identical: ffprobe -show_entries packet=pts_time,duration_time,stream_index video.mkv

    import os
    import subprocess

    import ffmpeg
    import numpy as np

    width = 1920
    height = 1080

    process1 = (
        ffmpeg
        .input('in.mkv', vsync='passthrough')
        .output('pipe:', format='rawvideo', pix_fmt='rgb24')
        .run_async(pipe_stdout=True)
    )

    process2 = (
        ffmpeg
        .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height))
        .output('temp.mkv', pix_fmt='yuv420p')
        .overwrite_output()
        .run_async(pipe_stdin=True)
    )

    while True:
        in_bytes = process1.stdout.read(width * height * 3)
        if not in_bytes:
            break
        in_frame = (
            np
            .frombuffer(in_bytes, np.uint8)
            .reshape([height, width, 3])
        )
        # Keep things simple, just add 1 to each pixel
        out_frame = in_frame + 1

        process2.stdin.write(
            out_frame
            .astype(np.uint8)
            .tobytes()
        )

    process2.stdin.close()
    process1.wait()
    process2.wait()

    # Extract timestamps from input video
    subprocess.run(['mkvextract', 'in.mkv', 'timestamps_v2', '0:timestamps.txt'])
    # Apply extracted timestamps to create synchronized output video
    subprocess.run(['mkvmerge', '-o', 'out.mkv', '--timestamps', '0:timestamps.txt', 'temp.mkv'])

    # Clean up
    os.remove('temp.mkv')
    os.remove('timestamps.txt')
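To automate the ffprobe check mentioned above, you can parse its output and compare the timestamp lists of the two files. A minimal sketch, assuming ffprobe's default sectioned output format (`[PACKET] ... [/PACKET]` blocks); it is demonstrated here on a hard-coded sample rather than a live ffprobe call:

```python
def parse_pts_times(ffprobe_output: str) -> list:
    """Collect pts_time values from `ffprobe -show_entries packet=pts_time` output."""
    times = []
    for line in ffprobe_output.splitlines():
        line = line.strip()
        if line.startswith('pts_time='):
            times.append(float(line.split('=', 1)[1]))
    return times

# Sample of ffprobe's default output format
sample = """[PACKET]
pts_time=0.000000
[/PACKET]
[PACKET]
pts_time=0.016000
[/PACKET]"""

print(parse_pts_times(sample))  # [0.0, 0.016]
```

Capture ffprobe's stdout for both in.mkv and out.mkv (e.g. with `subprocess.run(..., capture_output=True)`), run `parse_pts_times` on each, and assert the two lists are equal.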