ffmpeg 在特定时间在 mp4 上叠加 png

ffmpeg overlay png on mp4 on specific time

我正在尝试在 mp4 视频上从第 3 秒到最后叠加一个图像。 我正在使用下面的

ffmpeg -i out.mp4 -i scaledSquare.png -filter_complex " [0:v][1] overlay=100:100:enable='gte(t,3)'" -y test.mp4

输出是下面的,test.mp4没有重叠的形状

ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
  built with Apple LLVM version 9.0.0 (clang-900.0.39.2)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0.2 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'out.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
  Duration: 00:00:05.43, start: 0.000000, bitrate: 372 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 3360x2100 [SAR 1:1 DAR 8:5], 40214 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 22050 Hz, mono, fltp, 73 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Input #1, png_pipe, from 'scaledSquare.png':
  Duration: N/A, bitrate: N/A
    Stream #1:0: Video: png, rgba(pc), 900x450 [SAR 1:2 DAR 1:1], 25 tbr, 25 tbn, 25 tbc
Stream mapping:
  Stream #0:0 (h264) -> overlay:main (graph 0)
  Stream #1:0 (png) -> overlay:overlay (graph 0)
  overlay (graph 0) -> Stream #0:0 (libx264)
  Stream #0:1 -> #0:1 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0x7f95f0001800] using SAR=1/1
[libx264 @ 0x7f95f0001800] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x7f95f0001800] profile High, level 5.1
[libx264 @ 0x7f95f0001800] 264 - core 152 r2854 e9a5903 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'test.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 3360x2100 [SAR 1:1 DAR 8:5], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      encoder         : Lavc58.18.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 22050 Hz, mono, fltp, 69 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      encoder         : Lavc58.18.100 aac
frame=    1 fps=0.0 q=28.0 Lsize=     246kB time=00:00:05.38 bitrate= 374.4kbits/s speed=18.4x    
video:195kB audio:49kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.736449%
[libx264 @ 0x7f95f0001800] frame I:1     Avg QP:27.64  size:199019
[libx264 @ 0x7f95f0001800] mb I  I16..4: 21.9% 56.5% 21.6%
[libx264 @ 0x7f95f0001800] 8x8 transform intra:56.5%
[libx264 @ 0x7f95f0001800] coded y,uvDC,uvAC intra: 15.5% 0.6% 0.5%
[libx264 @ 0x7f95f0001800] i16 v,h,dc,p: 25% 75%  0%  0%
[libx264 @ 0x7f95f0001800] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 76%  9% 14%  0%  0%  0%  0%  0%  0%
[libx264 @ 0x7f95f0001800] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 41% 27%  9%  2%  4%  4%  5%  3%  5%
[libx264 @ 0x7f95f0001800] i8c dc,h,v,p: 97%  3%  0%  0%
[libx264 @ 0x7f95f0001800] kb/s:39803.80
[aac @ 0x7f95f0003000] Qavg: 4519.618

但是如果我删除启用选项,形状会在整个​​视频持续时间内正确覆盖:

ffmpeg -i out.mp4 -i scaledSquare.png -filter_complex " [0:v][1] overlay=100:100" -y test2.mp4

ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
  built with Apple LLVM version 9.0.0 (clang-900.0.39.2)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0.2 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'out.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
  Duration: 00:00:05.43, start: 0.000000, bitrate: 372 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 3360x2100 [SAR 1:1 DAR 8:5], 40214 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 22050 Hz, mono, fltp, 73 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Input #1, png_pipe, from 'scaledSquare.png':
  Duration: N/A, bitrate: N/A
    Stream #1:0: Video: png, rgba(pc), 900x450 [SAR 1:2 DAR 1:1], 25 tbr, 25 tbn, 25 tbc
Stream mapping:
  Stream #0:0 (h264) -> overlay:main (graph 0)
  Stream #1:0 (png) -> overlay:overlay (graph 0)
  overlay (graph 0) -> Stream #0:0 (libx264)
  Stream #0:1 -> #0:1 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0x7fb361801800] using SAR=1/1
[libx264 @ 0x7fb361801800] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x7fb361801800] profile High, level 5.1
[libx264 @ 0x7fb361801800] 264 - core 152 r2854 e9a5903 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'test2.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 3360x2100 [SAR 1:1 DAR 8:5], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      encoder         : Lavc58.18.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 22050 Hz, mono, fltp, 69 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      encoder         : Lavc58.18.100 aac
frame=    1 fps=0.0 q=28.0 Lsize=     244kB time=00:00:05.38 bitrate= 370.6kbits/s speed=20.8x    
video:193kB audio:49kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.744060%
[libx264 @ 0x7fb361801800] frame I:1     Avg QP:27.56  size:196459
[libx264 @ 0x7fb361801800] mb I  I16..4: 23.5% 55.3% 21.2%
[libx264 @ 0x7fb361801800] 8x8 transform intra:55.3%
[libx264 @ 0x7fb361801800] coded y,uvDC,uvAC intra: 15.4% 0.8% 0.6%
[libx264 @ 0x7fb361801800] i16 v,h,dc,p: 23% 76%  0%  0%
[libx264 @ 0x7fb361801800] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 79%  8% 12%  0%  0%  0%  0%  0%  1%
[libx264 @ 0x7fb361801800] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 42% 27%  8%  2%  4%  4%  5%  3%  5%
[libx264 @ 0x7fb361801800] i8c dc,h,v,p: 95%  4%  1%  0%
[libx264 @ 0x7fb361801800] kb/s:39291.80
[aac @ 0x7fb361804a00] Qavg: 4519.618

我做了额外的测试:

ffmpeg -i out.mp4 -i scaledSquare.png -filter_complex " [0:v][1] overlay=100:100:enable='gte(t,0)'" -y test3.mp4

应该从大于 0 的所有时间添加形状,效果很好,在整个输出持续时间内添加形状。

所以我不确定为什么当我在此处指定类似 enable='gte(t,3)' 或 enable='between(t,1,3)' 时启用选项不起作用。

仅供了解更多信息: 我通过将 mp3 文件添加到一个 png 图像来创建原始 mp4 文件 (out.mp4),如下所示

ffmpeg -i images/08.png -i audio_123e4567-e89b-12d3-a456-426655440000.mp3 -pix_fmt yuv420p out.mp4

从图像+音频创建视频时,需要循环图像,否则视频开头只有 1 帧。

ffmpeg -loop 1 -i images/08.png -i audio_123e4567-e89b-12d3-a456-426655440000.mp3 -pix_fmt yuv420p -shortest out.mp4

现在 enable 将在叠加层中工作。