ffmpeg concat with video using image background

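FFmpeg fails to concatenate my media files correctly in every test I have run. One of the videos is an .mp4 (H.264 codec) that was generated earlier from an .mp3 and a JPEG background. I have tried a variety of flags, and the closest I have come is the final output below.

My main problem with the current test is in the final video: once the two videos are joined, the audio is delayed by about 3 seconds.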

Here are all the files I am working with:

Input files:

Output file:

files.txt

file '/tmp/new_image_video.mp4'

file '/tmp/main_video.mp4'
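
A list in this format can be generated with a short loop rather than written by hand. The following is only a sketch: it uses the two segment paths shown above, and the output name `files_sketch.txt` is a hypothetical placeholder chosen so that the real /tmp/files.txt is not overwritten.

```shell
# Write a concat-demuxer list: one "file '<path>'" line per segment, in playback order.
list="${TMPDIR:-/tmp}/files_sketch.txt"   # hypothetical output path, not from the question
: > "$list"                                # create or truncate the list file
for f in /tmp/new_image_video.mp4 /tmp/main_video.mp4; do
  printf "file '%s'\n" "$f" >> "$list"
done
cat "$list"
```

The order of the lines is the order in which the segments are joined.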

Image video creation:

ffmpeg -loop 1 -i /tmp/image.jpg -i /tmp/audio.mp3 -acodec libfdk_aac -framerate 30 -vcodec libx264 -shortest /tmp/new_image_video_raw.mp4

Part 2:

ffmpeg -threads 0 -i /tmp/new_image_video_raw.mp4 -vf "scale=w=560:h=320:force_original_aspect_ratio=decrease, pad=560:320:(560-iw*min(560/iw\,320/ih))/2:(320-ih*min(560/iw\,320/ih))/2" -acodec libfdk_aac -af aresample=resampler=soxr -qp 20 -ar 44100 -r 30 -ab 128k -ac 1 -vcodec libx264 -max_muxing_queue_size 9999 -shortest -movflags +faststart /tmp/new_image_video.mp4 -y


Main video transcode:

ffmpeg -i /tmp/main_video_raw.mp4 -vf "scale=iw*min(560/iw\,320/ih):ih*min(560/iw\,320/ih), pad=560:320:(560-iw*min(560/iw\,320/ih))/2:(320-ih*min(560/iw\,320/ih))/2" -acodec libfdk_aac -af aresample=resampler=soxr -ar 44100 -aspect 16:9 -qp 20  -framerate 30 -ab 128k -ac 1 -vcodec libx264 -max_muxing_queue_size 9999 -movflags +faststart /tmp/main_video.mp4 -y
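
To make the scale/pad expressions in this command concrete, here is the arithmetic worked through for the 1280x720 source reported in the main_video_raw.mp4 log. It is a sketch, not part of the original commands: it assumes the sample aspect ratio follows from `-aspect 16:9` as DAR divided by width/height, and the variable names are made up for illustration.

```shell
# Source size from the main_video_raw.mp4 log; target box from the filter expression.
iw=1280; ih=720; tw=560; th=320

result=$(awk -v iw="$iw" -v ih="$ih" -v tw="$tw" -v th="$th" '
function gcd(a, b) { while (b) { t = b; b = a % b; a = t } return a }
BEGIN {
  f  = (tw/iw < th/ih) ? tw/iw : th/ih      # min(560/iw, 320/ih)
  sw = iw * f; sh = ih * f                  # size produced by the scale filter
  px = (tw - sw) / 2; py = (th - sh) / 2    # offsets requested from the pad filter
  n = 16 * th; d = 9 * tw; g = gcd(n, d)    # SAR = (16/9) / (560/320), reduced
  printf "scaled=%dx%d pad_x=%g pad_y=%g sar=%d:%d\n", sw, sh, px, py, n/g, d/g
}')
echo "$result"
# prints: scaled=560x315 pad_x=0 pad_y=2.5 sar=64:63
```

The 64:63 result matches the `SAR 64:63 DAR 16:9` that libx264 reports for main_video.mp4, whereas the image video is encoded with SAR 1:1, so the two segments differ in this parameter.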


Concatenating the videos:

ffmpeg -threads 0 -f concat -safe 0 -i /tmp/files.txt -vf "scale=iw*min(560/iw\,320/ih):ih*min(560/iw\,320/ih), pad=560:320:(560-iw*min(560/iw\,320/ih))/2:(320-ih*min(560/iw\,320/ih))/2" -preset veryslow -crf 15 -acodec libfdk_aac -af aresample=resampler=soxr -ar 44100 -aspect 16:9 -qp 20  -framerate 30 -ab 128k -ac 1 -vcodec libx264 -max_muxing_queue_size 9999 -movflags +faststart /tmp/final_output_video.mp4 -y


Output for new_image_video.mp4:
Stream mapping:
  Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
  Stream #1:0 -> #0:1 (copy)
Press [q] to stop, [?] for help
[libx264 @ 0x150ce00] using SAR=1/1
[libx264 @ 0x150ce00] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x150ce00] profile High, level 2.1
[libx264 @ 0x150ce00] 264 - core 152 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:-3:-3 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=2.00:0.70 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-4 threads=10 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=1 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.20
Output #0, mp4, to '/tmp/new_image_video.mp4':
  Metadata:
    encoder         : Lavf57.76.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuvj420p(pc), 560x320 [SAR 1:1 DAR 7:4], q=-1--1, 1 fps, 16384 tbn, 1 tbc
    Metadata:
      encoder         : Lavc57.102.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: mp3 (mp4a / 0x6134706D), 44100 Hz, stereo, s16p, 157 kb/s
    Metadata:
      encoder         : Lavc56.41
frame=   73 fps=0.0 q=17.0 Lsize=     362kB time=00:00:16.00 bitrate= 185.3kbits/s speed=88.6x
video:49kB audio:308kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.166542%
[libx264 @ 0x150ce00] frame I:1     Avg QP: 4.09  size: 38729
[libx264 @ 0x150ce00] frame P:18    Avg QP: 5.77  size:   843
[libx264 @ 0x150ce00] frame B:54    Avg QP: 0.64  size:    49
[libx264 @ 0x150ce00] consecutive B-frames:  1.4%  0.0%  0.0% 98.6%
[libx264 @ 0x150ce00] mb I  I16..4: 54.6% 18.9% 26.6%
[libx264 @ 0x150ce00] mb P  I16..4:  0.0%  0.0%  0.0%  P16..4:  9.1%  0.1%  0.5%  0.0%  0.0%    skip:90.3%
[libx264 @ 0x150ce00] mb B  I16..4:  0.0%  0.0%  0.0%  B16..8:  2.6%  0.0%  0.0%  direct: 0.0%  skip:97.4%  L0:69.1% L1:30.9% BI: 0.0%
[libx264 @ 0x150ce00] 8x8 transform intra:18.9% inter:59.9%
[libx264 @ 0x150ce00] coded y,uvDC,uvAC intra: 44.1% 45.3% 45.0% inter: 1.4% 0.0% 0.0%
[libx264 @ 0x150ce00] i16 v,h,dc,p: 91%  2%  6%  1%
[libx264 @ 0x150ce00] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 18% 18%  8%  5%  6%  7%  9%  7%
[libx264 @ 0x150ce00] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 23% 16%  8%  7% 10%  9% 10%  9%  9%
[libx264 @ 0x150ce00] i8c dc,h,v,p: 71% 12% 12%  5%
[libx264 @ 0x150ce00] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x150ce00] ref P L0: 79.3%  0.1% 19.5%  1.1%
[libx264 @ 0x150ce00] ref B L0: 68.3% 30.5%  1.2%
[libx264 @ 0x150ce00] ref B L1: 98.4%  1.6%
[libx264 @ 0x150ce00] kb/s:6.20

Output for new_image_video.mp4 (part 2):

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/tmp/new_image_video_raw.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.76.100
  Duration: 00:00:19.00, start: 0.000000, bitrate: 156 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc), 560x320 [SAR 1:1 DAR 7:4], 21 kb/s, 1 fps, 1 tbr, 16384 tbn, 2 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: mp3 (mp4a / 0x6134706D), 44100 Hz, stereo, s16p, 157 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (mp3 (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
[libx264 @ 0x2175560] using SAR=1/1
[libx264 @ 0x2175560] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x2175560] profile High, level 3.0
[libx264 @ 0x2175560] 264 - core 152 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=10 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc=cqp mbtree=0 qp=20 ip_ratio=1.40 pb_ratio=1.30 aq=0
Output #0, mp4, to '/tmp/new_image_video.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.76.100
    Stream #0:0(und): Video: h264 (libx264) (avc1 / 0x31637661), yuvj420p(pc), 560x320 [SAR 1:1 DAR 7:4], q=-1--1, 30 fps, 15360 tbn, 30 tbc (default)
    Metadata:
      handler_name    : VideoHandler
      encoder         : Lavc57.102.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1(und): Audio: aac (libfdk_aac) (mp4a / 0x6134706D), 44100 Hz, mono, s16, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      encoder         : Lavc57.102.100 libfdk_aac
[mp4 @ 0x2150cc0] Starting second pass: moving the moov atom to the beginning of the file
frame=  569 fps=0.0 q=-1.0 Lsize=     351kB time=00:00:18.86 bitrate= 152.3kbits/s dup=579 drop=0 speed=32.5x
video:81kB audio:251kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 5.851973%
[libx264 @ 0x2175560] frame I:3     Avg QP:17.00  size: 23393
[libx264 @ 0x2175560] frame P:143   Avg QP:20.00  size:    26
[libx264 @ 0x2175560] frame B:423   Avg QP:21.67  size:    19
[libx264 @ 0x2175560] consecutive B-frames:  0.9%  0.0%  0.0% 99.1%
[libx264 @ 0x2175560] mb I  I16..4: 54.7% 26.0% 19.4%
[libx264 @ 0x2175560] mb P  I16..4:  0.0%  0.0%  0.0%  P16..4:  0.1%  0.0%  0.0%  0.0%  0.0%    skip:99.9%
[libx264 @ 0x2175560] mb B  I16..4:  0.0%  0.0%  0.0%  B16..8:  0.2%  0.0%  0.0%  direct: 0.0%  skip:99.7%  L0:23.7% L1:76.3% BI: 0.0%
[libx264 @ 0x2175560] 8x8 transform intra:26.0% inter:14.0%
[libx264 @ 0x2175560] coded y,uvDC,uvAC intra: 39.8% 44.1% 43.4% inter: 0.0% 0.0% 0.0%
[libx264 @ 0x2175560] i16 v,h,dc,p: 91%  3%  5%  1%
[libx264 @ 0x2175560] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 17% 15%  8%  6%  9%  6%  8%  8%
[libx264 @ 0x2175560] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 20% 17%  6%  7% 10%  9% 11% 10%  9%
[libx264 @ 0x2175560] i8c dc,h,v,p: 71% 11% 13%  5%
[libx264 @ 0x2175560] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x2175560] ref P L0: 95.4%  0.7%  3.9%
[libx264 @ 0x2175560] ref B L0: 44.6% 55.4%
[libx264 @ 0x2175560] ref B L1: 98.3%  1.7%
[libx264 @ 0x2175560] kb/s:34.62

Output for main_video.mp4:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/tmp/main_video_raw.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    creation_time   : 1970-01-01T00:00:00.000000Z
    encoder         : Lavf53.24.2
  Duration: 00:01:02.32, start: 0.000000, bitrate: 1347 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 959 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      creation_time   : 1970-01-01T00:00:00.000000Z
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 383 kb/s (default)
    Metadata:
      creation_time   : 1970-01-01T00:00:00.000000Z
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (aac (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
[libx264 @ 0x758900] using SAR=64/63
[libx264 @ 0x758900] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x758900] profile High, level 2.1
[libx264 @ 0x758900] 264 - core 152 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=10 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc=cqp mbtree=0 qp=20 ip_ratio=1.40 pb_ratio=1.30 aq=0
Output #0, mp4, to '/tmp/main_video.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.76.100
    Stream #0:0(und): Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 560x320 [SAR 64:63 DAR 16:9], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      creation_time   : 1970-01-01T00:00:00.000000Z
      handler_name    : VideoHandler
      encoder         : Lavc57.102.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1(und): Audio: aac (libfdk_aac) (mp4a / 0x6134706D), 44100 Hz, mono, s16, 128 kb/s (default)
    Metadata:
      creation_time   : 1970-01-01T00:00:00.000000Z
      handler_name    : SoundHandler
      encoder         : Lavc57.102.100 libfdk_aac
[mp4 @ 0x755900] Starting second pass: moving the moov atom to the beginning of the file
frame= 1557 fps=275 q=-1.0 Lsize=    5144kB time=00:01:02.32 bitrate= 676.1kbits/s speed=  11x
video:4119kB audio:975kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.989500%
[libx264 @ 0x758900] frame I:13    Avg QP:17.00  size: 34937
[libx264 @ 0x758900] frame P:657   Avg QP:20.00  size:  3546
[libx264 @ 0x758900] frame B:887   Avg QP:21.69  size:  1615
[libx264 @ 0x758900] consecutive B-frames: 18.9% 12.6%  8.1% 60.4%
[libx264 @ 0x758900] mb I  I16..4: 12.5% 51.8% 35.7%
[libx264 @ 0x758900] mb P  I16..4:  0.2%  1.9%  1.0%  P16..4: 17.9%  9.3%  8.4%  0.0%  0.0%    skip:61.3%
[libx264 @ 0x758900] mb B  I16..4:  0.1%  0.3%  0.3%  B16..8: 18.0%  5.6%  2.4%  direct: 2.8%  skip:70.6%  L0:33.9% L1:42.5% BI:23.6%
[libx264 @ 0x758900] 8x8 transform intra:55.4% inter:56.4%
[libx264 @ 0x758900] coded y,uvDC,uvAC intra: 84.0% 93.3% 75.0% inter: 12.6% 14.9% 3.3%
[libx264 @ 0x758900] i16 v,h,dc,p:  8% 38%  3% 51%
[libx264 @ 0x758900] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 16% 20%  8%  7%  9%  9% 10% 10% 11%
[libx264 @ 0x758900] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 20%  9%  8% 11% 10% 10%  9%  9%
[libx264 @ 0x758900] i8c dc,h,v,p: 41% 26% 17% 16%
[libx264 @ 0x758900] Weighted P-Frames: Y:0.2% UV:0.0%
[libx264 @ 0x758900] ref P L0: 72.3% 14.4%  9.7%  3.6%  0.0%
[libx264 @ 0x758900] ref B L0: 89.9%  7.7%  2.4%
[libx264 @ 0x758900] ref B L1: 97.1%  2.9%
[libx264 @ 0x758900] kb/s:541.66

Concat output:

ffmpeg version 3.3.3 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 4.4.7 (GCC) 20120313 (Red Hat 4.4.7-18)
  configuration: --prefix=/root/ffmpeg_build --extra-cflags=-I/root/ffmpeg_build/include --extra-ldflags='-L/root/ffmpeg_build/lib -ldl' --bindir=/root/bin --pkg-config-flags=--static --enable-gpl --enable-version3 --disable-debug --enable-shared --enable-runtime-cpudetect --enable-postproc --enable-pic --enable-libfdk_aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libtheora --enable-libvo-amrwbenc --enable-gray --enable-libopenjpeg --enable-libass --enable-libvidstab --enable-libsoxr --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-libwebp --enable-fontconfig --enable-libspeex --enable-nonfree
  libavutil      55. 73.100 / 55. 73.100
  libavcodec     57.102.100 / 57.102.100
  libavformat    57. 76.100 / 57. 76.100
  libavdevice    57.  7.100 / 57.  7.100
  libavfilter     6. 98.100 /  6. 98.100
  libswscale      4.  7.102 /  4.  7.102
  libswresample   2.  8.100 /  2.  8.100
  libpostproc    54.  6.100 / 54.  6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/tmp/main_video.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    creation_time   : 1970-01-01T00:00:00.000000Z
    encoder         : Lavf53.24.2
  Duration: 00:01:02.32, start: 0.000000, bitrate: 1347 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 959 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      creation_time   : 1970-01-01T00:00:00.000000Z
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 383 kb/s (default)
    Metadata:
      creation_time   : 1970-01-01T00:00:00.000000Z
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (aac (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
[libx264 @ 0x1563900] using SAR=64/63
[libx264 @ 0x1563900] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x1563900] profile High, level 2.1
[libx264 @ 0x1563900] 264 - core 152 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=10 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc=cqp mbtree=0 qp=20 ip_ratio=1.40 pb_ratio=1.30 aq=0
Output #0, mp4, to '/tmp/new_image_video.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.76.100
    Stream #0:0(und): Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 560x320 [SAR 64:63 DAR 16:9], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      creation_time   : 1970-01-01T00:00:00.000000Z
      handler_name    : VideoHandler
      encoder         : Lavc57.102.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1(und): Audio: aac (libfdk_aac) (mp4a / 0x6134706D), 44100 Hz, mono, s16, 128 kb/s (default)
    Metadata:
      creation_time   : 1970-01-01T00:00:00.000000Z
      handler_name    : SoundHandler
      encoder         : Lavc57.102.100 libfdk_aac
[mp4 @ 0x1560900] Starting second pass: moving the moov atom to the beginning of the file
frame= 1557 fps=277 q=-1.0 Lsize=    5144kB time=00:01:02.32 bitrate= 676.1kbits/s speed=11.1x
video:4119kB audio:975kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.989500%
[libx264 @ 0x1563900] frame I:13    Avg QP:17.00  size: 34937
[libx264 @ 0x1563900] frame P:657   Avg QP:20.00  size:  3546
[libx264 @ 0x1563900] frame B:887   Avg QP:21.69  size:  1615
[libx264 @ 0x1563900] consecutive B-frames: 18.9% 12.6%  8.1% 60.4%
[libx264 @ 0x1563900] mb I  I16..4: 12.5% 51.8% 35.7%
[libx264 @ 0x1563900] mb P  I16..4:  0.2%  1.9%  1.0%  P16..4: 17.9%  9.3%  8.4%  0.0%  0.0%    skip:61.3%
[libx264 @ 0x1563900] mb B  I16..4:  0.1%  0.3%  0.3%  B16..8: 18.0%  5.6%  2.4%  direct: 2.8%  skip:70.6%  L0:33.9% L1:42.5% BI:23.6%
[libx264 @ 0x1563900] 8x8 transform intra:55.4% inter:56.4%
[libx264 @ 0x1563900] coded y,uvDC,uvAC intra: 84.0% 93.3% 75.0% inter: 12.6% 14.9% 3.3%
[libx264 @ 0x1563900] i16 v,h,dc,p:  8% 38%  3% 51%
[libx264 @ 0x1563900] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 16% 20%  8%  7%  9%  9% 10% 10% 11%
[libx264 @ 0x1563900] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 20%  9%  8% 11% 10% 10%  9%  9%
[libx264 @ 0x1563900] i8c dc,h,v,p: 41% 26% 17% 16%
[libx264 @ 0x1563900] Weighted P-Frames: Y:0.2% UV:0.0%
[libx264 @ 0x1563900] ref P L0: 72.3% 14.4%  9.7%  3.6%  0.0%
[libx264 @ 0x1563900] ref B L0: 89.9%  7.7%  2.4%
[libx264 @ 0x1563900] ref B L1: 97.1%  2.9%
[libx264 @ 0x1563900] kb/s:541.66

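The audio track of main_video.mp4 appears to be variable. I was able to get the concatenation working by transcoding that video with the following command: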

ffmpeg -i /tmp/main_video_raw.mp4 -vf "scale=iw*min(560/iw\,320/ih):ih*min(560/iw\,320/ih), pad=560:320:(560-iw*min(560/iw\,320/ih))/2:(320-ih*min(560/iw\,320/ih))/2" -acodec libfdk_aac -af aresample=resampler=soxr -ar 44100 -aspect 16:9 -qp 20 -framerate 30 -ab 128k -ac 1 -vcodec libx264 -x264-params "nal-hrd=cbr" -b:v 2500K -minrate 2500K -maxrate 2500K -bufsize 2M -shortest -max_muxing_queue_size 9999 -movflags +faststart /tmp/main_video.mp4 -y