FFmpeg: using 'amix' to combine a short audio clip with a video results in the final video's sound cutting off early

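I am trying to combine the following:

(a): a 29-second video clip with its own audio, which runs for the entire duration

(b): a short audio clip, about 2 seconds long, that I want to play at the start of the video, mixed with the original audio

I successfully used 'amix' to get a video with the combined audio, but the final video's audio cuts out at around the 26-second mark of the 29-second video and then goes silent.

What makes no sense is that the resulting video plays fine and the audio is mixed successfully, yet the output video's audio stream is missing the last 3 seconds.

Here is the 'amix' command I am using (sent through subprocess):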

subprocess.call(['ffmpeg', '-i', 'input.mp4', '-i', 'audioclip.mp3', '-filter_complex', 'amix', 'output.mp4'])

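I have also used versions of this command that spell out -map "0:a" and -map "1:a", tried 'amix=inputs=2:duration=longest', and tried many other additions. All of them lead to the same problem: the final combined video's audio cuts out with 3 seconds left in the video, even though the original 'input.mp4' has audio for its full 29 seconds.

Does anyone know why the last few seconds of [a]'s audio are missing from the final video?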

_________________________________________________________________

Edit: here is the output when I run the amix command listed above:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'RuneBearinstakill_advanced.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.20.101
  Duration: 00:00:29.77, start: 0.000000, bitrate: 5441 kb/s
  Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt470bg/bt470bg/smpte170m, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 5304 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
    Metadata:
      handler_name    : Bento4 Video Handler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : Bento4 Sound Handler
      vendor_id       : [0][0][0][0]
[mp3 @ 000001f0c8ec2040] Estimating duration from bitrate, this may be inaccurate
Input #1, mp3, from 'TTS_clip.mp3':
  Duration: 00:00:01.90, start: 0.000000, bitrate: 32 kb/s
  Stream #1:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s
Stream mapping:
  Stream #0:1 (aac) -> amix (graph 0)
  Stream #1:0 (mp3float) -> amix (graph 0)
  amix:default (graph 0) -> Stream #0:0 (aac)
  Stream #0:0 -> #0:1 (h264 (native) -> h264 (libx264))
Press [q] to stop, [?] for help
[libx264 @ 000001f0c8cbe5c0] using SAR=1/1
[libx264 @ 000001f0c8cbe5c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 000001f0c8cbe5c0] profile High, level 4.0, 4:2:0, 8-bit
[libx264 @ 000001f0c8cbe5c0] 264 - core 164 r3094 bfc87b7 - H.264/MPEG-4 AVC codec - Copyleft 2003-2022 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=24 lookahead_threads=4 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'RuneBearinstakill_advancedwithtts.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.20.101
  Stream #0:0: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc59.25.100 aac
  Stream #0:1(eng): Video: h264 (avc1 / 0x31637661), yuv420p(tv, bt470bg/bt470bg/smpte170m, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 30 fps, 15360 tbn (default)
    Metadata:
      handler_name    : Bento4 Video Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc59.25.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame=  893 fps=110 q=-1.0 Lsize=   18717kB time=00:00:29.66 bitrate=5168.5kbits/s speed=3.66x    
video:18256kB audio:433kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.150179%
[aac @ 000001f0c8f9ebc0] Qavg: 921.259
[libx264 @ 000001f0c8cbe5c0] frame I:4     Avg QP:21.33  size: 71366
[libx264 @ 000001f0c8cbe5c0] frame P:633   Avg QP:23.32  size: 23837
[libx264 @ 000001f0c8cbe5c0] frame B:256   Avg QP:25.22  size: 12968
[libx264 @ 000001f0c8cbe5c0] consecutive B-frames: 57.2% 10.3% 10.1% 22.4%
[libx264 @ 000001f0c8cbe5c0] mb I  I16..4: 17.9% 71.4% 10.8%
[libx264 @ 000001f0c8cbe5c0] mb P  I16..4:  6.9% 17.6%  0.8%  P16..4: 43.1%  6.5%  1.5%  0.0%  0.0%    skip:23.6%
[libx264 @ 000001f0c8cbe5c0] mb B  I16..4:  1.5%  4.2%  0.3%  B16..8: 39.7%  4.6%  0.5%  direct: 1.6%  skip:47.6%  L0:55.9% L1:41.8% BI: 2.3%
[libx264 @ 000001f0c8cbe5c0] 8x8 transform intra:69.5% inter:87.3%
[libx264 @ 000001f0c8cbe5c0] coded y,uvDC,uvAC intra: 35.6% 26.8% 0.8% inter: 13.4% 10.8% 0.0%
[libx264 @ 000001f0c8cbe5c0] i16 v,h,dc,p: 21% 37% 12% 30%
[libx264 @ 000001f0c8cbe5c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 26% 21%  4%  5%  5%  6%  4%  5%
[libx264 @ 000001f0c8cbe5c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 24% 28% 15%  5%  7%  7%  7%  5%  4%
[libx264 @ 000001f0c8cbe5c0] i8c dc,h,v,p: 67% 18% 14%  1%
[libx264 @ 000001f0c8cbe5c0] Weighted P-Frames: Y:0.2% UV:0.0%
[libx264 @ 000001f0c8cbe5c0] ref P L0: 72.3% 15.4%  8.7%  3.6%  0.0%
[libx264 @ 000001f0c8cbe5c0] ref B L0: 88.9%  9.5%  1.6%
[libx264 @ 000001f0c8cbe5c0] ref B L1: 97.7%  2.3%
[libx264 @ 000001f0c8cbe5c0] kb/s:5024.13

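Here is the output when I check the stream durations of the input video and the output video. It shows that the output video's audio stream is shorter after mixing (27.477 s, versus 29.738 s for the input's audio stream):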

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'RuneBearinstakill_advanced.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.20.101
  Duration: 00:00:29.77, start: 0.000000, bitrate: 5403 kb/s
  Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt470bg/bt470bg/smpte170m, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 5266 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
    Metadata:
      handler_name    : Bento4 Video Handler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : Bento4 Sound Handler
      vendor_id       : [0][0][0][0]
[STREAM]
duration=29.766667
[/STREAM]
[STREAM]
duration=29.738000
[/STREAM]

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'RuneBearinstakill_advancedwithtts.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.20.101
  Duration: 00:00:29.77, start: 0.000000, bitrate: 5098 kb/s
  Stream #0:0[0x1](und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt470bg/bt470bg/smpte170m, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 4971 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
    Metadata:
      handler_name    : Bento4 Video Handler
      vendor_id       : [0][0][0][0]
[STREAM]
duration=27.477000
[/STREAM]
[STREAM]
duration=29.766667
[/STREAM]
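
For anyone who wants to repeat the duration check above from Python, here is a minimal sketch. The exact ffprobe command used for the output above is not shown in the post, so the invocation in `probe_text` is an assumption, and both function names are hypothetical. The parsing step only relies on the `duration=` lines visible in the output.

```python
import re
import subprocess


def probe_text(path):
    """Run ffprobe and return its stdout (requires ffprobe on PATH).

    Assumed invocation: the post does not show the exact command.
    """
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "stream=duration", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def stream_durations(text):
    """Return every numeric `duration=<seconds>` value in the text, in order.

    Non-numeric values such as `duration=N/A` are skipped.
    """
    pattern = r"^duration=([0-9.]+)\s*$"
    return [float(v) for v in re.findall(pattern, text, flags=re.MULTILINE)]


# Durations reported above for the mixed output file (audio first, then video).
sample = """[STREAM]
duration=27.477000
[/STREAM]
[STREAM]
duration=29.766667
[/STREAM]
"""

audio, video = stream_durations(sample)
print(f"audio is {video - audio:.2f} s shorter than video")
```

Calling `stream_durations(probe_text("output.mp4"))` would run the same check against a real file.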

I found a solution. It turns out I needed to apply aresample=async=1 to the input video's audio stream inside the filter_complex of the amix command:

aresample=async=1

In the end, the filter_complex for my amix command looked like this:

'[0:a]aresample=async=1[0a];[1:a]volume=2.0[1a];[0a][1a]amix=inputs=2'

I found this solution in a similar question on Super User: https://superuser.com/questions/1234493/ffmpeg-amix-audio-to-video-with-some-audio-in-parts
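
Putting the pieces together, the full subprocess call might look like the sketch below. The filter graph is the one from this post; the `[aout]` output label, the explicit `-map` options, and `-c:v copy` (which avoids re-encoding the video) are additions made for illustration, and the function names are hypothetical. A plausible reading of why the fix works is that aresample=async=1 fills gaps in the source audio's timestamps with silence before amix sees the stream, but that is an interpretation and not something verified here.

```python
import subprocess


def build_amix_command(video_path, clip_path, output_path, clip_volume=2.0):
    """Build the ffmpeg argument list without running it."""
    filter_graph = (
        "[0:a]aresample=async=1[0a];"        # resync the video's own audio
        f"[1:a]volume={clip_volume}[1a];"    # boost the short clip
        "[0a][1a]amix=inputs=2[aout]"        # mix both; [aout] is an added label
    )
    return [
        "ffmpeg",
        "-i", video_path,
        "-i", clip_path,
        "-filter_complex", filter_graph,
        "-map", "0:v",       # assumption: keep the original video stream
        "-map", "[aout]",    # use the mixed audio
        "-c:v", "copy",      # assumption: copy video instead of re-encoding
        output_path,
    ]


def run_amix(video_path, clip_path, output_path):
    """Run the command (requires ffmpeg on PATH); raises on a non-zero exit."""
    subprocess.run(build_amix_command(video_path, clip_path, output_path), check=True)


cmd = build_amix_command("input.mp4", "audioclip.mp3", "output.mp4")
print(" ".join(cmd))
```

Building the argument list separately from running it makes it easy to print and inspect the exact command before handing it to ffmpeg.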