FFMPEG: Queue input is backward in time

I am trying to merge two audio files, delaying the second one. This is my command:

ffmpeg -i RTb295d0534191e1acb22a45bb971a12e6.mka -i RT103bfe5f4b129860f69cd8e820f3a10b.mka -filter_complex "[1:a]adelay=13500s:all=1[apad]; [0:a][apad]amix=inputs=2:weights=1|1[aout]" -map [aout] combined_audio.mka

This is the output I get. The problem is that the second audio ends up delayed by 5 hours 45 minutes instead of 3 hours 45 minutes:

 ffmpeg -i RTb295d0534191e1acb22a45bb971a12e6.mka -i RT103bfe5f4b129860f69cd8e820f3a10b.mka -filter_complex "[1:a]adelay=13500s:all=1[apad]; [0:a][apad]amix=inputs=2:weights=1|1[aout]" -map [aout] combined_audio.mka
ffmpeg version n5.0-4-g911d7f167c-20220311 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 11.2.0 (crosstool-NG 1.24.0.533_681aaef)
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-avisynth --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libass --enable-libbluray --enable-libmp3lame --enable-libopus --enable-librist --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libmfx --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --disable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20220311
  libavutil      57. 17.100 / 57. 17.100
  libavcodec     59. 18.100 / 59. 18.100
  libavformat    59. 16.100 / 59. 16.100
  libavdevice    59.  4.100 / 59.  4.100
  libavfilter     8. 24.100 /  8. 24.100
  libswscale      6.  4.100 /  6.  4.100
  libswresample   4.  3.100 /  4.  3.100
  libpostproc    56.  3.100 / 56.  3.100
Input #0, matroska,webm, from 'RTb295d0534191e1acb22a45bb971a12e6.mka':
  Metadata:
    encoder         : GStreamer matroskamux version 1.16.2
    creation_time   : 2022-03-23T21:20:27.000000Z
  Duration: 03:45:00.47, start: 0.291000, bitrate: 19 kb/s
  Stream #0:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      title           : Audio
Input #1, matroska,webm, from 'RT103bfe5f4b129860f69cd8e820f3a10b.mka':
  Metadata:
    encoder         : GStreamer matroskamux version 1.16.2
    creation_time   : 2022-03-24T01:05:30.000000Z
  Duration: 02:45:03.51, start: 13502.587000, bitrate: 5 kb/s
  Stream #1:0(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      title           : Audio
Stream mapping:
  Stream #0:0 (opus) -> amix
  Stream #1:0 (opus) -> adelay:default
  amix:default -> Stream #0:0 (libvorbis)
Press [q] to stop, [?] for help
Output #0, matroska, to 'combined_audio.mka':
  Metadata:
    encoder         : Lavf59.16.100
  Stream #0:0: Audio: vorbis (oV[0][0] / 0x566F), 48000 Hz, stereo, fltp
    Metadata:
      encoder         : Lavc59.18.100 libvorbis
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time231x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time184x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time189x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time223x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time275x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time245x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time213x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time209x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time208x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time204x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time199x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time193x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time185x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time181x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time178x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time177x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time176x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time169x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time167x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time163x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time146x
[libvorbis @ 00000229f8a7bbc0] Queue input is backward in time139x
size=   75141kB time=06:07:52.57 bitrate=  27.9kbits/s speed= 130x
video:0kB audio:70470kB subtitle:0kB other streams:0kB global headers:4kB muxing overhead: 6.628071%

The audio files being mixed: https://www.easyupload.io/m/durisk

How can I fix this?

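The underlying problem with these audio files appears to be frequently dropped frames (each frame contains 960 audio samples). In the first file there is an instance of an 8117-second gap between 2 consecutive frames. Because the MKA files were written without filling in the dropped frames, they are effectively variable-sampling-rate streams that are tagged as constant-sampling-rate. This discrepancy makes your audio appear shorter than what was recorded, which explains why your output is generally much longer than expected and why processing these files keeps going wrong.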
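To look for such gaps yourself, here is a minimal sketch that lists each packet's timestamp with ffprobe and reports large jumps. The helper names `list_pts` and `find_gaps` and the 0.1-second threshold are assumptions chosen for illustration; a 960-sample frame at 48000 Hz lasts 20 ms, so consecutive timestamps should normally be about 0.02 s apart.

```shell
#!/bin/sh
# list_pts FILE: print the timestamp (in seconds) of every packet in the
# first audio stream, one per line. Hypothetical helper; requires ffprobe.
list_pts() {
    ffprobe -v error -select_streams a:0 \
            -show_entries packet=pts_time -of csv=p=0 "$1"
}

# find_gaps [MAX_GAP]: read one timestamp per line on stdin and report
# every jump between consecutive timestamps larger than MAX_GAP seconds
# (default 0.1, an arbitrary threshold well above the 0.02 s frame length).
find_gaps() {
    awk -F, -v max_gap="${1:-0.1}" '
        $1 == "" { next }                      # skip blank lines
        seen && ($1 - prev) > max_gap {
            printf "gap of %.3f s after %.3f s\n", $1 - prev, prev
        }
        { prev = $1; seen = 1 }
    '
}

# Usage (not executed here):
#   list_pts RTb295d0534191e1acb22a45bb971a12e6.mka | find_gaps 0.1
```

Every reported line marks a place where the file's timestamps jump ahead without any samples to cover the jump.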

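At the moment I don't know whether FFmpeg offers a mechanism to fix or estimate the dropped frames in these files, but you can brute-force your way past them (that is, ignore the dropped frames) as follows: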

Mixing

ffmpeg -i RTb295d0534191e1acb22a45bb971a12e6.mka \
       -i RT103bfe5f4b129860f69cd8e820f3a10b.mka \
       -filter_complex "[1:a]asetpts=NB_CONSUMED_SAMPLES/SR/TB,adelay=13500s:all=1[apad]; \
                        [0:a]asetpts=NB_CONSUMED_SAMPLES/SR/TB[a0]; \
                        [a0][apad]amix=inputs=2:weights=1|1[aout]" \
       -map "[aout]" combined_audio.mka

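Concatenating

It looks like your filtergraph really just concatenates the two streams by delaying the second one by the duration of the first. You could do the following instead: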

ffmpeg -i RTb295d0534191e1acb22a45bb971a12e6.mka \
       -i RT103bfe5f4b129860f69cd8e820f3a10b.mka \
       -filter_complex "[0:a]asetpts=NB_CONSUMED_SAMPLES/SR/TB[a0]; \
                        [1:a]asetpts=NB_CONSUMED_SAMPLES/SR/TB[a1]; \
                        [a0][a1]concat=n=2:v=0:a=1[aout]" \
       -map "[aout]" combined_audio.mka

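Explanation

The asetpts filter is used here to completely disregard what the file says about when each frame should be fed to the filtergraph, and to recompute a new PTS for every frame from the following variables and formula:

  • NB_CONSUMED_SAMPLES: number of samples consumed so far, not including the current frame
  • SR: sample rate (samples/second)
  • TB: timebase (seconds)
  • NB_CONSUMED_SAMPLES/SR/TB: the new PTS (the frame's start time counted in timebase units)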
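As a worked example of the formula, the sketch below computes the PTS that would be assigned to a frame after a given number of 960-sample frames has been consumed. The helper name `new_pts` and the two timebases (1/48000 and 1/1000) are assumptions for illustration only; the timebase actually in effect inside the filtergraph depends on your FFmpeg build and graph.

```shell
#!/bin/sh
# new_pts FRAMES SAMPLES_PER_FRAME SAMPLE_RATE TB_NUM TB_DEN
# Prints "<pts> <seconds>" for the frame that follows FRAMES consumed frames,
# using pts = NB_CONSUMED_SAMPLES / SR / TB with TB = TB_NUM/TB_DEN.
new_pts() {
    awk -v n="$1" -v spf="$2" -v sr="$3" -v num="$4" -v den="$5" 'BEGIN {
        consumed = n * spf                    # NB_CONSUMED_SAMPLES
        pts      = consumed * den / (sr * num)
        seconds  = consumed / sr
        printf "%d %.3f\n", pts, seconds
    }'
}

new_pts 3 960 48000 1 48000   # prints: 2880 0.060
new_pts 3 960 48000 1 1000    # prints: 60 0.060
```

Because the result depends only on how many samples have passed through the filter, any gap recorded in the file's original timestamps has no effect on the new PTS.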

Video streams

If your video files have the same problem, you can likewise use the setpts filter:

setpts=N/FR/TB

Filling in the dropped frames

aresample can be used to fill the missing frames with zero-valued samples (silence). You can try it on each mka file and see what happens:

ffmpeg -i RTb295d0534191e1acb22a45bb971a12e6.mka \
       -af "aresample=async=1" \
       patched_audio.mka

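Note: this may turn the stream into one long (low-pitched?) buzzing/beeping sound, making it unlistenable. However, you may need to do this in order to sync the audio to video. The resampler can also stretch the surrounding samples, so that may be the right solution for you. See the documentation for the async, min_comp, min_hard_comp, comp_duration, and max_soft_comp options.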