FFMPEG - Multi Track, Multi Channel file to discrete mono files

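I have files with multiple audio tracks and multiple channels (i.e. track 1 might be 5.1, track 2 might be stereo, track 3 might be stereo, etc.).

I would like to output every channel of every track into its own 'unrolled' discrete mono file.

Sample media: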

ffprobe version 4.3.1-0york0~18.04 Copyright (c) 2007-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --prefix=/usr --extra-version='0york0~18.04' --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libzimg --enable-pocketsphinx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
[mxf @ 0x55d3e7fc2680] wrapping of stream 0 is unknown
[jpeg2000 @ 0x55d3e805ce00] End mismatch 1
    Last message repeated 1 times
Input #0, mxf, from 'redacted.mxf':
  Metadata:
    operational_pattern_ul: 060e2b34.04010101.0d010201.01010900
    modification_date: 2019-10-03T09:58:16.368000Z
    uid             : f6267ae2-680e-4357-9b1d-c77c045d3cd7
    generation_uid  : e7e6f5a1-6f15-4df5-aea8-a41f3ef535d6
    company_name    : redacted
    product_name    : redacted
    product_version : 11.6.1.5.301404
    product_uid     : 84ae5ffc-4710-11dd-a6fe-0010c629ec73
    application_platform: 4KICR1
    material_package_umid: 0x060A2B340101010501010D2013000000BE3608F3135E48AD99E4340643E47F22
    timecode        : 00:59:20:00
  Duration: 00:26:16.07, start: 0.000000, bitrate: 139194 kb/s
    Stream #0:0: Video: jpeg2000, yuv422p10le(progressive), 1920x1080, SAR 1:1 DAR 16:9, 23.98 tbr, 23.98 tbn, 23.98 tbc
    Metadata:
      file_package_umid: 0x060A2B340101010501010D201300000091A43E578B86490698045924FA9EECC5
      track_name      : Picture
    Stream #0:1: Audio: pcm_s24le, 48000 Hz, 6 channels, s32 (24 bit), 6912 kb/s
    Metadata:
      file_package_umid: 0x060A2B340101010501010D201300000091A43E578B86490698045924FA9EECC5
      track_name      : Sound
    Stream #0:2: Audio: pcm_s24le, 48000 Hz, 2 channels, s32 (24 bit), 2304 kb/s
    Metadata:
      file_package_umid: 0x060A2B340101010501010D201300000091A43E578B86490698045924FA9EECC5
      track_name      : Sound
    Stream #0:3: Audio: pcm_s24le, 48000 Hz, 2 channels, s32 (24 bit), 2304 kb/s
    Metadata:
      file_package_umid: 0x060A2B340101010501010D201300000091A43E578B86490698045924FA9EECC5
      track_name      : Sound
    Stream #0:4: Audio: pcm_s24le, 48000 Hz, 2 channels, s32 (24 bit), 2304 kb/s
    Metadata:
      file_package_umid: 0x060A2B340101010501010D201300000091A43E578B86490698045924FA9EECC5
      track_name      : Sound
    Stream #0:5: Data: none
    Metadata:
      file_package_umid: 0x060A2B340101010501010D201300000091A43E578B86490698045924FA9EECC5
      track_name      : Auxiliary Data
      data_type       : vbi_vanc_smpte_436M
Unsupported codec with id 0 for input stream 5

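These files are vendor-certified masters, and the track/channel combination varies by vendor: some may be stereo, 5.1, 7.1 in sequence, some may already be discrete mono, and some may be a mix of discrete stereo, 5.1 and mono. It's a mixed bag. So I'm looking for a general strategy that gracefully handles all channels from all tracks.

I've seen various strategies for discretizing audio in the ffmpeg docs, but none of them seem to show how to target different channels on different tracks. I'm sure this is a PEBKAC error, but I'd appreciate some guidance.

I've tried both the map_channel approach and the -filter_complex channelsplit approach.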

ffmpeg -i redacted.mxf -ss 60 \
-map_channel 0.1.0 -t 10 track_1_0.wav \
-map_channel 0.1.1 -t 10 track_1_1.wav \
-map_channel 0.1.2 -t 10 track_1_2.wav \
-map_channel 0.1.3 -t 10 track_1_3.wav \
-map_channel 0.1.4 -t 10 track_1_4.wav \
-map_channel 0.1.5 -t 10 track_1_5.wav \
-map_channel 0.2.0 -t 10 track_2_0.wav \
-map_channel 0.2.1 -t 10 track_2_1.wav \
-map_channel 0.3.0 -t 10 track_3_0.wav \
-map_channel 0.3.1 -t 10 track_3_1.wav \
-map_channel 0.4.0 -t 10 track_4_0.wav \
-map_channel 0.4.1 -t 10 track_4_1.wav 

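However, the output files are not all mono; some are tagged as 5.1. I don't believe they inherited a sane/correct channel layout (mono), but output files tagged as 5.1 are absurd, since they all come from stereo tracks, namely track_2_0.wav, track_2_1.wav, track_3_0.wav, track_3_1.wav, track_4_0.wav and track_4_1.wav. This seems very odd. Track 1_0 from the command above produces a sane mediainfo: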

File size                                : 938 KiB
Duration                                 : 10s 0ms
Overall bit rate mode                    : Constant
Overall bit rate                         : 768 Kbps
Writing application                      : Lavf58.45.100

Audio
Format                                   : PCM
Format settings                          : Little / Signed
Codec ID                                 : 1
Duration                                 : 10s 0ms
Bit rate mode                            : Constant
Bit rate                                 : 768 Kbps
Channel(s)                               : 1 channel
Sampling rate                            : 48.0 KHz
Bit depth                                : 16 bits
Stream size                              : 938 KiB (100%)

However, the second and third tracks have the wrong channel layout and an unexpected codec ID:

Format                                   : Wave
File size                                : 5.49 MiB
Duration                                 : 10s 0ms
Overall bit rate mode                    : Constant
Overall bit rate                         : 4 608 Kbps
Writing application                      : Lavf58.45.100

Audio
Format                                   : PCM
Format settings                          : Little / Signed
Codec ID                                 : 00000001-0000-0010-8000-00AA00389B71
Duration                                 : 10s 0ms
Bit rate mode                            : Constant
Bit rate                                 : 4 608 Kbps
Channel(s)                               : 6 channels
Channel layout                           : L R C LFE Lb Rb
Sampling rate                            : 48.0 KHz
Bit depth                                : 16 bits
Stream size                              : 5.49 MiB (100%)

Also, re: map_channel, there is some documentation that casts doubt on whether it is the right approach:

Note that currently each output stream can only contain channels from a single input stream; you can’t for example use "-map_channel" to pick multiple input audio channels contained in different streams (from the same or different files) and merge them into a single output stream. It is therefore not currently possible, for example, to turn two separate mono streams into a single stereo stream. However splitting a stereo stream into two single channel mono streams is possible.

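With the complex filter, the docs/bug tracker have an example of discretizing a 5.1 track and tagging the results as mono. I can target the tracks I want and get a valid filter chain, as reported in the debug log, but I only get audio from the first track: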

ffmpeg -y -v 40 -i redacted.mxf -ss 60 \
    -disposition:a default \
    -filter_complex \
    "[0:a:0]channelsplit=channel_layout=5.1[c1][c2][c3][c4][c5][c6],\
    [c1]aformat=channel_layouts=mono[c1],\
    [c2]aformat=channel_layouts=mono[c2],\
    [c3]aformat=channel_layouts=mono[c3],\
    [c4]aformat=channel_layouts=mono[c4],\
    [c5]aformat=channel_layouts=mono[c5],\
    [c6]aformat=channel_layouts=mono[c6],\
    [0:a:1]channelsplit=channel_layout=stereo[c7][c8],\
    [c7]aformat=channel_layouts=mono[c7],\
    [c8]aformat=channel_layouts=mono[c8],\
    [0:a:2]channelsplit=channel_layout=stereo[c9][c10],\
    [c9]aformat=channel_layouts=mono[c9],\
    [c10]aformat=channel_layouts=mono[c10],\
    [0:a:3]channelsplit=channel_layout=stereo[c11][c12],\
    [c11]aformat=channel_layouts=mono[c11],\
    [c12]aformat=channel_layouts=mono[c12]"\
     -map  "[c1]" -t 10 1.wav\
     -map  "[c2]" -t 10 2.wav\
     -map  "[c3]" -t 10 3.wav\
     -map  "[c4]" -t 10 4.wav\
     -map  "[c5]" -t 10 5.wav\
     -map  "[c6]" -t 10 6.wav\
     -map  "[c7]" -t 10 7.wav\
     -map  "[c8]" -t 10 8.wav\
     -map  "[c9]" -t 10 9.wav\
     -map  "[c10]" -t 10 10.wav\
     -map  "[c11]" -t 10 11.wav\
     -map  "[c12]" -t 10 12.wav

TL;DR

In short, how do I export every channel of every track as a discrete mono audio file (regardless of channel layout)?

Thanks!

You can't reuse a label across filter outputs. Use intermediate labels.

ffmpeg -y -v 40 -i redacted.mxf -ss 60 \
    -disposition:a default \
    -filter_complex \
    "[0:a:0]channelsplit=channel_layout=5.1[a1][a2][a3][a4][a5][a6],\
    [a1]aformat=channel_layouts=mono[c1],\
    [a2]aformat=channel_layouts=mono[c2],\
    [a3]aformat=channel_layouts=mono[c3],\
    [a4]aformat=channel_layouts=mono[c4],\
    [a5]aformat=channel_layouts=mono[c5],\
    [a6]aformat=channel_layouts=mono[c6],\
    [0:a:1]channelsplit=channel_layout=stereo[a7][a8],\
    [a7]aformat=channel_layouts=mono[c7],\
    [a8]aformat=channel_layouts=mono[c8],\
    [0:a:2]channelsplit=channel_layout=stereo[a9][a10],\
    [a9]aformat=channel_layouts=mono[c9],\
    [a10]aformat=channel_layouts=mono[c10],\
    [0:a:3]channelsplit=channel_layout=stereo[a11][a12],\
    [a11]aformat=channel_layouts=mono[c11],\
    [a12]aformat=channel_layouts=mono[c12]"\
     -map  "[c1]" -t 10 1.wav\
     -map  "[c2]" -t 10 2.wav\
     -map  "[c3]" -t 10 3.wav\
     -map  "[c4]" -t 10 4.wav\
     -map  "[c5]" -t 10 5.wav\
     -map  "[c6]" -t 10 6.wav\
     -map  "[c7]" -t 10 7.wav\
     -map  "[c8]" -t 10 8.wav\
     -map  "[c9]" -t 10 9.wav\
     -map  "[c10]" -t 10 10.wav\
     -map  "[c11]" -t 10 11.wav\
     -map  "[c12]" -t 10 12.wav
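
Since the track/channel mix varies from file to file, the same graph can be generated rather than typed by hand. The following is only a rough sketch under stated assumptions: build_graph, layout_for and split_all are hypothetical helper names, the channel-count-to-layout table is a guess that may not match every vendor's ordering, and the channel counts are read with ffprobe. It emits one channelsplit per audio stream with unique intermediate labels, as in the command above.

```shell
#!/bin/sh
# Sketch: generate the channelsplit/aformat graph for any mix of audio streams.

# layout_for N: print a layout name for a stream with N channels.
# ASSUMPTION: this table is a guess; extend or change it to suit the material.
layout_for() {
    case "$1" in
        1) echo mono ;;
        2) echo stereo ;;
        6) echo 5.1 ;;
        8) echo 7.1 ;;
        *) echo "unsupported channel count: $1" >&2; return 1 ;;
    esac
}

# build_graph COUNT...: one argument per audio stream, in stream order.
# Prints a filtergraph whose mono outputs are labelled [c1], [c2], ...
# Chains are separated with ';' here; labels [aN] are the intermediate ones.
build_graph() {
    graph="" stream=0 n=0
    for count in "$@"; do
        layout=$(layout_for "$count") || return 1
        split="[0:a:${stream}]channelsplit=channel_layout=${layout}"
        tagged=""
        ch=0
        while [ "$ch" -lt "$count" ]; do
            n=$((n + 1))
            split="${split}[a${n}]"
            tagged="${tagged};[a${n}]aformat=channel_layouts=mono[c${n}]"
            ch=$((ch + 1))
        done
        graph="${graph:+${graph};}${split}${tagged}"
        stream=$((stream + 1))
    done
    printf '%s\n' "$graph"
}

# split_all INPUT: probe the channel count of every audio stream, then write
# each channel to 1.wav, 2.wav, ... in the current directory.
split_all() {
    input=$1
    counts=$(ffprobe -v error -select_streams a \
        -show_entries stream=channels -of csv=p=0 "$input") || return 1
    # $counts is deliberately unquoted: one number per audio stream.
    graph=$(build_graph $counts) || return 1
    total=0
    for count in $counts; do total=$((total + count)); done
    set -- -y -i "$input" -filter_complex "$graph"
    i=1
    while [ "$i" -le "$total" ]; do
        set -- "$@" -map "[c${i}]" "${i}.wav"
        i=$((i + 1))
    done
    ffmpeg "$@"
}
```

After sourcing the file, split_all redacted.mxf would run the whole thing, and build_graph 6 2 2 2 prints the graph for the sample file so it can be inspected first. Note that WAV output defaults to 16-bit PCM (as the mediainfo above shows); adding -c:a pcm_s24le before each output name should keep the 24-bit depth of the source.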