Add cover art to an Ogg file containing an Opus audio stream with ffmpeg, without re-encoding the audio stream

I am trying to add cover art to an ogg file with ffmpeg.

These are my source.ogg and source.jpg files:

$ ffprobe -hide_banner source.ogg 
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
$ identify source.jpg 
source.jpg JPEG 480x360 480x360+0+0 8-bit DirectClass 15.1KB 0.000u 0:00.000

I tried:

$ ffmpeg -hide_banner -i source.ogg -i source.jpg -map 0 -map 1 -c:a copy -c copy -map_metadata 0 dest.ogg -y && echo && ffprobe -hide_banner dest.ogg 
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
Input #1, image2, from 'source.jpg':
  Duration: 00:00:00.04, start: 0.000000, bitrate: 3023 kb/s
    Stream #1:0: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 480x360 [SAR 1:1 DAR 4:3], 25 tbr, 25 tbn, 25 tbc
[ogg @ 0x5655578064c0] Unsupported codec id in stream 1
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #1:0 -> #0:1 (copy)
    Last message repeated 1 times
[ogg @ 0x5655577e8540] Format ogg detected only with low score of 1, misdetection possible!
dest.ogg: End of file

I also found a related page, but it does not explain how to do this with ffmpeg.

I read that the "METADATA_BLOCK_PICTURE" metadata in an ogg container may hold the picture in base64 format, so I tried this:

$ ffmpeg -hide_banner -i source.ogg -map 0 -c:a copy -c copy -metadata METADATA_BLOCK_PICTURE="$(base64 source.jpg)" dest.ogg
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
File 'dest.ogg' already exists. Overwrite ? [y/N] y
Output #0, ogg, to 'dest.ogg':
  Metadata:
    METADATA_BLOCK_PICTURE: /9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkz
                    : ODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2Nj
                    ..............................................................................
                    : nVmaS2E/urUWVbH6ORI9z2l8zyRfFpkLooIHSBuk9lFFoC6OBnP1SON8rEooqM2WOVHDdRRAAUVK
                    : KiiCWRRRRBJ//9k=
    encoder         : Lavf58.20.100
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
      METADATA_BLOCK_PICTURE: /9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkz
                      : ODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2Nj
                      : Y2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY//AABEIAWgB4AMBIgACEQED
                      ..............................................................................
                      : nVmaS2E/urUWVbH6ORI9z2l8zyRfFpkLooIHSBuk9lFFoC6OBnP1SON8rEooqM2WOVHDdRRAAUVK
                      : KiiCWRRRRBJ//9k=
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size=    1658kB time=00:03:02.41 bitrate=  74.5kbits/s speed=1.01e+03x    
video:0kB audio:1624kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.100392%

That sort of "worked", but neither ffplay nor mpv can parse the cover art:

$ ffplay -hide_banner dest.ogg
[ogg @ 0x5655577e8540] Failed to parse cover art block.
Input #0, ogg, from 'dest.ogg':
  Duration: 00:03:02.44, start: 0.000000, bitrate: 74 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
   3.95 M-A: -0.000 fd=   0 aq=   14KB vq=    0KB sq=    0B f=0/0    
$ mpv dest.ogg 
Playing: dest.ogg
[ffmpeg/demuxer] ogg: Failed to parse cover art block.
 (+) Audio --aid=1 (opus 2ch 48000Hz)
AO: [pulse] 48000Hz stereo 2ch float
A: 00:00:03 / 00:03:02 (2%)


Exiting... (Quit)

I also tried -metadata:s:a together with base64 --wrap 0 (which I had forgotten to specify, oops :)):

$ ffmpeg -i source.ogg -map 0 -c:a copy -c copy -metadata:s:a METADATA_BLOCK_PICTURE="$(base64 --wrap 0 source.jpg)" dest.ogg
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
File 'dest.ogg' already exists. Overwrite ? [y/N] y
Output #0, ogg, to 'dest.ogg':
  Metadata:
    encoder         : Lavf58.20.100
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
      METADATA_BLOCK_PICTURE: /9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY//AABEIAWgB4AMBIgACEQEDEQH/xAAaAAACAwEBAAAAAAAAAAA
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size=    1658kB time=00:03:02.41 bitrate=  74.5kbits/s speed=1.22e+03x    
video:0kB audio:1624kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.084397%

But the JPEG cover art in dest.ogg still cannot be read correctly:

$ ffprobe -hide_banner dest.ogg 
[ogg @ 0x5655577e8540] Invalid picture type: -2555936.
[ogg @ 0x5655577e8540] Could not read mimetype from an attached picture.
Input #0, ogg, from 'dest.ogg':
  Duration: 00:03:02.44, start: 0.000000, bitrate: 74 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
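
A note on the number in that log, offered as a sketch rather than a confirmed diagnosis: the base64 text above starts with /9j/4AAQ, which decodes to the bytes FF D8 FF E0, the start of the JPEG file. Read as a signed 32-bit big-endian integer, those bytes give exactly the reported picture type, which suggests the parser expects a structured picture header in front of the image data rather than a raw JPEG. The arithmetic can be checked in a shell with 64-bit arithmetic (bash is assumed here):

```shell
# First four bytes of source.jpg, taken from the base64 prefix "/9j/4AAQ".
first_four=$(( 0xFFD8FFE0 ))

# Reinterpret the unsigned value as a signed 32-bit integer.
if [ "$first_four" -ge $(( 0x80000000 )) ]; then
  picture_type=$(( first_four - 0x100000000 ))
else
  picture_type=$first_four
fi

echo "$picture_type"   # prints -2555936
```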

Can you help me?

This worked for me:

ffmpeg -i mysong.ogg -i coverart.jpg song_with_art.ogg

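FFmpeg version 4.4 automatically supports embedding album art into an Ogg container using the Theora video codec (see "Ogg codecs" on Wikipedia for the list of supported codecs, although FFmpeg may not support all of them).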

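This differs from MP3 files, which store the album art as a binary-encoded string in a special-purpose tag. That lets media players correctly detect the file as audio (for example via mpv's --audio-display option) and avoids redrawing frames during playback. FFmpeg does not do this when writing an Ogg file here, so it simply adds a regular video stream to the file. The frame rate of this video stream is set to 90000 (at least for JPEG), which causes a harmless warning.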

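At least mpv takes no performance hit from this; it simply redraws as fast as the screen refresh rate allows. Only a single frame is encoded in the video stream, which can be verified manually by running ffprobe -v error -select_streams v:0 -count_packets -show_entries stream=nb_read_packets -of csv=p=0 input.ogg, as described in another answer. If needed, the frame rate can be set manually to 1 with the -r:v 1 option. See the comments for further discussion.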
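
Putting these points together for the original question (keep the Opus audio as-is and only encode the cover), a command along the following lines may work. This is an untested sketch: the helper name add_cover and the DRY_RUN variable are made up for illustration, and it assumes an FFmpeg build with libtheora. The options themselves (-map, -c:a copy, -c:v libtheora, -q:v, -r:v) are the ones discussed in this answer.

```shell
# Hypothetical helper: add a cover image as a single-frame Theora stream
# while copying the existing audio stream without re-encoding.
# Usage: add_cover source.ogg source.jpg dest.ogg
# Set DRY_RUN=1 to print the command instead of running it.
add_cover() {
  # -map 0:a  take only the audio from the first input
  # -map 1:v  take only the video (the image) from the second input
  # -c:a copy keep the audio bitstream untouched
  # -r:v 1    lower the frame rate to avoid the "frame rate very high" warning
  set -- ffmpeg -hide_banner -i "$1" -i "$2" \
    -map 0:a -map 1:v \
    -c:a copy -c:v libtheora -q:v 10 -r:v 1 "$3"
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Running it with DRY_RUN=1 first shows the exact command line before anything is written.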

Below is an example that converts an MP3 file with a video track containing the album art into an Ogg file with Opus-encoded audio and Theora-encoded video:

$ ffprobe -hide_banner '01 - State of Grace.mp3' 
[mp3 @ 0x5594cbafe320] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '01 - State of Grace.mp3':
  Metadata:
    lyrics-eng      :  
    copyright       : š 2012 Big Machine Records, LLC.
    title           : State of Grace
    album_artist    : Taylor Swift
    album           : Red (Deluxe Version)
    date            : 2012
    track           : 01/22
    genre           : Country
    composer        : Taylor Swift
    disc            : 1/1
    comment         : Taylor Swift
  Duration: 00:04:55.81, start: 0.000000, bitrate: 321 kb/s
  Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
  Stream #0:1: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 600x600 [SAR 72:72 DAR 1:1], 90k tbr, 90k tbn, 90k tbc (attached pic)
    Metadata:
      title           : Cover
      comment         : Cover (front)
$ ffmpeg -hide_banner -i '01 - State of Grace.mp3' -c:a libopus -b:a 128000 -c:v libtheora -q:v 10 '01 - State of Grace.ogg'
[mp3 @ 0x55ebe6d3cc40] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '01 - State of Grace.mp3':
  Metadata:
    lyrics-eng      :  
    copyright       : š 2012 Big Machine Records, LLC.
    title           : State of Grace
    album_artist    : Taylor Swift
    album           : Red (Deluxe Version)
    date            : 2012
    track           : 01/22
    genre           : Country
    composer        : Taylor Swift
    disc            : 1/1
    comment         : Taylor Swift
  Duration: 00:04:55.81, start: 0.000000, bitrate: 321 kb/s
  Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
  Stream #0:1: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 600x600 [SAR 72:72 DAR 1:1], 90k tbr, 90k tbn, 90k tbc (attached pic)
    Metadata:
      title           : Cover
      comment         : Cover (front)
Stream mapping:
  Stream #0:1 -> #0:0 (mjpeg (native) -> theora (libtheora))
  Stream #0:0 -> #0:1 (mp3 (mp3float) -> opus (libopus))
Press [q] to stop, [?] for help
[swscaler @ 0x55ebe6db69e0] deprecated pixel format used, make sure you did set range correctly
[ogg @ 0x55ebe6d44c80] Frame rate very high for a muxer not efficiently supporting it.
Please consider specifying a lower framerate, a different muxer or -vsync 2
Output #0, ogg, to '01 - State of Grace.ogg':
  Metadata:
    lyrics-eng      :  
    copyright       : š 2012 Big Machine Records, LLC.
    title           : State of Grace
    album_artist    : Taylor Swift
    album           : Red (Deluxe Version)
    date            : 2012
    track           : 01/22
    genre           : Country
    composer        : Taylor Swift
    disc            : 1/1
    comment         : Taylor Swift
    encoder         : Lavf58.76.100
  Stream #0:0: Video: theora, yuv444p(tv, bt470bg/unknown/unknown, progressive), 600x600 [SAR 1:1 DAR 1:1], q=2-31, 200 kb/s, 90k fps, 90k tbn (attached pic)
    Metadata:
      title           : Cover
      DESCRIPTION     : Cover (front)
      encoder         : Lavc58.134.100 libtheora
      lyrics-eng      :  
      copyright       : š 2012 Big Machine Records, LLC.
      ALBUMARTIST     : Taylor Swift
      album           : Red (Deluxe Version)
      date            : 2012
      TRACKNUMBER     : 01/22
      genre           : Country
      composer        : Taylor Swift
      DISCNUMBER      : 1/1
  Stream #0:1: Audio: opus, 48000 Hz, stereo, flt, 128 kb/s
    Metadata:
      encoder         : Lavc58.134.100 libopus
      lyrics-eng      :  
      copyright       : š 2012 Big Machine Records, LLC.
      title           : State of Grace
      ALBUMARTIST     : Taylor Swift
      album           : Red (Deluxe Version)
      date            : 2012
      TRACKNUMBER     : 01/22
      genre           : Country
      composer        : Taylor Swift
      DISCNUMBER      : 1/1
      DESCRIPTION     : Taylor Swift
[mp3float @ 0x55ebe6d96360] Header missing time=00:04:31.63 bitrate=   0.1kbits/s speed=59.8x    64x    
Error while decoding stream #0:0: Invalid data found when processing input
frame=    1 fps=0.2 q=-0.0 Lsize=    4929kB time=00:04:55.79 bitrate= 136.5kbits/s speed=59.8x    
video:58kB audio:4830kB subtitle:0kB other streams:0kB global headers:3kB muxing overhead: 0.845459%
$ mpv '01 - State of Grace.ogg'
 (+) Video --vid=1 'Cover' (theora 600x600)
 (+) Audio --aid=1 'State of Grace' (opus 2ch 48000Hz)
AO: [alsa] 48000Hz stereo 2ch float
VO: [gpu] 600x600 yuv444p
(Paused) AV: -00:00:00 / 00:04:55 (0%)

Exiting... (Quit)
$ 

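Note that the -q:v 10 Theora video codec option is used for the highest possible video quality. Without this option the album art is encoded at very low quality by default, and the size difference at the highest quality is negligible because only a single frame is encoded.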

This requires FFmpeg to be built with libtheora (and with libopus for Opus-encoded audio). Here is the output of ffmpeg -codecs, with irrelevant codecs removed and the formatting improved:

$ ffmpeg -codecs
ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11.1.0
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64
  --docdir=/usr/share/doc/ffmpeg-4.4.1-r1/html --mandir=/usr/share/man
  --enable-shared --cc=x86_64-pc-linux-gnu-gcc
  --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar
  --nm=x86_64-pc-linux-gnu-nm --ranlib=x86_64-pc-linux-gnu-ranlib
  --pkg-config=x86_64-pc-linux-gnu-pkg-config --optflags='-O2 -pipe
  -march=native -ggdb3' --extra-libs= --enable-static --enable-avfilter
  --enable-avresample --disable-stripping --disable-optimizations
  --disable-libcelt --enable-nonfree --disable-indev=v4l2
  --disable-outdev=v4l2 --disable-indev=oss --disable-indev=jack
  --disable-indev=sndio --disable-outdev=oss --disable-outdev=sndio
  --enable-bzlib --enable-runtime-cpudetect --disable-debug
  --disable-gcrypt --enable-gnutls --disable-gmp --enable-gpl
  --disable-hardcoded-tables --enable-iconv --disable-libxml2 --enable-lzma
  --enable-network --disable-opencl --enable-openssl --enable-postproc
  --disable-libsmbclient --disable-ffplay --disable-sdl2 --disable-vaapi
  --disable-vdpau --disable-vulkan --enable-xlib --enable-libxcb
  --enable-libxcb-shm --enable-libxcb-xfixes --enable-zlib
  --disable-libcdio --disable-libiec61883 --disable-libdc1394
  --disable-libcaca --enable-openal --enable-opengl --disable-libv4l2
  --disable-libpulse --disable-libdrm --disable-libjack
  --disable-libopencore-amrwb --disable-libopencore-amrnb
  --disable-libcodec2 --enable-libdav1d --disable-libfdk-aac
  --disable-libopenjpeg --disable-libbluray --disable-libgme
  --disable-libgsm --disable-libaribb24 --disable-mmal --disable-libmodplug
  --enable-libopus --disable-libilbc --disable-librtmp --disable-libssh
  --disable-libspeex --disable-libsrt --disable-librsvg --disable-ffnvcodec
  --disable-libvorbis --disable-libvpx --disable-libzvbi --disable-appkit
  --disable-libbs2b --disable-chromaprint --disable-cuda-llvm
  --disable-libflite --disable-frei0r --disable-libfribidi
  --enable-fontconfig --disable-ladspa --disable-libass
  --disable-libtesseract --disable-lv2 --disable-libfreetype
  --disable-libvidstab --disable-librubberband --disable-libzmq
  --disable-libzimg --disable-libsoxr --enable-pthreads
  --disable-libvo-amrwbenc --disable-libmp3lame --disable-libkvazaar
  --enable-libaom --disable-libopenh264 --disable-librav1e
  --disable-libsnappy --enable-libtheora --disable-libtwolame
  --disable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid
  --disable-gnutls --disable-armv5te --disable-armv6 --disable-armv6t2
  --disable-neon --disable-vfp --disable-vfpv3 --disable-armv8
  --disable-mipsdsp --disable-mipsdspr2 --disable-mipsfpu --disable-altivec
  --disable-vsx --disable-power8 --disable-amd3dnow --disable-amd3dnowext
  --disable-aesni --disable-avx --disable-avx2 --disable-fma3
  --disable-fma4 --disable-sse3 --disable-ssse3 --disable-sse4
  --disable-sse42 --disable-xop --cpu=host --disable-doc
  --disable-htmlpages --enable-manpages
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Codecs:
 D..... = Decoding supported
 .E.... = Encoding supported
 ..V... = Video codec
 ..A... = Audio codec
 ..S... = Subtitle codec
 ...I.. = Intra frame-only codec
 ....L. = Lossy compression
 .....S = Lossless compression
 -------
 [...]
 DEV.L. theora               Theora (encoders: libtheora )
 [...]
 DEAIL. opus                 Opus (Opus Interactive Audio Codec)
                             (decoders: opus libopus ) (encoders: opus libopus )
 [...]
$ 
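
Instead of reading the full listing, the check can be scripted. The following is a small sketch; the helper name has_encoder is hypothetical, and it assumes that each line of ffmpeg -encoders output has the form "flags name description", which may differ between versions.

```shell
# Hypothetical helper: succeed if the encoder named in $1 appears in
# `ffmpeg -encoders` output supplied on stdin.
has_encoder() {
  # Match: optional leading spaces, a flags column, then the encoder name
  # as the second whitespace-separated field.
  grep -Eq "^ *[^ ]+ +$1 "
}

# Example use (requires ffmpeg on PATH, so it is left commented out):
# ffmpeg -hide_banner -encoders | has_encoder libtheora && echo "libtheora available"
# ffmpeg -hide_banner -encoders | has_encoder libopus   && echo "libopus available"
```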

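FFmpeg can also add album art (or any video track) from a separate file, rather than mapping the original album art directly to the output. Here is an example that extracts the original MJPEG album art to a separate file, then passes it back in and uses the -map option to take only the audio track from the MP3 and only the video track from the MJPEG (I removed most of the command output, since it is essentially the same):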

$ ffmpeg -i '01 - State of Grace.mp3' -map 0:v -c:v copy '01 - State of Grace.jpg'
[...]
$ ffmpeg -i '01 - State of Grace.mp3' -i '01 - State of Grace.jpg' -map 0:a -map 1:v '01 - State of Grace.ogg'
[...]
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (mp3float) -> flac (native))
Stream #1:0 -> #0:1 (mjpeg (native) -> theora (libtheora))
[...]

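I also omitted the audio and video codecs and their options (which I do not recommend), so FFmpeg used FLAC as the default audio codec and Theora as the default video codec for the Ogg container.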
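
As for the tag-based route tried in the question: as far as I understand it, METADATA_BLOCK_PICTURE is expected to hold the base64 of a FLAC-style picture block (picture type, MIME type, description, dimensions, then the image data, all with 32-bit big-endian length fields), not the base64 of the bare JPEG. The sketch below builds such a value. It is untested against this FFmpeg version; the function names be32 and picture_block are made up for illustration, the 480x360 size comes from the identify output in the question, and the colour depth of 24 is an assumption.

```shell
# Write a 32-bit big-endian integer as four raw bytes.
be32() {
  printf "$(printf '\\%03o\\%03o\\%03o\\%03o' \
    $(( ($1 >> 24) & 255 )) $(( ($1 >> 16) & 255 )) \
    $(( ($1 >> 8) & 255 ))  $(( $1 & 255 )))"
}

# Usage: picture_block IMAGE MIME WIDTH HEIGHT DEPTH
# Prints the base64-encoded picture block on a single line.
picture_block() {
  {
    be32 3                     # picture type 3 = front cover
    be32 "${#2}"               # MIME type length
    printf '%s' "$2"           # MIME type, e.g. image/jpeg
    be32 0                     # description length (empty description)
    be32 "$3"                  # width in pixels
    be32 "$4"                  # height in pixels
    be32 "$5"                  # colour depth in bits per pixel
    be32 0                     # number of colours (0 for non-indexed images)
    be32 "$(wc -c < "$1")"     # image data length in bytes
    cat "$1"                   # the image data itself
  } | base64 | tr -d '\n'
}

# Possible use with the command from the question (not run here):
# ffmpeg -i source.ogg -map 0 -c:a copy \
#   -metadata:s:a METADATA_BLOCK_PICTURE="$(picture_block source.jpg image/jpeg 480 360 24)" \
#   dest.ogg
```

Because the whole block is passed as one command-line argument, this is only practical for small images such as the 15 KB cover in the question.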

Hope this helps!