Converting raw audio to ogg with gstreamer

The following pipeline produces a 3 kB .ogg file (which I assume is just an empty container):

gst-launch-1.0 --gst-debug=3 filesrc location=test.raw \
 ! 'audio/x-raw, format=S16LE, channels=1, rate=32000' \
 ! audioconvert \
 ! vorbisenc \
 ! oggmux \
 ! filesink location=test.ogg

The debug output is as follows:

Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
0:00:00.048490941   813 0x556bf3625000 FIXME               basesink gstbasesink.c:3077:gst_base_sink_default_event:<filesink0> stream-start event without group-id. Consider implementing group-id handling in the upstream elements
0:00:00.048541997   813 0x556bf3625000 WARN            audioencoder gstaudioencoder.c:985:gst_audio_encoder_finish_frame:<vorbisenc0> Can't copy metadata because input buffer disappeared
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
0:00:00.139954729   813 0x556bf3625000 WARN                 basesrc gstbasesrc.c:2400:gst_base_src_update_length:<filesrc0> processing at or past EOS
Got EOS from element "pipeline0".
Execution ended after 0:00:00.091883401
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...
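
One way to judge whether a 3 kB result is plausible is to compare it with how much audio the input file holds. The sketch below assumes the caps given in the pipeline above (S16LE, so 2 bytes per sample, 1 channel, 32000 Hz, which works out to 64000 bytes per second). `raw_duration_ms` is a hypothetical helper name, not part of GStreamer.

```shell
# Hypothetical helper: estimate the playing time of a headerless PCM file.
# usage: raw_duration_ms FILE RATE CHANNELS BYTES_PER_SAMPLE
raw_duration_ms() {
    # wc -c pads its output with spaces on some systems, so strip them
    size=$(wc -c < "$1" | tr -d ' ')
    bytes_per_second=$(( $2 * $3 * $4 ))
    # integer milliseconds, rounded down
    echo $(( size * 1000 / bytes_per_second ))
}

# Caps from the question: 32000 Hz, mono, S16LE (2 bytes per sample)
# raw_duration_ms test.raw 32000 1 2
```

If this reports several seconds of audio, an output of only about 3 kB would suggest that little beyond the stream headers was written.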

When I add a WAV encode/decode step, I get a good .ogg file:

gst-launch-1.0 --gst-debug=3 filesrc location=test.raw \
 ! 'audio/x-raw, format=S16LE, channels=1, rate=32000' \
 ! audioconvert \
 ! wavenc \
 ! wavparse \
 ! audioconvert \
 ! vorbisenc \
 ! oggmux \
 ! filesink location=test.ogg

Debug output:

Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
0:00:00.135676651   822 0x562b3cd64770 FIXME               basesink gstbasesink.c:3077:gst_base_sink_default_event:<filesink0> stream-start event without group-id. Consider implementing group-id handling in the upstream elements
0:00:00.135718946   822 0x562b3cd64770 WARN            audioencoder gstaudioencoder.c:985:gst_audio_encoder_finish_frame:<vorbisenc0> Can't copy metadata because input buffer disappeared
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
0:00:00.219188746   822 0x562b3cd64770 WARN                  wavenc gstwavenc.c:795:gst_wavenc_write_toc:<wavenc0> have no toc
Got EOS from element "pipeline0".
Execution ended after 0:00:00.083921991
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...

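So my question is: what does wavenc ! wavparse in the second pipeline provide that is missing from the first, and is there a more direct way to specify it, or is the second form actually the 'right' way to do it?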

The first pipeline is fine, as it works with audiotestsrc (audio/x-raw-int). I assume your uncompressed audio file must be an uncompressed WAV file.

https://en.wikipedia.org/wiki/List_of_codecs#Audio_compression_formats

Wavenc is probably preprocessing the LPCM and converting it into something vorbisenc can use. I suspect vorbisenc needs a data width of 32 or 64, which could be the showstopper.

PCM signed 16-bit little-endian (S16LE) >
audioconvert - converts audio to different formats (in: audio/x-raw-int, out: audio/x-raw-int)
wavenc - encodes raw audio into WAV (in: audio/x-raw-int, out: audio/x-wav)
wavparse - parses a .wav file into raw audio (in: audio/x-wav, out: audio/x-raw-float, width: { 32, 64 })
vorbisenc - encodes audio in Vorbis format (in: audio/x-raw-float, out: audio/x-vorbis)
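
Rather than guessing which formats an element accepts, the sink pad template can be read from the inspect tool. This is a rough sketch: `sink_caps` is a hypothetical helper, the `GST_INSPECT` override is an illustrative name, and the awk filter assumes the usual `SINK template:` layout of the inspect output, which may differ between versions. Note that the caps names listed above (`audio/x-raw-int`, `audio/x-raw-float`) are the GStreamer 0.10 spellings, so a 1.x install will report different names.

```shell
# Hypothetical helper: print the sink pad template(s) of an element.
# usage: sink_caps ELEMENT
# GST_INSPECT can be overridden, e.g. to gst-inspect-0.10 on old installs.
sink_caps() {
    "${GST_INSPECT:-gst-inspect-1.0}" "$1" |
        awk '/SINK template:/ { show = 1; print; next }  # start of a sink pad section
             /template:/ || /^[^ ]/ { show = 0 }         # next pad or next top-level section
             show'
}

# sink_caps vorbisenc
```

Comparing the result for `vorbisenc` with the caps placed before it in the pipeline should show whether a format conversion is needed.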

gst-launch audiotestsrc num-buffers=50 \
! vorbisenc \
! oggmux \
! filesink location=test.ogg

play test.ogg

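Addendum: I downloaded your file and confirmed that you are hitting an unimplemented stream conversion from 16-bit to 32-bit. Vorbisenc only accepts a 32-bit width. To answer your original question: no, you do not need the WAV parsing. Here is the efficient pipeline you are looking for, streamlined for the width conversion.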

gst-launch --gst-debug=2 filesrc location=test.raw \
! audio/x-raw-int, width=16, channels=2, depth=16, rate=16000, endianness=1234, signed=true \
! audioconvert \
! audio/x-raw-float, width=32, channels=2, rate=16000, endianness=1234, signed=true \
! vorbisenc \
! oggmux \
! filesink location=test.ogg
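
The caps strings above use GStreamer 0.10 names, while the question runs `gst-launch-1.0`, where `audio/x-raw-int` and `audio/x-raw-float` were merged into `audio/x-raw` with a `format` field. For 1.x, one possible sketch follows. It swaps in a different technique from the pipeline above: instead of a capsfilter it uses `rawaudioparse` to describe the headerless input, and leaves `audioconvert` to negotiate the sample format `vorbisenc` wants. This assumes a GStreamer release that ships `rawaudioparse`, and it has not been run against the poster's file. `raw_to_ogg` and the `GST_LAUNCH` override are illustrative names.

```shell
# Sketch only: encode headerless S16LE PCM to Ogg/Vorbis with GStreamer 1.x.
# usage: raw_to_ogg IN.raw OUT.ogg RATE CHANNELS
# GST_LAUNCH is a hypothetical override so the command can be dry-run
# (GST_LAUNCH=echo prints the pipeline instead of running it).
raw_to_ogg() {
    "${GST_LAUNCH:-gst-launch-1.0}" \
        filesrc location="$1" \
        ! rawaudioparse use-sink-caps=false format=pcm pcm-format=s16le \
              sample-rate="$3" num-channels="$4" \
        ! audioconvert \
        ! vorbisenc \
        ! oggmux \
        ! filesink location="$2"
}

# Caps from the question:  raw_to_ogg test.raw test.ogg 32000 1
# Caps from the answer:    raw_to_ogg test.raw test.ogg 16000 2
```

The rate and channel count are parameters because the question and the answer state different values (32000 Hz mono versus 16000 Hz stereo); use whichever matches the actual file.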