Converting raw audio to ogg with gstreamer

Question

The following pipeline produces a 3 kB .ogg file (I assume it is just an empty container):

gst-launch-1.0 --gst-debug=3 filesrc location=test.raw \
 ! 'audio/x-raw, format=S16LE, channels=1, rate=32000' \
 ! audioconvert \
 ! vorbisenc \
 ! oggmux \
 ! filesink location=test.ogg

The debug output is:

Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
0:00:00.048490941   813 0x556bf3625000 FIXME               basesink gstbasesink.c:3077:gst_base_sink_default_event:<filesink0> stream-start event without group-id. Consider implementing group-id handling in the upstream elements
0:00:00.048541997   813 0x556bf3625000 WARN            audioencoder gstaudioencoder.c:985:gst_audio_encoder_finish_frame:<vorbisenc0> Can't copy metadata because input buffer disappeared
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
0:00:00.139954729   813 0x556bf3625000 WARN                 basesrc gstbasesrc.c:2400:gst_base_src_update_length:<filesrc0> processing at or past EOS
Got EOS from element "pipeline0".
Execution ended after 0:00:00.091883401
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...
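
If the 3 kB output really is an empty container, it helps to know how much audio the input should contain in the first place. Here is a minimal sketch that estimates the playing time of a headerless PCM file from its size. It assumes the caps in the pipeline above are correct (S16LE means 2 bytes per sample); raw_duration_ms is a made-up helper name.

```shell
# Sketch: estimate the playing time of a headerless PCM file from its size.
# raw_duration_ms is an illustrative helper, not a GStreamer tool.
# Arguments: file, sample rate (Hz), channel count, bytes per sample.
raw_duration_ms() {
    local file="$1" rate="$2" channels="$3" bytes_per_sample="$4"
    local size
    size=$(wc -c < "$file")
    # bytes / (bytes per second), scaled to milliseconds with integer math
    echo $(( size * 1000 / (rate * channels * bytes_per_sample) ))
}
```

For the file in the question this would be called as `raw_duration_ms test.raw 32000 1 2`. If the result is several seconds but the .ogg holds almost nothing, the audio is being lost inside the pipeline rather than being absent from the input.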

When I add this wav encode/decode step, I get a good .ogg file:

gst-launch-1.0 --gst-debug=3 filesrc location=test.raw \
 ! 'audio/x-raw, format=S16LE, channels=1, rate=32000' \
 ! audioconvert \
 ! wavenc \
 ! wavparse \
 ! audioconvert \
 ! vorbisenc \
 ! oggmux \
 ! filesink location=test.ogg

Debug output:

Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
0:00:00.135676651   822 0x562b3cd64770 FIXME               basesink gstbasesink.c:3077:gst_base_sink_default_event:<filesink0> stream-start event without group-id. Consider implementing group-id handling in the upstream elements
0:00:00.135718946   822 0x562b3cd64770 WARN            audioencoder gstaudioencoder.c:985:gst_audio_encoder_finish_frame:<vorbisenc0> Can't copy metadata because input buffer disappeared
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
0:00:00.219188746   822 0x562b3cd64770 WARN                  wavenc gstwavenc.c:795:gst_wavenc_write_toc:<wavenc0> have no toc
Got EOS from element "pipeline0".
Execution ended after 0:00:00.083921991
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...

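So my question is: what does wavenc ! wavparse in the second pipeline provide that the first one lacks? Is there a more direct way to specify it, or is the second form actually the 'right' way to do this?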

Answer 1

The first pipeline is fine in itself, since it works with audiotestsrc (audio/x-raw-int). I assume your uncompressed audio file must be an uncompressed WAV file.

https://en.wikipedia.org/wiki/List_of_codecs#Audio_compression_formats

Wavenc is probably preprocessing the LPCM and converting it into something vorbisenc can use. I suspect vorbisenc needs a data width of 32 or 64, which may be the showstopper.

PCM signed 16-bit little-endian (S16LE) >
audioconvert - converts audio to different formats (in: audio/x-raw-int, out: audio/x-raw-int)
wavenc - encodes raw audio into WAV (in: audio/x-raw-int, out: audio/x-wav)
wavparse - parses a .wav file into raw audio (in: audio/x-wav, out: audio/x-raw-float, width: { 32, 64 })
vorbisenc - encodes audio in Vorbis format (in: audio/x-raw-float, out: audio/x-vorbis)
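
The caps names in this list (audio/x-raw-int, audio/x-raw-float) are the GStreamer 0.10 ones. GStreamer 1.0 uses a single audio/x-raw type with a format field, as in the caps from the question. Rather than trusting a list, you can ask the installed version what an element accepts. A small sketch follows; sink_caps and the GST_INSPECT override are made-up names for this illustration, and the number of context lines is an arbitrary choice.

```shell
# Sketch: print the sink pad template of an element to see which formats it accepts.
# sink_caps and GST_INSPECT are illustrative names, not part of GStreamer.
sink_caps() {
    # gst-inspect-1.0 <element> prints the element details, including its pad templates.
    # 12 lines of context after the match is an arbitrary cut-off; adjust as needed.
    "${GST_INSPECT:-gst-inspect-1.0}" "$1" | grep -A 12 'SINK template'
}
```

Running `sink_caps vorbisenc` and `sink_caps wavenc` on your own machine shows the input formats your build actually supports.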

gst-launch audiotestsrc num-buffers=50 \
! vorbisenc \
! oggmux \
! filesink location=test.ogg

play test.ogg

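Addendum: I downloaded your file and confirmed that you are hitting an unimplemented stream conversion from 16-bit to 32-bit. Vorbisenc only accepts a 32-bit width. To answer your original question: no, you do not need wavparse. Here is the efficient pipeline you are looking for, streamlined for the width conversion.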

gst-launch --gst-debug=2 filesrc location=test.raw \
! audio/x-raw-int, width=16, channels=2, depth=16, rate=16000, endianness=1234, signed=true \
! audioconvert \
! audio/x-raw-float, width=32, channels=2, rate=16000, endianness=1234, signed=true \
! vorbisenc \
! oggmux \
! filesink location=test.ogg
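
The pipeline above uses GStreamer 0.10 caps names, while the question runs gst-launch-1.0. As a sketch only, one way to feed headerless PCM into a 1.0 pipeline is the rawaudioparse element from gst-plugins-bad, which lets you declare the sample format, rate and channel count of the byte stream directly. The raw_to_ogg function, its argument order and the GST_LAUNCH override are made up for this illustration. The default rate and channel count come from the question's caps, not from the inspected file, and this has not been run against the asker's test.raw.

```shell
# Sketch: encode headerless S16LE PCM to Ogg Vorbis with GStreamer 1.0.
# raw_to_ogg and GST_LAUNCH are illustrative names, not part of GStreamer.
# Assumes the rawaudioparse element (gst-plugins-bad) is installed.
raw_to_ogg() {
    local src="$1"
    local dst="$2"
    local rate="${3:-32000}"     # sample rate taken from the question's caps
    local channels="${4:-1}"     # channel count taken from the question's caps
    # Set GST_LAUNCH=echo to print the command line instead of running it.
    "${GST_LAUNCH:-gst-launch-1.0}" \
        filesrc location="$src" \
        ! rawaudioparse use-sink-caps=false format=pcm pcm-format=s16le \
            sample-rate="$rate" num-channels="$channels" \
        ! audioconvert \
        ! vorbisenc \
        ! oggmux \
        ! filesink location="$dst"
}
```

Usage would be `raw_to_ogg test.raw test.ogg` for the question's settings, or `raw_to_ogg test.raw test.ogg 16000 2` for the rate and channel count used in the pipeline above. audioconvert is kept so that the sample format can still be negotiated with vorbisenc.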

