How can I convert audio to WAVE_FORMAT_PCM using FFmpeg?
I'm using Python's wave module to read audio, and FFmpeg to convert audio from other formats to WAV. However, I've run into a problem.
I wrote v.py to generate a silent audio file, a.wav:
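import wave
import numpy as np

# 44100 zero-valued 16-bit samples (silence)
wave_data = np.zeros(44100).astype(np.short)
f = wave.open('a.wav', 'wb')
f.setnchannels(1)      # mono
f.setsampwidth(2)      # 2 bytes per sample (16-bit)
f.setframerate(96000)  # 96 kHz
f.writeframes(wave_data.tobytes())  # tobytes() replaces the deprecated tostring()
f.close()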
Then I used FFmpeg to "copy" a.wav to b.wav (although it appears to decode and re-encode the file), but Python can only read a.wav; b.wav cannot be opened.
[user@localhost tmp]$ ffmpeg -i a.wav b.wav
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'a.wav':
Duration: 00:00:00.46, bitrate: 1536 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 96000 Hz, mono, s16, 1536 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'b.wav':
Metadata:
ISFT : Lavf57.71.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 96000 Hz, mono, s16, 1536 kb/s
Metadata:
encoder : Lavc57.89.100 pcm_s16le
size= 86kB time=00:00:00.45 bitrate=1537.8kbits/s speed= 706x
video:0kB audio:86kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.115646%
[user@localhost tmp]$ python3
Python 3.6.4 (default, Jan 23 2018, 22:25:37)
[GCC 7.2.1 20170915 (Red Hat 7.2.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import wave
>>> wave.open('a.wav')
<wave.Wave_read object at 0x7efea1c5e550>
>>> wave.open('b.wav')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python3.6/wave.py", line 499, in open
return Wave_read(f)
File "/usr/lib64/python3.6/wave.py", line 163, in __init__
self.initfp(f)
File "/usr/lib64/python3.6/wave.py", line 143, in initfp
self._read_fmt_chunk(chunk)
File "/usr/lib64/python3.6/wave.py", line 260, in _read_fmt_chunk
raise Error('unknown format: %r' % (wFormatTag,))
wave.Error: unknown format: 65534
>>>
How should I change the FFmpeg command so that the file is converted to WAVE_FORMAT_PCM and I can read b.wav with Python?
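The issue is that the wave module in this version of Python reads only WAVE_FORMAT_PCM (format tag 1), while FFmpeg's WAV muxer writes a WAVE_FORMAT_EXTENSIBLE header (format tag 65534, i.e. 0xFFFE) once the sample rate exceeds 48 kHz. The sample rate alone is not the obstacle, since the 96 kHz a.wav opens fine. The MP3 intermediary route works because FFmpeg automatically reduces the sample rate to 48 kHz in that case. scipy can reportedly import files above 48 kHz.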
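To check which kind of header a given file carries, the format tag can be read straight from its 'fmt ' chunk. The following is a minimal sketch, not part of the wave module: read_format_tag is a hypothetical helper name, and it assumes a well-formed RIFF/WAVE file. The value 65534 in the traceback is 0xFFFE, the tag for WAVE_FORMAT_EXTENSIBLE; 1 is WAVE_FORMAT_PCM.

```python
import struct

WAVE_FORMAT_PCM = 0x0001
WAVE_FORMAT_EXTENSIBLE = 0xFFFE  # 65534, the value reported in the traceback


def read_format_tag(path):
    """Return the wFormatTag stored in the 'fmt ' chunk of a RIFF/WAVE file.

    Hypothetical helper for illustration; it does no validation beyond
    the RIFF/WAVE signature.
    """
    with open(path, 'rb') as f:
        riff, _size, wave_id = struct.unpack('<4sI4s', f.read(12))
        if riff != b'RIFF' or wave_id != b'WAVE':
            raise ValueError('not a RIFF/WAVE file')
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, chunk_size = struct.unpack('<4sI', header)
            if chunk_id == b'fmt ':
                # wFormatTag is the first 16-bit field of the chunk body.
                (tag,) = struct.unpack('<H', f.read(2))
                return tag
            # Skip other chunks; chunk bodies are padded to an even length.
            f.seek(chunk_size + (chunk_size & 1), 1)
```

Calling read_format_tag('a.wav') and read_format_tag('b.wav') on the two files from the question would show whether they differ in format tag.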
The syntax for manually downsampling to 48 kHz with ffmpeg is
ffmpeg -i in -ar 48000 out.wav
P.S. To skip decoding/encoding, use ffmpeg -i in.wav -c copy out.wav.
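The downsampling command can also be driven from Python. This is only a sketch under stated assumptions: it needs an ffmpeg executable on PATH, build_ffmpeg_command and convert_to_48k are hypothetical helper names, and the -y flag (overwrite the output without asking) is an addition that is not in the command given in the answer.

```python
import subprocess
import wave


def build_ffmpeg_command(src, dst, rate=48000):
    """Build the argument list for resampling src to dst at `rate` Hz."""
    # -y overwrites dst if it already exists; -ar sets the output sample rate.
    return ['ffmpeg', '-y', '-i', src, '-ar', str(rate), dst]


def convert_to_48k(src, dst):
    """Resample src with ffmpeg, then open dst with the wave module.

    Returns (channels, sample width in bytes, frame rate) of the result.
    Raises subprocess.CalledProcessError if ffmpeg exits with an error.
    """
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    with wave.open(dst, 'rb') as w:
        return w.getnchannels(), w.getsampwidth(), w.getframerate()
```

For the files in the question, the call would be convert_to_48k('a.wav', 'b.wav'). Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.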