Using os.system() to convert an audio file's sample rate

Question

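I've started working on an NLP project, and as a first step I need to downsample my audio files. I found a script that does this automatically, and although I can use it to lower the sample rate, I'm struggling to understand how it works.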

import os


def convert_audio(audio_path, target_path, remove=False):
    """This function sets the audio `audio_path` to:
        - 16000Hz Sampling rate
        - one audio channel ( mono )
            Params:
                audio_path (str): the path of audio wav file you want to convert
                target_path (str): target path to save your new converted wav file
                remove (bool): whether to remove the old file after converting
        Note that this function requires ffmpeg installed in your system."""

    os.system(f"ffmpeg -i {audio_path} -ac 1 -ar 16000 {target_path}")
    # os.system(f"ffmpeg -i {audio_path} -ac 1 {target_path}")
    if remove:
        os.remove(audio_path)

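This is the code that is giving me trouble. I don't understand how the fourth line from the end (the os.system call) works; I believe it is the line that resamples the audio file.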

The repo it comes from: https://github.com/x4nth055/pythoncode-tutorials/

If anyone knows how this works, I would love to know, and also whether there is a better way to downsample audio files! Thanks

Answer 1

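Have you used ffmpeg before? Its documentation describes these options clearly (though understanding them may take some audio background):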

-ac[:stream_specifier] channels (input/output,per-stream) Set the number of audio channels. For output streams it is set by default to the number of input audio channels. For input streams this option only makes sense for audio grabbing devices and raw demuxers and is mapped to the corresponding demuxer options.

-ar[:stream_specifier] freq (input/output,per-stream) Set the audio sampling frequency. For output streams it is set by default to the frequency of the corresponding input stream. For input streams this option only makes sense for audio grabbing devices and raw demuxers and is mapped to the corresponding demuxer options.
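To make the mapping between the script and those two options concrete, here is a minimal sketch that builds the same command as a list of arguments. The helper name `build_ffmpeg_command` and the example file names are made up for illustration; only `-i`, `-ac` and `-ar` come from the original script. `shlex.join` needs Python 3.8 or newer.

```python
import shlex


def build_ffmpeg_command(audio_path, target_path, channels=1, sample_rate=16000):
    """Hypothetical helper: return the ffmpeg arguments as a list."""
    return [
        "ffmpeg",
        "-i", audio_path,         # input file
        "-ac", str(channels),     # output option: number of audio channels (1 = mono)
        "-ar", str(sample_rate),  # output option: sampling frequency in Hz
        target_path,              # output file
    ]


cmd = build_ffmpeg_command("my file.wav", "out.wav")
# shlex.join quotes arguments that need it, so the result can be pasted into a POSIX shell
print(shlex.join(cmd))
```

Because `-ac` and `-ar` appear after the input and before the output file name, ffmpeg applies them to the output, which is what makes the result mono at 16000 Hz.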

The explanation of os.system:

Execute the command (a string) in a subshell...on Windows, the return value is that returned by the system shell after running command. The shell is given by the Windows environment variable COMSPEC: it is usually cmd.exe, which returns the exit status of the command run; on systems using a non-native shell, consult your shell documentation.
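In practice this means the script never checks whether ffmpeg actually succeeded. A small sketch of looking at the return value, assuming only that a status of 0 means success (the exact encoding of non-zero values differs between platforms); `command_succeeded` is a hypothetical helper name:

```python
import os


def command_succeeded(command):
    """Hypothetical helper: run `command` through os.system and report success."""
    status = os.system(command)
    # 0 conventionally means success; how non-zero values are encoded
    # depends on the platform, so only the zero/non-zero distinction is used here
    return status == 0
```

For example, `command_succeeded("exit 0")` should be true and `command_succeeded("exit 1")` false in a typical shell.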

To understand it better, I suggest printing the command:

cmd_str = f"ffmpeg -i {audio_path} -ac 1 -ar 16000 {target_path}"
print(cmd_str) # then you can copy paste to cmd/bash and run
os.system(cmd_str)
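As for a better way: one option, sketched below under the assumption that ffmpeg is on your PATH, is to call it through `subprocess.run` with an argument list instead of a shell string. That way paths containing spaces do not need quoting, and a failed conversion raises an exception instead of passing silently. The function name `convert_audio_checked` and the `ffmpeg_bin` parameter are hypothetical, not part of the original script.

```python
import shutil
import subprocess


def convert_audio_checked(audio_path, target_path, sample_rate=16000, ffmpeg_bin="ffmpeg"):
    """Hypothetical variant of convert_audio that does not go through a shell."""
    if shutil.which(ffmpeg_bin) is None:
        raise RuntimeError(f"{ffmpeg_bin} was not found on PATH")
    cmd = [ffmpeg_bin, "-i", audio_path, "-ac", "1", "-ar", str(sample_rate), target_path]
    # check=True raises subprocess.CalledProcessError if ffmpeg exits with a non-zero status
    subprocess.run(cmd, check=True)
    return cmd
```

Usage would look like `convert_audio_checked("input.wav", "output_16k.wav")`, where both file names are placeholders.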

python

audio

operating-system

nlp