Set "sample rate" attribute on a file Transform using PySoX?

I am using PySoX to convert an audio file:

import sox
tfm = sox.Transformer()
tfm.build('./abc/1.raw', './abc/2.flac')

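This is the error I get: "sox.core.SoxError: Stdout: Stderr: sox FAIL formats: bad input format for file `./abc/1.raw': sampling rate was not specified"

How do I build the function so that it includes the sample rate and completes the conversion?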

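The reason is that raw audio files carry no information about the format of the audio they contain, so you have to supply that information yourself. The sample rate is only one of these characteristics, so you will need to specify some other parameters as well.

Quoting from sox.sourceforge.net: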

SoX can work with ‘self-describing’ and ‘raw’ audio files. ‘self-describing’ formats (e.g. WAV, FLAC, MP3) have a header that completely describes the signal and encoding attributes of the audio data that follows. ‘raw’ or ‘headerless’ formats do not contain this information, so the audio characteristics of these must be described on the SoX command line or inferred from those of the input file.

The following four characteristics are used to describe the format of audio data such that it can be processed with SoX:

  • sample rate

    The sample rate in samples per second (‘Hertz’ or ‘Hz’). Digital telephony traditionally uses a sample rate of 8000 Hz (8 kHz), though these days, 16 and even 32 kHz are becoming more common. Audio Compact Discs use 44100 Hz (44.1 kHz). Digital Audio Tape and many computer systems use 48 kHz. Professional audio systems often use 96 kHz.

  • sample size [...]

  • data encoding [...]

  • channels [...]
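
To see why a headerless file cannot be decoded without these characteristics, here is a small sketch in plain Python (no SoX involved). The function name raw_duration_seconds and the 32000-byte file size are invented for illustration. It shows that the same number of bytes implies very different durations depending on the rate, sample size and channel count you assume:

```python
def raw_duration_seconds(num_bytes, rate, bits, channels):
    """Duration implied by a byte count under an assumed raw PCM layout."""
    bytes_per_frame = (bits // 8) * channels  # one sample for every channel
    return num_bytes / (bytes_per_frame * rate)


num_bytes = 32000  # size of a hypothetical headerless file

# Same bytes, two different guesses about the format.
as_8k_mono = raw_duration_seconds(num_bytes, rate=8000, bits=16, channels=1)
as_cd_stereo = raw_duration_seconds(num_bytes, rate=44100, bits=16, channels=2)

print(as_8k_mono)    # 2.0 seconds
print(as_cd_stereo)  # well under a second
```

Nothing in the file itself says which interpretation is correct, which is why SoX refuses to guess.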

The pysox documentation describes the set_input_format method:

set_input_format(file_type=None, rate=None, bits=None, channels=None, encoding=None, ignore_length=False)

Sets input file format arguments. This is primarily useful when dealing with audio files without a file extension. Overwrites any previously set input file arguments.

If this function is not explicitly called the input format is inferred from the file extension or the file’s header.

Parameters:

  • file_type : str or None, default=None

    The file type of the input audio file. Should be the same as what the file extension would be, for ex. ‘mp3’ or ‘wav’.

  • rate : float or None, default=None

    The sample rate of the input audio file. If None the sample rate is inferred.

  • [...]

So you should set the rate as follows:

tfm.set_input_format(file_type='raw', rate=8000, bits=16, channels=1, encoding='signed-integer')

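You must adjust these values to match how the audio in that raw file is actually encoded. The input format stays set on the tfm object, so if you convert several "raw" files with the same characteristics there is no need to call the method again. You only need to call it again, with the appropriate values, when the characteristics differ between "raw" files.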
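
Putting the pieces together, a minimal sketch of the whole conversion might look like the following. The helper name convert_raw_files and the RAW_FORMAT dict are made up for illustration, and the format values are just the example values used above, not the real properties of your file. It assumes the pysox package (imported as sox) and the sox binary are installed.

```python
# Example values only: replace them with the real properties of your raw data.
RAW_FORMAT = {
    'file_type': 'raw',
    'rate': 8000,
    'bits': 16,
    'channels': 1,
    'encoding': 'signed-integer',
}


def convert_raw_files(pairs, input_format=RAW_FORMAT):
    """Convert (input_path, output_path) pairs that share one raw format."""
    import sox  # pysox; imported here so this snippet loads without it

    tfm = sox.Transformer()
    # Describe the headerless input once; the setting stays on tfm.
    tfm.set_input_format(**input_format)
    for in_path, out_path in pairs:
        # The output format is inferred from the output file extension.
        tfm.build(in_path, out_path)
```

Calling convert_raw_files([('./abc/1.raw', './abc/2.flac')]) would then run the conversion from the question. If some files use a different layout, pass a different dict as input_format for those.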