Error: Mismatch in sampling rate: Expected: 16000; Actual: 48000. Tensorflow js throwing error

I want to record audio at 16000 Hz and get its spectrogram. My model takes an input of shape [null, 1998, 101]. I can't get this to work in JavaScript:

const mic = await tf.data.microphone({
  fftSize: 256,
  columnTruncateLength: 101,
  numFramesPerSpectrogram: 1998,
  sampleRateHz: 16000,
  includeSpectrogram: true,
  includeWaveform: true
});

const audioData = await mic.capture();
console.log(audioData)
const spectrogramTensor = audioData.spectrogram;
console.log(spectrogramTensor)
spectrogramTensor.print();
const waveformTensor = audioData.waveform;
waveformTensor.print();
mic.stop(); 

My model does trigger word detection. In Python, I used the following code:

import matplotlib.pyplot as plt

def graph_spectrogram(wav_file):
    # get_wav_info is a helper defined elsewhere (not shown); it returns (rate, data)
    rate, data = get_wav_info(wav_file)
    print(data)
    print(len(data))
    nfft = 200      # Length of each window segment
    fs = 8000       # Sampling frequency
    noverlap = 120  # Overlap between windows
    nchannels = data.ndim
    if nchannels == 1:
        pxx, freqs, bins, im = plt.specgram(data, nfft, fs, noverlap=noverlap)
    elif nchannels == 2:
        # Stereo input: use the first channel only
        pxx, freqs, bins, im = plt.specgram(data[:, 0], nfft, fs, noverlap=noverlap)
    return pxx
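
As a sanity check on where [null, 1998, 101] comes from, here is a small JavaScript sketch of the window arithmetic. It assumes a 10-second clip at 16000 Hz (the clip length is not stated above, so treat it as a guess) and uses the nfft and noverlap values from the Python code. The function name spectrogramShape is made up for this example.

```javascript
// Shape of a sliding-window spectrogram for a real-valued signal:
//   frames    = floor((numSamples - noverlap) / (nfft - noverlap))
//   freq bins = floor(nfft / 2) + 1   (one-sided spectrum)
function spectrogramShape(numSamples, nfft, noverlap) {
  const hop = nfft - noverlap; // samples between window starts
  const numFrames = Math.floor((numSamples - noverlap) / hop);
  const numFreqBins = Math.floor(nfft / 2) + 1;
  return { numFrames, numFreqBins };
}

// Assumption: 10 seconds of audio sampled at 16000 Hz.
const shape = spectrogramShape(10 * 16000, 200, 120);
console.log(shape); // { numFrames: 1998, numFreqBins: 101 }
```

Under that assumption the numbers line up with the model input: 1998 time frames by 101 frequency bins.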

The browser has a fixed default sampling rate for recording. The following returns the browser's sampling rate:

new window.AudioContext().sampleRate
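
A minimal sketch of feeding that value back into the microphone options so the requested and actual rates agree. The helper name micConfigFor is made up for this example, and the other option values are simply copied from the question. As far as I can tell, leaving sampleRateHz out entirely should also avoid the check, but I have not verified that on every browser.

```javascript
// Build tf.data.microphone options that use the browser's real sampling rate.
// micConfigFor is a hypothetical helper; the option names come from the question.
function micConfigFor(browserSampleRate) {
  return {
    fftSize: 256,
    columnTruncateLength: 101,
    numFramesPerSpectrogram: 1998,
    sampleRateHz: browserSampleRate, // must equal the AudioContext rate
    includeSpectrogram: true,
    includeWaveform: true,
  };
}

// Stand-in value for illustration; in the browser, read it from the AudioContext.
const config = micConfigFor(48000);
console.log(config.sampleRateHz);

// Browser-only usage (assumes tf is already loaded on the page):
// const rate = new window.AudioContext().sampleRate;
// const mic = await tf.data.microphone(micConfigFor(rate));
```

Note that the spectrogram then describes audio at the browser's rate, so a model trained on 16000 Hz data would not see the same time and frequency scale.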

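The error is thrown because 16000 does not match the browser's sampling rate. Currently there is no way to change the recording sampling rate from the browser. What can be done instead:

  • Train the model at the browser's sampling rate
  • Reshape (or slice) the tensor to the model's inputShape
  • Record the audio and resample it (using this answer), then create a tensor from the audio recording (using this answer)
  • Though I haven't tried it, the sampling rate seems to come from the operating system's settings. Changing it there would let recordings use the right sampling rate. On Linux, the recording rate can be set in /etc/pulse/daemon.conf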
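
For the resampling option, here is a rough sketch of the idea, not the code from the linked answer. It resamples mono samples by linear interpolation; resampleLinear is a made-up name. It applies no low-pass filter before downsampling, so frequencies above half the target rate will alias. Treat it as an illustration only. In the browser, rendering the recording through an OfflineAudioContext created at the target rate is another common approach.

```javascript
// Resample mono PCM samples from fromRate to toRate by linear interpolation.
// Caveat: no anti-aliasing filter is applied, so this is only a rough sketch.
function resampleLinear(samples, fromRate, toRate) {
  const outLength = Math.floor((samples.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const pos = (i * fromRate) / toRate; // position in the input signal
    const left = Math.floor(pos);
    const right = Math.min(left + 1, samples.length - 1);
    const frac = pos - left;
    out[i] = samples[left] * (1 - frac) + samples[right] * frac;
  }
  return out;
}

// One second of constant signal at 48000 Hz, brought down to 16000 Hz.
const oneSecondAt48k = new Float32Array(48000).fill(0.5);
const at16k = resampleLinear(oneSecondAt48k, 48000, 16000);
console.log(at16k.length); // 16000

// With TensorFlow.js loaded, the result can be wrapped in a tensor:
// const waveform = tf.tensor1d(at16k);
```

The spectrogram would then have to be computed from the resampled waveform with the same window settings the model was trained on.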