How to train a CNN on the Common Voice dataset

python
speech-recognition
conv-neural-network
keras
librosa

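I am trying to train a CNN on the Common Voice dataset. I am new to speech recognition and have not been able to find any resources on how to use the dataset with Keras. I followed this article to build a simple word classification network, but I would like to scale it up using the Common Voice dataset. Any help is appreciated.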

Thanks

What you can do is look at MFCCs (mel-frequency cepstral coefficients). In short, these are features extracted from the audio waveform with signal processing techniques that approximate the way humans perceive sound. In Python, you can use python-speech-features to compute MFCCs.

Once your data is prepared, you can build a CNN, for example this one:
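A minimal Keras sketch of such a network, not necessarily the architecture referenced above: it treats each clip's MFCC matrix as a one-channel image of shape `(frames, coefficients, 1)`. The layer sizes, the 99x13 input shape, and the 10 classes are illustrative assumptions, and random arrays stand in for real features and labels.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_cnn(input_shape, num_classes):
    """Small 2-D CNN over MFCC 'images'; all sizes are illustrative."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(16, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.GlobalAveragePooling2D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer class labels
        metrics=["accuracy"],
    )
    return model


# Placeholder data: 8 clips, 99 frames x 13 MFCCs each, 10 word classes.
x = np.random.rand(8, 99, 13, 1).astype("float32")
y = np.random.randint(0, 10, size=8)

model = build_cnn(input_shape=(99, 13, 1), num_classes=10)
model.fit(x, y, epochs=1, batch_size=4, verbose=0)
probs = model.predict(x, verbose=0)  # one probability vector per clip
```

Clips of different lengths give different numbers of frames, so in practice you would pad or crop the MFCC matrices to a fixed size before stacking them into a batch.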

You can also use an RNN (for example an LSTM or a GRU), but that is a bit more advanced.
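For comparison, a recurrent model reads the MFCC frames as a sequence instead of an image. A minimal sketch, again with assumed sizes (99 frames, 13 coefficients, 10 classes) and random placeholder data:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_gru(num_frames, num_features, num_classes):
    """Single GRU layer over the frame sequence; sizes are illustrative."""
    model = keras.Sequential([
        keras.Input(shape=(num_frames, num_features)),
        layers.GRU(64),  # keeps only the final hidden state
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Placeholder batch: 4 clips of 99 frames x 13 MFCCs.
x_seq = np.random.rand(4, 99, 13).astype("float32")
rnn = build_gru(num_frames=99, num_features=13, num_classes=10)
rnn_probs = rnn.predict(x_seq, verbose=0)
```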

Edit: a very good dataset, if you want one:

Speech Commands Dataset
