Google speech recognition not recognizing certain words / phrases like um and er | python

Question

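It appears that Google speech recognition is removing certain parts of my speech, such as um, er, and ah. The problem is that I want these to be recognized, and I can't figure out how to enable that.

Here is the code: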

import speech_recognition
import pyttsx3

recognizer = speech_recognition.Recognizer()

vocal_imperfections = 0

vi_list = ['hmm', 'umm', 'aha', 'ahh', 'uh', 'um', 'er']

while True:
    try:
        with speech_recognition.Microphone() as mic:
            recognizer.adjust_for_ambient_noise(mic, duration=0.2)
            audio = recognizer.listen(mic)
            text = recognizer.recognize_google(audio, language='en-IN', show_all=True)
            #text = recognizer.recognize_ibm(audio)
            if text != []:
                text = text['alternative'][0]['transcript']
                if any(word in text for word in vi_list):
                    vocal_imperfections = vocal_imperfections+1
                print(text)
                print(vocal_imperfections)


    except speech_recognition.UnknownValueError:
        # Speech was unintelligible; reset the recognizer and keep listening.
        recognizer = speech_recognition.Recognizer()
        continue

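It works, except that Google strips out the vocal imperfections. Does anyone know how to enable this, or know of an alternative free, real-time speech recognition service that can recognize vocal imperfections?

Example: if I say "Um, I think today is the 30th", Google returns "I think today is the 30th".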

Answer 1

I looked through the Google Cloud Speech-to-Text API docs and didn't see anything relevant (as of March 2022). I also came across these related resources:

Detecting filler words in speech-to-text
How can I detect filler words like "ah, um" using a speech-to-text API like Google Speech API? (Quora)
FillerWordShock - one person's research on this topic

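All the evidence suggests this isn't possible with the Google Cloud Speech-to-Text service (at the moment), so you will have to look for an alternative service. I won't repeat the alternatives listed in those resources, but they suggest several, and you will have to pick the one that best fits your particular needs.

Also, you may already know this (apologies if you do), but these kinds of words are generally called "filler" and/or "hesitation" words. That may help you when researching the topic.

The good news is that the SpeechRecognition module (which I believe is what you are using, judging by your code) supports a variety of different engines, so hopefully one of them provides filler words.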
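If you do try other engines, it may help to keep the counting logic separate from the recognizer so you can swap engines freely. Below is a minimal sketch, not a definitive implementation. The names `FILLERS`, `count_fillers`, and `count_fillers_with` are made up for illustration, and `transcribe` is a hypothetical stand-in for whichever engine call you end up using (assumed to return a plain string). It also matches whole words only, since the substring test in the question's code would count the "er" inside "there".

```python
import re

# Filler ("hesitation") words to look for; taken from the question's vi_list.
FILLERS = {"hmm", "umm", "aha", "ahh", "uh", "um", "er"}


def count_fillers(transcript, fillers=FILLERS):
    """Count whole-word filler occurrences in a transcript string."""
    # Split into lowercase word tokens so that 'er' does not match 'there'.
    words = re.findall(r"[a-z']+", transcript.lower())
    return sum(1 for word in words if word in fillers)


def count_fillers_with(transcribe, audio, fillers=FILLERS):
    """Transcribe audio with any engine callable, then count fillers.

    `transcribe` is a hypothetical stand-in: any callable that takes the
    audio object and returns the transcript as a plain string.
    """
    return count_fillers(transcribe(audio), fillers)
```

For example, you could pass `recognizer.recognize_sphinx` (the offline engine in SpeechRecognition, which needs the pocketsphinx package installed) as `transcribe`. Whether any particular engine actually keeps filler words in its output is something you would have to test yourself.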


python

speech-recognition

google-speech-api