Trouble Integrating Speech Recognition into PySimpleGUI + Search Bot

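I recently experimented with a simple search engine that uses TTS. I then tried to feed speech-to-text into the search field of a PySimpleGUI window, but hit a roadblock.

I was able to get PySimpleGUI to recognize what I say and search for results. However, the search mechanism does not treat my speech-to-text output as the value I typed, and only gives me back a result for the first letter of what I said.

For example, I said "what is the weather today" and it returned the definition of the letter "W".

These are the packages I am using: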

import wolframalpha
import wikipedia
import speech_recognition as sr
r = sr.Recognizer()
m = sr.Microphone()
import PySimpleGUI as sg
import pyttsx3

All the contents of the PySimpleGUI window:

layout = [  [sg.Text('Welcome Back Sir')],
            [sg.Text('How can I be of assistance'), sg.InputText()],
            [sg.ReadButton('Speak'), sg.Button('Ok'), sg.Button('Cancel')]]

window = sg.Window('Pybot', layout)
engine = pyttsx3.init()

The event loop processes "events" and gets the "values" of the inputs:

while True:
    event, values = window.read()
    if event in (None, 'Cancel'):
        break
    if event == 'Speak':
        with m as source:
            r.adjust_for_ambient_noise(source)
            audio = r.listen(source)
            values = r.recognize_google(audio, language='en-US')
            print(values)
    try:
        wiki_res = wikipedia.summary(values[0], sentences=2)
        wolfram_res = next(client.query(values[0]).results).text
        engine.say(wolfram_res)
        sg.PopupNonBlocking("Wolfram Result: "+wolfram_res,"Wikipedia Result: "+wiki_res)
    except wikipedia.exceptions.DisambiguationError:
        wolfram_res = next(client.query(values[0]).results).text
        engine.say(wolfram_res)
        sg.PopupNonBlocking(wolfram_res)
    except wikipedia.exceptions.PageError:
        wolfram_res = next(client.query(values[0]).results).text
        engine.say(wolfram_res)
        sg.PopupNonBlocking(wolfram_res)
    except:
        wiki_res = wikipedia.summary(values[0], sentences=2)
        engine.say(wiki_res)
        sg.PopupNonBlocking(wiki_res)

    engine.runAndWait()

    print (values[0])

window.close()

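You are reusing the same variable, values, for two different things. window.read() returns values as a dictionary of element values, but the Speak branch then overwrites it with the recognized text, which is a plain string. After that, values[0] is just the first character of that string, e.g. "w".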


while True:
    event, values = window.read()
...
            values = r.recognize_google(audio, language='en-US')
...
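
To make the difference concrete, here is a minimal sketch that needs neither a GUI nor a microphone. The dictionary and the string are stand-ins I made up for what window.read() and r.recognize_google() would return; only the indexing behavior is the point.

```python
# Stand-in for the dict that window.read() returns: element key -> current text
values = {0: "typed query"}
first = values[0]   # looks up key 0 in the dict, giving the whole input text
print(first)

# Stand-in for what r.recognize_google(...) returns: a plain string
values = "what is the weather today"
second = values[0]  # indexes the string, giving only its first character
print(second)
```

The same expression, values[0], means "look up key 0" on the dictionary but "take character 0" on the string, which is why the search only ever saw a single letter.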

Revised code:

while True:
    event, values = window.read()
    if event in (None, 'Cancel'):
        break
    # print(event, values)
    if event == 'Speak':
        with m as source:
            r.adjust_for_ambient_noise(source)
            audio = r.listen(source)
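            # Use a separate name so the values dict from window.read() is not overwritten
            value = r.recognize_google(audio, language='en-US')
            print(value)
            # Put the recognized text into the input field (auto-generated key 0)
            window[0].update(value)
            # Queue an 'Ok' event so the next window.read() runs the search
            window.write_event_value('Ok', '')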
    elif event == 'Ok':
        if values[0] == '':
            continue
        try:
            wiki_res = wikipedia.summary(values[0], sentences=2)
            wolfram_res = next(client.query(values[0]).results).text
            engine.say(wolfram_res)
            sg.PopupNonBlocking("Wolfram Result: "+wolfram_res,"Wikipedia Result: "+wiki_res)
        except wikipedia.exceptions.DisambiguationError:
            wolfram_res = next(client.query(values[0]).results).text
            engine.say(wolfram_res)
            sg.PopupNonBlocking(wolfram_res)
        except wikipedia.exceptions.PageError:
            wolfram_res = next(client.query(values[0]).results).text
            engine.say(wolfram_res)
            sg.PopupNonBlocking(wolfram_res)
        except:
            wiki_res = wikipedia.summary(values[0], sentences=2)
            engine.say(wiki_res)
            sg.PopupNonBlocking(wiki_res)

        engine.runAndWait()

window.close()

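Processing the audio blocks the event loop, so you may want to try multithreading; otherwise the GUI will sometimes become unresponsive.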
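
A rough sketch of that idea, using only the standard library so it runs without a microphone or a window. The helper run_in_background and the stand-in fake_listen_and_recognize are hypothetical names I made up for illustration; in the real program the blocking call would be r.listen(...) followed by r.recognize_google(...).

```python
import threading
import time


def run_in_background(blocking_call, on_result):
    """Run blocking_call() on a worker thread and pass its result to on_result."""
    def worker():
        on_result(blocking_call())

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def fake_listen_and_recognize():
    # Hypothetical stand-in for r.listen(...) + r.recognize_google(...)
    time.sleep(0.1)
    return "what is the weather today"


results = []
# In the GUI, on_result could instead be something like
#   lambda text: window.write_event_value('Recognized', text)
# where 'Recognized' is an assumed event key that the event loop would handle.
worker_thread = run_in_background(fake_listen_and_recognize, results.append)
worker_thread.join()  # only for this demo; a GUI loop would keep calling window.read()
print(results[0])
```

In the event loop, the Speak branch would start the worker and return to window.read() straight away, and the window update and search would move to the branch that handles the event posted by the worker. GUI elements should still be updated only from the main thread.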