Pocketsphinx in python returns random words in keyword search

I copied some code from a website to listen for a specific word in Python using pocketsphinx. It runs, but it never outputs the keyword as expected. This is my code:

import sys, os
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *
import pyaudio

# modeldir = "../../../model"
# datadir = "../../../test/data"

modeldir="C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//en-us"
dictdir="C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//cmudict-en-us.dict"
lmdir="C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//en-us.lm.bin"
# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', modeldir)
config.set_string('-lm', lmdir )
config.set_string('-dict', dictdir)
config.set_string('-keyphrase', 'forward')
config.set_float('-kws_threshold', 1e+20)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

# Process audio chunk by chunk. On keyword detected perform action and restart search
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
         decoder.process_raw(buf, False, False)
    else:
         break
    if decoder.hyp() != None:
      #print(decoder.hyp().hypstr)
      if decoder.hyp().hypstr == 'forward':
        print ([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
        print ("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt()

Also, when I use print(decoder.hyp().hypstr)

it just outputs random words whenever I say anything. For example, if I say a single word or a sentence, it outputs:

the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the da
the head
the bed
the bedding
the heading of
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and well
the bedding and well
the bedding and well
the bedding and butler
the bedding and what lingus
the bedding and what lingus
the bedding and what lingus
the bedding and what lingus ha
the bedding and blessed are
the bedding and blessed are
the bedding and what lingus on
the bedding and what lingus want
the bedding and what lingus want
the bedding and what lingus want
the bedding and what lingus want
the bedding and what lingus want or
the bedding and what lingus want to talk
the bedding and what lingus current top
the bedding and what lingus want to talk
the bedding and what lingus want to talk
the bedding and what lingus want to talk
the bedding and what lingus want to talk
the bedding and what lingus want to talk to her
the bedding and what lingus want to talk to her
the bedding and what lingus want to talk to her
the bedding and what lingus want to talk to her

Please help me get through this. I am new to Python.

First, I want to clarify one thing: your Pocketsphinx is working.

In my experience, pocketsphinx is hardly the most accurate voice recognition tool you can use, but it is probably your best bet for an offline solution. Pocketsphinx can only translate your words (audio) as well as its model allows. These models still seem to be a work in progress, and much of it needs to be improved. There are a few things you can do to try to increase recognition accuracy, such as reducing noise and tuning the recognition, but that is beyond the immediate scope of this question.

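From what I understand of your code, you are waiting for a specific keyword to be spoken by the user and recognizing it with pocketsphinx as the backend. The keyword appears to be "forward". You can read further on how to do "hot word listening" properly.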

You have the right idea, but the approach could be improved. Here is my "quick fix" version of your code:

import os
import pyaudio
import pocketsphinx as ps

modeldir = "C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//"

# Create a decoder with certain model
config = ps.Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'en-us'))
config.set_string('-lm', os.path.join(modeldir, 'en-us.lm.bin'))
config.set_string('-dict', os.path.join(modeldir, 'cmudict-en-us.dict'))
config.set_string('-keyphrase', 'forward')
config.set_float('-kws_threshold', 1e+20)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

# Process audio chunk by chunk. On keyword detected perform action and restart search
decoder = ps.Decoder(config)
decoder.start_utt()

while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
    if decoder.hyp() is not None:
        print(decoder.hyp().hypstr)
        if 'forward' in decoder.hyp().hypstr:
            print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
            print("Detected keyword, restarting search")
            decoder.end_utt()
            decoder.start_utt()

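For any one pocketsphinx.Decoder() "session" (that is, after calling the .start_utt() method and before a subsequent call to .end_utt()), the decoder.hyp().hypstr value effectively keeps appending words to itself whenever the decoder finds a "valid" translation/recognition in the input audio stream.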

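You used if decoder.hyp().hypstr == 'forward':. This requires the entire string to be exactly "forward" before the code enters that (I assume desired) conditional block. Since pocketsphinx is not very accurate by default, it often takes several attempts for most words to register correctly. For this reason, and because decoder.hyp().hypstr appends to itself (as mentioned above), I used the line if 'forward' in decoder.hyp().hypstr:. This looks for the desired keyword "forward" anywhere in the whole string. That way, misrecognitions are tolerated until the keyword is found.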
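The difference between the two checks can be tried out without a microphone. The following is a minimal sketch that runs on plain strings rather than a live decoder. The first three hypotheses are copied from the output in the question; the last one is made up for illustration. The helper names exact_match and contains_keyword are hypothetical, and contains_keyword compares whole words (a small variation on the plain substring test above) so that a word such as "forwarded" would not count as a hit.

```python
# Partial hypotheses as they might grow within one utterance.
# The first three come from the question's output; the last is invented.
partial_hyps = [
    "the",
    "the bedding and",
    "the bedding and well",
    "the bedding and forward",
]


def exact_match(hypstr, keyword="forward"):
    """Original check: the whole hypothesis must equal the keyword."""
    return hypstr == keyword


def contains_keyword(hypstr, keyword="forward"):
    """Relaxed check: the keyword appears as one of the words."""
    return keyword in hypstr.split()


for hyp in partial_hyps:
    print(repr(hyp), exact_match(hyp), contains_keyword(hyp))
```

With an accumulating hypothesis, the exact comparison only succeeds if "forward" is the very first and only thing recognized in the utterance, whereas the relaxed check succeeds as soon as the word shows up anywhere.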

Hope this helps!

You need to remove this line:

  config.set_string('-lm', lmdir )

Keyword search and lm search are mutually exclusive.
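As a rough sketch of what that looks like, the helper below sets only the options needed for keyword search and never sets '-lm'. The function name configure_keyword_search is hypothetical; it uses only the set_string and set_float calls already present in the question's code, and the option values (including the 1e+20 threshold) are copied from the question rather than recommended. It assumes the model directory is laid out as in the question.

```python
import os


def configure_keyword_search(config, modeldir, keyphrase="forward", threshold=1e+20):
    """Set keyword-spotting options on a decoder config.

    '-lm' is deliberately not set, so the decoder is not asked to run
    a language-model search alongside the keyword search.
    """
    config.set_string('-hmm', os.path.join(modeldir, 'en-us'))
    config.set_string('-dict', os.path.join(modeldir, 'cmudict-en-us.dict'))
    config.set_string('-keyphrase', keyphrase)
    config.set_float('-kws_threshold', threshold)
    return config
```

In the question's script this would be called as config = configure_keyword_search(Decoder.default_config(), modeldir), with modeldir pointing at the pocketsphinx model folder, before creating Decoder(config).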