How to detect out-of-vocabulary words in CMU Sphinx

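I'm having a problem with the Sphinx speech recognition library for Java. I'm using it to capture spoken input and process it. My grammar file looks like this: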

#JSGF V1.0;

grammar hello;

public <sentence> = (play | pause | next | previous);

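My grammar is simple, just four words: "play", "pause", "next", and "previous". Sphinx detects them successfully. However, I want my application to show a message such as "Unrecognized word" when I say a word that is not in the grammar. Currently, if I say a word outside the grammar into the microphone, for example "stop", it still reports the grammar word it considers the closest match.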

My code looks like this:

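import edu.cmu.sphinx.frontend.util.Microphone;
import edu.cmu.sphinx.recognizer.Recognizer;
import edu.cmu.sphinx.result.Result;
import edu.cmu.sphinx.util.props.ConfigurationManager;
import edu.cmu.sphinx.util.props.PropertyException;

import java.io.File;
import java.io.IOException;
import java.net.URL;

public class SphinxDemo {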

    static int i = 1;
    static String resultText;

    public static void main(String[] args) {
        try {
            URL url;
            if (args.length > 0) {
                url = new File(args[0]).toURI().toURL();
            } else {
                url = SphinxDemo.class.getResource("helloworld.config.xml");
            }

            System.out.println("Loading...");

            ConfigurationManager cm = new ConfigurationManager(url);

            Recognizer recognizer = (Recognizer) cm.lookup("recognizer");
            Microphone microphone = (Microphone) cm.lookup("microphone");

            /* allocate the resource necessary for the recognizer */
            recognizer.allocate();

            /* the microphone will keep recording until the program exits */

            if (microphone.startRecording()) {
                System.out
                        .println("Say: play|pause|previous|next");

                while (true) {

                    System.out
                            .println("Start speaking. Press Ctrl-C to quit.\n");

                    Result result = recognizer.recognize();
                    if (result != null) {

                        System.out.println("Enter your choice" + "\n");
                        resultText = result.getBestFinalResultNoFiller();
                        System.out.println("You said: " + resultText + "\n");

                        // Keep the check inside the null guard so a missing result
                        // cannot throw a NullPointerException or reuse a stale resultText.
                        if (!(resultText.equalsIgnoreCase("play")
                                || resultText.equalsIgnoreCase("previous")
                                || resultText.equalsIgnoreCase("pause")
                                || resultText.equalsIgnoreCase("next"))) {
                            System.out.println("Unrecognized word\n");
                        }
                    }

                }
            } else {
                System.out.println("Cannot start microphone.");
                recognizer.deallocate();
                System.exit(1);
            }

        } catch (IOException e) {
            System.err.println("Problem when loading SphinxDemo: " + e);
            e.printStackTrace();
        } catch (PropertyException e) {
            System.err.println("Problem configuring SphinxDemo: " + e);
            e.printStackTrace();
        } catch (InstantiationException e) {
            System.err.println("Problem creating SphinxDemo: " + e);
            e.printStackTrace();
        }

    }
}

I tried adding something like this to detect unrecognized words, but it doesn't work:

if (!(resultText.equalsIgnoreCase("play") || resultText.equalsIgnoreCase("previous")
        || resultText.equalsIgnoreCase("pause") || resultText.equalsIgnoreCase("next"))) {
    System.out.println("Unrecognized word\n");
}

If you are using the latest version of CMU Sphinx, it returns <unk> when a word is not in the grammar.
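
As a rough sketch of how the result text could then be checked: the helper below rejects a hypothesis that is empty, contains <unk>, or is not one of the four grammar words. CommandFilter and matchCommand are hypothetical names, not part of the Sphinx API, and the assumption that the token shows up literally as "<unk>" in the hypothesis string comes only from the statement above, so verify it against your version.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper for illustration; not part of the Sphinx API. */
class CommandFilter {

    // The four commands from the grammar in the question.
    private static final Set<String> COMMANDS =
            new HashSet<>(Arrays.asList("play", "pause", "next", "previous"));

    // Assumption: out-of-grammar speech appears as this literal token.
    private static final String UNKNOWN_TOKEN = "<unk>";

    /**
     * Returns the matched command in lower case, or null when the
     * hypothesis should be treated as an unrecognized word.
     */
    static String matchCommand(String hypothesis) {
        if (hypothesis == null) {
            return null;
        }
        String text = hypothesis.trim().toLowerCase(Locale.ROOT);
        if (text.isEmpty() || text.contains(UNKNOWN_TOKEN)) {
            return null;
        }
        return COMMANDS.contains(text) ? text : null;
    }

    public static void main(String[] args) {
        // Sample hypotheses standing in for recognizer output.
        String[] samples = {"play", "Next", "<unk>", "", null};
        for (String sample : samples) {
            String command = matchCommand(sample);
            System.out.println(command == null
                    ? "Unrecognized word"
                    : "You said: " + command);
        }
    }
}
```

In the recognition loop from the question, this would be called as matchCommand(result.getBestFinalResultNoFiller()) after the null check on result.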