Sphinx4 won't recognize full speech if .wav file duration is long
We are working on a project that saves the user's answers as .wav files and evaluates them afterwards. We created a grammar for each question. For two of the questions we are having recognition problems. The cause is probably the same for both, because for each of them the user has to speak for about 7-8 seconds.
Here is the grammar file we use for one of those questions:
#JSGF V1.0;
grammar Question8;
public <Question8> = ( one hundred | ninety three | eighty six | seventy nine | seventy two | sixty five )* ;
Here the user has to count backwards for about 7 seconds. If I speak quickly, it recognizes everything fine. But when I speak slowly, for example saying "one hundred", waiting about a second, and continuing that way down to sixty five, it only recognizes "one hundred" and misses the rest.
Two main parts are responsible for this process.
The class we created for the microphone:
public final class SpeechRecorder {

    static Configuration configuration = new Configuration();
    // 16 kHz, 16-bit, mono microphone (parameters of Sphinx4's frontend Microphone constructor)
    static Microphone mic = new Microphone(16000, 16, 1, true, true, false, 10, true, "average", 0, "default", 6400);

    // Start capturing audio from the microphone
    public static void startMic() {
        mic.initialize();
        mic.startRecording();
        System.out.println("Audio Format is " + mic.getAudioFormat());
    }

    // Stop recording and save the captured utterance as a .wav file
    public static void stopMic(String questionName) {
        mic.stopRecording();
        Utterance u = mic.getUtterance();
        try {
            u.save("Resources/Answers/" + questionName + ".wav", AudioFileFormat.Type.WAVE);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Run the recognizer on the saved .wav file, constrained by the question's grammar
    public static String getAnswersOfSpeech(String question) throws IOException {
        Evaluation.disableLogMessages();
        configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
        configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
        configuration.setGrammarPath("resource:/Grammer");
        configuration.setGrammarName(question);
        configuration.setUseGrammar(true);

        StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
        recognizer.startRecognition(new FileInputStream("Resources/Answers/" + question + ".wav"));
        // Only the first result (i.e. the first utterance) is read here
        SpeechResult result = recognizer.getResult();
        return result.getHypothesis();
    }

    public static String getSavedAnswer(int question) {
        return User.getAnswers(question);
    }
}
This is where we save the user's answer to our resources as a .wav file:
btn_microphone.addActionListener(new ActionListener() {
    public void actionPerformed(ActionEvent e) {
        click++;
        if (click % 2 == 1) {
            // First click: start recording
            SpeechRecorder.startMic();
            btn_microphone.setIcon(new ImageIcon("Resources/Images/record.png"));
        } else {
            // Second click: stop recording, save the .wav file and run recognition on it
            SpeechRecorder.stopMic("Question" + Integer.toString(question));
            btn_Next.setVisible(true);
            btn_microphone.setIcon(new ImageIcon("Resources/Images/microphone.png"));
            lbl_speechAnswer.setVisible(true);
            try {
                userAnswer = SpeechRecorder.getAnswersOfSpeech("Question" + Integer.toString(question));
            } catch (IOException e1) {
                e1.printStackTrace();
            }
            if (userAnswer.equals("")) {
                lbl_speechAnswer.setText(
                        "<html>No answer was given, click on microphone button to record again</html>");
            } else {
                lbl_speechAnswer.setText("<html>Your answer is " + userAnswer
                        + ", click on microphone button to record again</html>");
            }
        }
    }
});
I don't know how we can overcome this problem. I would appreciate it if anyone could help.
You need a loop, as in the transcriber demo:
SpeechResult result;
while ((result = recognizer.getResult()) != null) {
    System.out.format("Hypothesis: %s\n", result.getHypothesis());
}
recognizer.stopRecognition();
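Applied to the getAnswersOfSpeech method above, that means reading results until the recognizer returns null and concatenating the hypotheses. A minimal sketch based on the code in the question (not tested against your project; paths and method name kept as in the original):
public static String getAnswersOfSpeech(String question) throws IOException {
    Evaluation.disableLogMessages();
    configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
    configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
    configuration.setGrammarPath("resource:/Grammer");
    configuration.setGrammarName(question);
    configuration.setUseGrammar(true);

    StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
    recognizer.startRecognition(new FileInputStream("Resources/Answers/" + question + ".wav"));

    // Each pause in the audio ends one utterance, so a slowly spoken answer
    // produces several results; collect all of them instead of just the first.
    StringBuilder speechWords = new StringBuilder();
    SpeechResult result;
    while ((result = recognizer.getResult()) != null) {
        if (speechWords.length() > 0) {
            speechWords.append(" ");
        }
        speechWords.append(result.getHypothesis());
    }
    recognizer.stopRecognition();
    return speechWords.toString();
}
A single getResult() call only returns the first utterance; every pause between the numbers ends an utterance, which is why only "one hundred" comes back when the answer is spoken slowly.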