IBM Watson Speech to Text Only Returning First Word With Java SDK
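I'm using the IBM Watson Speech to Text Java SDK. When I upload a .wav file, the response JSON contains only the first transcribed word. When I upload the same file to the web demo, I get the full transcript.
My use of the SDK is very simple: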
SpeechToText service = new SpeechToText();
service.setUsernameAndPassword("<username>", "<password>");
File audio = new File("src/test/resources/sample1.wav");
SpeechResults transcript = service.recognize(audio, HttpMediaType.AUDIO_WAV);
System.out.println(transcript);
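The recognize() signature you are using returns after the first pause in the audio. To see all the results, pass a RecognizeOptions with continuous(true) instead: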
RecognizeOptions options = new RecognizeOptions();
options = options.continuous(true)
.contentType(HttpMediaType.AUDIO_WAV)
.interimResults(false)
.inactivityTimeout(10)
.maxAlternatives(1)
.wordConfidence(false)
.timestamps(true)
.model("en-US_BroadbandModel");
SpeechResults transcript = service.recognize(audio, options);
This works for me with the following Maven dependency:
<dependency>
<groupId>com.ibm.watson.developer_cloud</groupId>
<artifactId>java-sdk</artifactId>
<version>2.8.0</version>
</dependency>
I'm trying exactly the same thing, with the same parameters Chris K describes (except "interim results = true"), over WebSockets:
FileInputStream audio = new FileInputStream("/home/leoks/BM/ws/mp32wav/out.wav");
RecognizeOptions options = new RecognizeOptions();
options = options.continuous(true)
.contentType(HttpMediaType.AUDIO_WAV)
.interimResults(true)
.inactivityTimeout(10)
.maxAlternatives(1)
.wordConfidence(false)
.timestamps(true)
.model("en-US_BroadbandModel");
api.stt.recognizeUsingWebSockets(audio, options, new BaseRecognizeDelegate() {
@Override
public void onMessage(SpeechResults speechResults) {
System.out.println(speechResults);
try{
if (speechResults != null && speechResults.isFinal()){
lock.countDown();
}
}catch(java.lang.IndexOutOfBoundsException ignored){
}
}
});
lock.await(5, TimeUnit.SECONDS);
However, it still throws this exception, repeatedly:
java.lang.NullPointerException
at java.io.StringReader.<init>(StringReader.java:50)
at com.google.gson.JsonParser.parse(JsonParser.java:45)
at com.ibm.watson.developer_cloud.speech_to_text.v1.websocket.WebSocketSpeechToTextClient$WebSocketListener.onTextMessage(WebSocketSpeechToTextClient.java:66)
at com.neovisionaries.ws.client.ListenerManager.callOnTextMessage(ListenerManager.java:352)
at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:233)
at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:211)
at com.neovisionaries.ws.client.ReadingThread.handleTextFrame(ReadingThread.java:910)
at com.neovisionaries.ws.client.ReadingThread.handleFrame(ReadingThread.java:693)
at com.neovisionaries.ws.client.ReadingThread.main(ReadingThread.java:102)
at com.neovisionaries.ws.client.ReadingThread.run(ReadingThread.java:61)
java.lang.NullPointerException
at java.io.StringReader.<init>(StringReader.java:50)
at com.google.gson.JsonParser.parse(JsonParser.java:45)
at com.ibm.watson.developer_cloud.speech_to_text.v1.websocket.WebSocketSpeechToTextClient$WebSocketListener.onTextMessage(WebSocketSpeechToTextClient.java:66)
at com.neovisionaries.ws.client.ListenerManager.callOnTextMessage(ListenerManager.java:352)
at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:233)
at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:211)
at com.neovisionaries.ws.client.ReadingThread.handleTextFrame(ReadingThread.java:910)
at com.neovisionaries.ws.client.ReadingThread.handleFrame(ReadingThread.java:693)
at com.neovisionaries.ws.client.ReadingThread.main(ReadingThread.java:102)
at com.neovisionaries.ws.client.ReadingThread.run(ReadingThread.java:61)
java.lang.NullPointerException
at java.io.StringReader.<init>(StringReader.java:50)
at com.google.gson.JsonParser.parse(JsonParser.java:45)
at com.ibm.watson.developer_cloud.speech_to_text.v1.websocket.WebSocketSpeechToTextClient$WebSocketListener.onTextMessage(WebSocketSpeechToTextClient.java:66)
at com.neovisionaries.ws.client.ListenerManager.callOnTextMessage(ListenerManager.java:352)
at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:233)
at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:211)
at com.neovisionaries.ws.client.ReadingThread.handleTextFrame(ReadingThread.java:910)
at com.neovisionaries.ws.client.ReadingThread.handleFrame(ReadingThread.java:693)
at com.neovisionaries.ws.client.ReadingThread.main(ReadingThread.java:102)
at com.neovisionaries.ws.client.ReadingThread.run(ReadingThread.java:61)
This seems to be related to this issue:
https://github.com/watson-developer-cloud/java-sdk/issues/205
It is targeted for milestone 2.9.0 (the current release is 2.8.0).
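A side note on the WebSocket snippet above, separate from the NullPointerException: `lock.await(5, TimeUnit.SECONDS)` returns a boolean, and the snippet ignores it, so a timeout looks the same as a finished transcription. Five seconds may also be too short for longer audio. Here is a minimal sketch of the wait-for-final pattern using only the JDK. `ResultListener` and `AsyncRecognizer` are hypothetical stand-ins I made up so the example runs on its own; they are not Watson SDK types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class FinalResultWaiter {

    /** Hypothetical stand-in for an SDK delegate; not part of the Watson SDK. */
    interface ResultListener {
        void onMessage(String transcript, boolean isFinal);
    }

    /** Hypothetical stand-in for any recognizer that reports results via callback. */
    interface AsyncRecognizer {
        void recognize(ResultListener listener);
    }

    /**
     * Starts recognition, records every non-null message, and blocks until a
     * final message arrives or the timeout elapses.
     * Returns the collected messages, or null if the wait timed out.
     */
    static List<String> collectUntilFinal(AsyncRecognizer recognizer, long timeout, TimeUnit unit)
            throws InterruptedException {
        List<String> transcripts = Collections.synchronizedList(new ArrayList<String>());
        CountDownLatch done = new CountDownLatch(1);

        recognizer.recognize((transcript, isFinal) -> {
            if (transcript != null) {
                transcripts.add(transcript);
            }
            if (isFinal) {
                done.countDown();
            }
        });

        // await() returns false on timeout; check it instead of discarding it.
        if (!done.await(timeout, unit)) {
            return null;
        }
        return new ArrayList<String>(transcripts);
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake recognizer that delivers one interim and one final message on another thread.
        AsyncRecognizer fake = listener -> new Thread(() -> {
            listener.onMessage("hello", false);
            listener.onMessage("hello world", true);
        }).start();

        System.out.println(collectUntilFinal(fake, 5, TimeUnit.SECONDS));
    }
}
```

With the real SDK the same idea would go inside the delegate's onMessage, and the caller would decide what to do when the wait times out instead of silently continuing.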