Watson STT Java - Varying results between Websockets Java and HTTP POST

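I am trying to build an application that takes streamed audio input (e.g. a line in from a microphone) and runs speech-to-text on it using IBM Bluemix (Watson).

I lightly modified the example Java code found here. That example sends a WAV file, but I am sending FLAC... which [should] be irrelevant.

The results are bad, very bad. This is what I get when using the Java Websockets code: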

{
  "result_index": 0,
  "results": [
    {
      "final": true,
      "alternatives": [
        {
          "transcript": "it was six weeks ago today the terror ",
          "confidence": 0.92
        }
      ]
    }
  ]
}

Now compare the result above with the one below. This is what comes back when I send the same audio using cURL (HTTP POST):

{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.945,
          "transcript": "it was six weeks ago today the terrorists attacked the U. S. consulate in Benghazi Libya now we've obtained email alerts that were put out by the state department as the attack unfolded as you know four Americans were killed including ambassador Christopher Stevens "
        }
      ],
      "final": true
    },
    {
      "alternatives": [
        {
          "confidence": 0.942,
          "transcript": "sharyl Attkisson has our story "
        }
      ],
      "final": true
    }
  ],
  "result_index": 0
}

That is a nearly perfect result.

Why is the result different when using Websockets?

This issue was fixed in the 3.0.0-RC1 release.

You can get the new jar from:

  1. Maven

    <dependency>
        <groupId>com.ibm.watson.developer_cloud</groupId>
        <artifactId>java-sdk</artifactId>
        <version>3.0.0-RC1</version>
    </dependency>
    
  2. Gradle

    'com.ibm.watson.developer_cloud:java-sdk:3.0.0-RC1'
    
  3. JAR

    Download the jar-with-dependencies (~1.4MB)


Here is an example of how to recognize a FLAC audio file using WebSockets:

SpeechToText service = new SpeechToText();
service.setUsernameAndPassword("<username>", "<password>");

FileInputStream audio = new FileInputStream("path-to-audio-file.flac");

RecognizeOptions options = new RecognizeOptions.Builder()
  .continuous(true)
  .interimResults(true)
  .contentType(HttpMediaType.AUDIO_FLAC)
  .build();

service.recognizeUsingWebSocket(audio, options, new BaseRecognizeCallback() {
  @Override
  public void onTranscription(SpeechResults speechResults) {
    System.out.println(speechResults);
  }
});

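With interimResults(true) set, the callback will most likely fire several times, with interim hypotheses as well as final ones, and a long recording can come back as more than one final segment (as in the cURL output in the question). Below is a minimal sketch of one way to stitch the final segments into a single transcript. `TranscriptCollector` and `Segment` are hypothetical names made up for this illustration, not SDK types; `Segment` just mirrors the "final" and "transcript" fields of one entry in the "results" array.

```java
import java.util.ArrayList;
import java.util.List;

/** Collects final transcript segments from a stream of recognition updates. */
public class TranscriptCollector {

  /** Hypothetical stand-in for one entry of the "results" array (not an SDK class). */
  public static final class Segment {
    final boolean isFinal;
    final String transcript;

    public Segment(boolean isFinal, String transcript) {
      this.isFinal = isFinal;
      this.transcript = transcript;
    }
  }

  private final List<String> finalSegments = new ArrayList<>();

  /** Call once per update; interim hypotheses are ignored. */
  public void accept(Segment segment) {
    if (segment.isFinal) {
      // The service pads transcripts with a trailing space, so trim before storing.
      finalSegments.add(segment.transcript.trim());
    }
  }

  /** Joins every final segment seen so far into one transcript. */
  public String fullTranscript() {
    return String.join(" ", finalSegments);
  }

  public static void main(String[] args) {
    TranscriptCollector collector = new TranscriptCollector();
    collector.accept(new Segment(false, "it was six weeks "));          // interim, skipped
    collector.accept(new Segment(true, "it was six weeks ago today ")); // final, kept
    collector.accept(new Segment(true, "sharyl Attkisson has our story "));
    System.out.println(collector.fullTranscript());
  }
}
```

In the callback above you would map each result to a `Segment` (or read the same two fields directly) and call `accept` instead of printing.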

FLAC file to test with: https://s3.amazonaws.com/mozart-company/tmp/4.flac


Note: 3.0.0-RC1 is a release candidate. We will publish the production release (3.0.1) next week.