Azure Cognitive Speech Services STT - Partial Text
In my code (below), when I run the audio through STT, it only returns the first letter/word of the whole recording.
The audio contains "A B C D E F".
What am I missing?
Imports Microsoft.CognitiveServices.Speech
Imports Microsoft.CognitiveServices.Speech.SpeechConfig
Imports Microsoft.CognitiveServices.Speech.Audio

Module Module1
    Sub Main()
        Dim SpeechConfig As SpeechConfig = FromSubscription("<CHANGED>", "eastus")
        Dim audioConfig As Audio.AudioConfig = Audio.AudioConfig.FromWavFileInput("<CHANGED>.wav")
        SpeechConfig.OutputFormat = Microsoft.CognitiveServices.Speech.OutputFormat.Detailed
        Dim recognizer As New SpeechRecognizer(SpeechConfig, audioConfig)
        Dim result = recognizer.RecognizeOnceAsync().Result
        Select Case result.Reason
            Case ResultReason.RecognizedSpeech
                Console.WriteLine($"RECOGNIZED: Text={result.Text}")
                Console.WriteLine($" Intent not recognized.")
            Case ResultReason.NoMatch
                Console.WriteLine($"NOMATCH: Speech could not be recognized.")
            Case ResultReason.Canceled
                Dim cancellation = CancellationDetails.FromResult(result)
                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}")
                If cancellation.Reason = CancellationReason.[Error] Then
                    Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}")
                    Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}")
                    Console.WriteLine($"CANCELED: Did you update the subscription info?")
                End If
        End Select
    End Sub
End Module
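You can download the audio file from GitHub:
https://github.com/ullfindsmit/WhosebugAssets/blob/master/abcdef.wav
Also, if you know where I can get more detailed STT data, I would appreciate it.
What I am looking for is JSON output that contains the start and end times along with the words and/or sentences.
Any help is much appreciated.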
Update
The async event handlers did not work for me for some reason.
However, the code below did:
While True
    Dim result = recognizer.RecognizeOnceAsync().Result
    Select Case result.Reason
        Case ResultReason.RecognizedSpeech
            Console.WriteLine($"RECOGNIZED: Text={result.Text}")
            Console.WriteLine($" Intent not recognized.")
        Case ResultReason.NoMatch
            Console.WriteLine($"NOMATCH: Speech could not be recognized.")
        Case ResultReason.Canceled
            Dim cancellation = CancellationDetails.FromResult(result)
            Console.WriteLine($"CANCELED: Reason={cancellation.Reason}")
            If cancellation.Reason = CancellationReason.[Error] Then
                Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}")
                Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}")
                Console.WriteLine($"CANCELED: Did you update the subscription info?")
            End If
            Exit While
    End Select
End While
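The RecognizeOnceAsync method only recognizes "once": it returns the first utterance/phrase contained in the audio data. If you want to recognize more than one phrase, you can do one of two things:
1. Call RecognizeOnceAsync repeatedly. After the last phrase has been recognized, the next call to the method returns a result with result.Reason set to Canceled.
2. Switch from RecognizeOnceAsync to StartContinuousRecognitionAsync and hook event handlers to the Recognizing event (interim results) or the Recognized event (the final result for each phrase). The event callback lets you see the results by inspecting the SpeechRecognitionEventArgs passed in, like this: e.Result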
...
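The continuous-recognition option above could look roughly like the following in VB.NET. This is only a sketch, not tested against the asker's file: "YOUR-KEY", "YOUR-REGION" and the local path "abcdef.wav" are placeholders, and the use of a TaskCompletionSource to wait for the end of the file is my own choice. It also calls RequestWordLevelTimestamps and prints the raw JSON for each phrase, which should go some way toward the start/end time data asked about in the question.

```vbnet
Imports System
Imports System.Threading.Tasks
Imports Microsoft.CognitiveServices.Speech
Imports Microsoft.CognitiveServices.Speech.Audio

Module ContinuousSketch
    Sub Main()
        ' Placeholders: substitute your own key, region and wav file.
        Dim config = SpeechConfig.FromSubscription("YOUR-KEY", "YOUR-REGION")
        config.OutputFormat = OutputFormat.Detailed
        ' Ask the service to include per-word offsets and durations in the JSON.
        config.RequestWordLevelTimestamps()

        Using audioInput = AudioConfig.FromWavFileInput("abcdef.wav")
            Using recognizer As New SpeechRecognizer(config, audioInput)
                ' Completed once the file has been fully processed or an error occurs.
                Dim done As New TaskCompletionSource(Of Integer)()

                ' Recognized fires once per finished phrase.
                AddHandler recognizer.Recognized,
                    Sub(sender, e)
                        If e.Result.Reason = ResultReason.RecognizedSpeech Then
                            Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}")
                            ' Raw service response as a JSON string.
                            Console.WriteLine(e.Result.Properties.GetProperty(
                                PropertyId.SpeechServiceResponse_JsonResult))
                        End If
                    End Sub

                ' Canceled fires at end of stream as well as on errors.
                AddHandler recognizer.Canceled,
                    Sub(sender, e)
                        Console.WriteLine($"CANCELED: Reason={e.Reason}")
                        If e.Reason = CancellationReason.[Error] Then
                            Console.WriteLine($"CANCELED: ErrorDetails={e.ErrorDetails}")
                        End If
                        done.TrySetResult(0)
                    End Sub

                AddHandler recognizer.SessionStopped,
                    Sub(sender, e) done.TrySetResult(0)

                recognizer.StartContinuousRecognitionAsync().Wait()
                done.Task.Wait()
                recognizer.StopContinuousRecognitionAsync().Wait()
            End Using
        End Using
    End Sub
End Module
```

As far as I know, the offsets and durations in that JSON are expressed in 100-nanosecond units, so divide by 10,000,000 to get seconds.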
You can see both behaviors by running the Speech CLI, as follows:
spx recognize --once+ --key YOUR-KEY --region YOUR-REGION --file "https://github.com/ullfindsmit/WhosebugAssets/blob/master/abcdef.wav"
spx recognize --continuous --key YOUR-KEY --region YOUR-REGION --file "https://github.com/ullfindsmit/WhosebugAssets/blob/master/abcdef.wav"
You can download the Speech CLI here: https://aka.ms/speech/spx-zips.zip