Watson Speech to Text live stream C# code example
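I'm trying to build an application in C# that takes an audio stream (from a file for now, but later a network stream) and returns real-time transcriptions from Watson as they become available, similar to the demo at https://speech-to-text-demo.mybluemix.net/
Does anyone know where I can find some sample code, preferably in C#, to help me get started?
I tried the following, based on the limited documentation at https://github.com/watson-developer-cloud/dotnet-standard-sdk/tree/development/src/IBM.WatsonDeveloperCloud.SpeechToText.v1, but I get a BadRequest error when calling RecognizeWithSession. I'm not sure whether I'm on the right path.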
static void StreamingRecognize(string filePath)
{
    SpeechToTextService _speechToText = new SpeechToTextService();
    _speechToText.SetCredential(<user>, <pw>);
    var session = _speechToText.CreateSession("en-US_BroadbandModel");

    // returns initialized
    var recognizeStatus = _speechToText.GetSessionStatus(session.SessionId);

    // set up observe
    var taskObserveResult = Task.Factory.StartNew(() =>
    {
        var result = _speechToText.ObserveResult(session.SessionId);
        return result;
    });

    // get results
    taskObserveResult.ContinueWith((antecedent) =>
    {
        var results = antecedent.Result;
    });

    var metadata = new Metadata();
    metadata.PartContentType = "audio/wav";
    metadata.DataPartsCount = 1;
    metadata.Continuous = true;
    metadata.InactivityTimeout = -1;

    var taskRecognizeWithSession = Task.Factory.StartNew(() =>
    {
        using (FileStream fs = File.OpenRead(filePath))
        {
            _speechToText.RecognizeWithSession(session.SessionId, "audio/wav", metadata, fs, "chunked");
        }
    });
}
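The Watson Developer Cloud SDK for each programming language includes a folder named Examples, where you can find examples that use Speech to Text.
The SDK supports WebSockets, which should meet your requirement of transcribing closer to real time instead of uploading an audio file.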
static void Main(string[] args)
{
    Transcribe();
    Console.WriteLine("Press any key to exit");
    Console.ReadLine();
}

// http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/gs-credentials.shtml
static String username = "<username>";
static String password = "<password>";
static String file = @"c:\audio.wav";
static Uri url = new Uri("wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize");

// these should probably be private classes that use DataContractJsonSerializer
// see https://msdn.microsoft.com/en-us/library/bb412179%28v=vs.110%29.aspx
// or the ServiceState class at the end
static ArraySegment<byte> openingMessage = new ArraySegment<byte>(Encoding.UTF8.GetBytes(
    "{\"action\": \"start\", \"content-type\": \"audio/wav\", \"continuous\" : true, \"interim_results\": true}"
));
static ArraySegment<byte> closingMessage = new ArraySegment<byte>(Encoding.UTF8.GetBytes(
    "{\"action\": \"stop\"}"
));
// ... more in the link below
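Since the fragment above stops before Transcribe() itself, here is a rough sketch of how the pieces might fit together using System.Net.WebSockets.ClientWebSocket. It is not the SDK's API and not necessarily the code behind the link. The endpoint, the basic-auth credentials, and the start/stop messages are taken from the fragment above. The class name WatsonStreamingSketch, the helpers TextFrame, ReadChunks and IsListening, and the 8192-byte chunk size are made up for illustration. It also assumes the service sends a {"state": "listening"} message once after the start message and once more after the final results, and uses the second one as the signal to stop reading; check that against the current service documentation before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

static class WatsonStreamingSketch
{
    // Endpoint copied from the answer above; newer service instances may use a different URL.
    static readonly Uri Url = new Uri("wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize");

    // Wraps a JSON control message as the payload of a UTF-8 text frame (hypothetical helper).
    public static ArraySegment<byte> TextFrame(string json)
    {
        return new ArraySegment<byte>(Encoding.UTF8.GetBytes(json));
    }

    // Reads the audio in fixed-size chunks so it can be sent while it is still arriving (hypothetical helper).
    public static IEnumerable<ArraySegment<byte>> ReadChunks(Stream audio, int chunkSize)
    {
        var buffer = new byte[chunkSize];
        int read;
        while ((read = audio.Read(buffer, 0, buffer.Length)) > 0)
        {
            var chunk = new byte[read];          // copy, so each chunk owns its bytes
            Array.Copy(buffer, chunk, read);
            yield return new ArraySegment<byte>(chunk);
        }
    }

    // Assumption: the service reports readiness and completion with a "listening" state message.
    public static bool IsListening(string message)
    {
        return message.Contains("\"state\"") && message.Contains("listening");
    }

    public static async Task TranscribeAsync(string username, string password, Stream audio)
    {
        using (var ws = new ClientWebSocket())
        {
            ws.Options.Credentials = new NetworkCredential(username, password);
            await ws.ConnectAsync(Url, CancellationToken.None);

            // 1. Describe the audio that is about to be sent (same message as in the answer).
            await ws.SendAsync(TextFrame("{\"action\": \"start\", \"content-type\": \"audio/wav\", \"continuous\" : true, \"interim_results\": true}"),
                WebSocketMessageType.Text, true, CancellationToken.None);

            // 2. Print results in the background while the audio is being sent.
            Task receiving = PrintMessagesAsync(ws);
            foreach (var chunk in ReadChunks(audio, 8192))
            {
                await ws.SendAsync(chunk, WebSocketMessageType.Binary, true, CancellationToken.None);
            }

            // 3. Signal the end of the audio, then wait for the remaining results.
            await ws.SendAsync(TextFrame("{\"action\": \"stop\"}"), WebSocketMessageType.Text, true, CancellationToken.None);
            await receiving;

            if (ws.State == WebSocketState.Open || ws.State == WebSocketState.CloseReceived)
            {
                await ws.CloseAsync(WebSocketCloseStatus.NormalClosure, "done", CancellationToken.None);
            }
        }
    }

    // Prints every JSON message until the second "listening" state or until the server closes.
    static async Task PrintMessagesAsync(ClientWebSocket ws)
    {
        var buffer = new byte[8192];
        int listeningSeen = 0;
        while (listeningSeen < 2 && ws.State == WebSocketState.Open)
        {
            using (var bytes = new MemoryStream())
            {
                WebSocketReceiveResult result;
                do
                {
                    result = await ws.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
                    if (result.MessageType == WebSocketMessageType.Close) return;
                    bytes.Write(buffer, 0, result.Count);
                } while (!result.EndOfMessage);

                string message = Encoding.UTF8.GetString(bytes.ToArray());
                Console.WriteLine(message);
                if (IsListening(message)) listeningSeen++;
            }
        }
    }
}
```

From a console Main like the one above, this could be called as WatsonStreamingSketch.TranscribeAsync(username, password, File.OpenRead(file)).Wait(). Because ReadChunks only needs a Stream, the same call should also work later with a network stream in place of the file.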