Watson Speech to Text live stream C# code example

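I'm trying to build an application in C# that takes an audio stream (from a file for now, but later a network stream) and returns real-time transcriptions from Watson as they become available, similar to the demo at https://speech-to-text-demo.mybluemix.net/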

Does anyone know where I can find some sample code (preferably in C#) to help me get started?

I tried the following based on the limited documentation at https://github.com/watson-developer-cloud/dotnet-standard-sdk/tree/development/src/IBM.WatsonDeveloperCloud.SpeechToText.v1, but I get a BadRequest error when calling RecognizeWithSession. I'm not sure whether I'm on the right track.

    static void StreamingRecognize(string filePath)
    {
        SpeechToTextService _speechToText = new SpeechToTextService();
        _speechToText.SetCredential(<user>, <pw>);
        var session = _speechToText.CreateSession("en-US_BroadbandModel");

        //returns initialized
        var recognizeStatus = _speechToText.GetSessionStatus(session.SessionId);

        //  set up observe
        var taskObserveResult = Task.Factory.StartNew(() =>
        {
            var result = _speechToText.ObserveResult(session.SessionId);
            return result;
        });

        //  get results
        taskObserveResult.ContinueWith((antecedent) =>
        {
            var results = antecedent.Result;
        });

        var metadata = new Metadata();
        metadata.PartContentType = "audio/wav";
        metadata.DataPartsCount = 1;
        metadata.Continuous = true;
        metadata.InactivityTimeout = -1;
        var taskRecognizeWithSession = Task.Factory.StartNew(() =>
        {
            using (FileStream fs = File.OpenRead(filePath))
            {
                _speechToText.RecognizeWithSession(session.SessionId, "audio/wav", metadata, fs, "chunked");
            }
        });
    }

The Watson Developer Cloud SDK for your programming language includes a folder named Examples, where you can find samples that use Speech to Text.

The SDK supports WebSockets, which fits your requirement of transcribing in near real time rather than uploading a complete audio file.

        static void Main(string[] args)
        {
            Transcribe();
            Console.WriteLine("Press any key to exit");
            Console.ReadLine();
        }

        // http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/gs-credentials.shtml
        static String username = "<username>";
        static String password = "<password>";

        static String file = @"c:\audio.wav";

        static Uri url = new Uri("wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize");
        
        // these should probably be private classes that use DataContractJsonSerializer 
        // see https://msdn.microsoft.com/en-us/library/bb412179%28v=vs.110%29.aspx
        // or the ServiceState class at the end
        static ArraySegment<byte> openingMessage = new ArraySegment<byte>( Encoding.UTF8.GetBytes(
            "{\"action\": \"start\", \"content-type\": \"audio/wav\", \"continuous\" : true, \"interim_results\": true}"
        ));
        static ArraySegment<byte> closingMessage = new ArraySegment<byte>(Encoding.UTF8.GetBytes(
            "{\"action\": \"stop\"}"
        ));
        // ... more in the link below
  • Access the C# SDK here
  • See the API reference for details here
  • A complete example from IBM Watson Developer using Speech to Text here
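The snippet above stops after the opening and closing messages. As a rough sketch of how the rest of the WebSocket exchange might look (this is not the code from the linked sample), the class below sends a start message, streams the audio as binary frames, sends stop, and prints transcripts as they arrive. Everything named here (WatsonStreamingSketch, TranscribeAsync, IsListening, ExtractTranscript) is hypothetical. It assumes .NET Core 3.0 or later for System.Text.Json, basic-auth credentials like the username/password fields above, and that the service sends {"state": "listening"} once when it is ready and once more after the final results.

```csharp
using System;
using System.IO;
using System.Net.WebSockets;
using System.Text;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

public static class WatsonStreamingSketch
{
    // True for state messages such as {"state": "listening"}.
    public static bool IsListening(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement root = doc.RootElement;
            return root.ValueKind == JsonValueKind.Object
                && root.TryGetProperty("state", out JsonElement state)
                && state.ValueKind == JsonValueKind.String
                && state.GetString() == "listening";
        }
    }

    // Returns the first transcript in a result message, or null if the message has no results.
    public static string ExtractTranscript(string json, out bool isFinal)
    {
        isFinal = false;
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object
                || !root.TryGetProperty("results", out JsonElement results)
                || results.ValueKind != JsonValueKind.Array
                || results.GetArrayLength() == 0)
            {
                return null;
            }
            JsonElement first = results[0];
            isFinal = first.TryGetProperty("final", out JsonElement f) && f.ValueKind == JsonValueKind.True;
            return first.GetProperty("alternatives")[0].GetProperty("transcript").GetString();
        }
    }

    static Task SendTextAsync(ClientWebSocket ws, string json) =>
        ws.SendAsync(new ArraySegment<byte>(Encoding.UTF8.GetBytes(json)),
                     WebSocketMessageType.Text, true, CancellationToken.None);

    // Reads messages until the second "listening" state (assumed to mean "all results sent").
    static async Task ReceiveLoopAsync(ClientWebSocket ws)
    {
        var buffer = new byte[8192];
        var message = new MemoryStream();
        int listeningCount = 0;
        while (ws.State == WebSocketState.Open && listeningCount < 2)
        {
            WebSocketReceiveResult r = await ws.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
            if (r.MessageType == WebSocketMessageType.Close) break;
            message.Write(buffer, 0, r.Count);
            if (!r.EndOfMessage) continue;            // wait for the rest of a fragmented message

            string json = Encoding.UTF8.GetString(message.ToArray());
            message.SetLength(0);
            if (IsListening(json)) { listeningCount++; continue; }
            string text = ExtractTranscript(json, out bool isFinal);
            if (text != null) Console.WriteLine((isFinal ? "final: " : "interim: ") + text);
        }
    }

    public static async Task TranscribeAsync(Uri url, string username, string password, Stream audio)
    {
        using (var ws = new ClientWebSocket())
        {
            // Assumption: the service accepts HTTP basic auth on the WebSocket handshake.
            string token = Convert.ToBase64String(Encoding.UTF8.GetBytes(username + ":" + password));
            ws.Options.SetRequestHeader("Authorization", "Basic " + token);
            await ws.ConnectAsync(url, CancellationToken.None);

            await SendTextAsync(ws, "{\"action\": \"start\", \"content-type\": \"audio/wav\", \"interim_results\": true}");
            Task receiver = ReceiveLoopAsync(ws);     // print results while audio is still being sent

            var chunk = new byte[8192];
            int read;
            while ((read = await audio.ReadAsync(chunk, 0, chunk.Length)) > 0)
            {
                await ws.SendAsync(new ArraySegment<byte>(chunk, 0, read),
                                   WebSocketMessageType.Binary, true, CancellationToken.None);
            }

            await SendTextAsync(ws, "{\"action\": \"stop\"}");
            await receiver;
        }
    }
}
```

With the fields from the answer, a call could look like `WatsonStreamingSketch.TranscribeAsync(url, username, password, File.OpenRead(file)).Wait();`. Because TranscribeAsync takes a plain Stream, the same method should work unchanged once the file is swapped for a network stream.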