Erratic behaviour of voice synthesis in WinRT

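I have a universal app that uses voice synthesis. Running under WP8.1 it works fine, but when I tried it on Win8.1 I started getting strange behaviour. The voice only seems to speak once; on the second run (within the same app session), the following code hangs: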

string toSay = "hello";
System.Diagnostics.Debug.WriteLine("{0}: Speak {1}", DateTime.Now, toSay);
using (SpeechSynthesizer synth = new SpeechSynthesizer())
{
    System.Diagnostics.Debug.WriteLine("{0}: After synthesizer instantiated", DateTime.Now);

    var voiceStream = await synth.SynthesizeTextToStreamAsync(toSay);

    System.Diagnostics.Debug.WriteLine("{0}: After voice stream", DateTime.Now);
}

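The debug statements are there because the code seems to obey a kind of uncertainty principle. That is, when I step through it in the debugger, the code executes and gets past the SynthesizeTextToStreamAsync statement. However, when I remove the breakpoints, I only get the debug statement before it, not the one after.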

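The best I can deduce is that something bad happens during the first run (it does claim to complete, and it actually speaks the first time), after which it gets stuck and cannot play any more. The full code looks something like this: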

string toSay = "hello";
System.Diagnostics.Debug.WriteLine("{0}: Speak {1}", DateTime.Now, toSay);
using (SpeechSynthesizer synth = new SpeechSynthesizer())
{
    System.Diagnostics.Debug.WriteLine("{0}: After synthesizer instantiated", DateTime.Now);

    var voiceStream = await synth.SynthesizeTextToStreamAsync(toSay);

    System.Diagnostics.Debug.WriteLine("{0}: After voice stream", DateTime.Now);

    MediaElement mediaElement;

    mediaElement = rootControl.Children.FirstOrDefault(a => a as MediaElement != null) as MediaElement;
    if (mediaElement == null)
    {
        mediaElement = new MediaElement();

        rootControl.Children.Add(mediaElement);
    }

    mediaElement.SetSource(voiceStream, voiceStream.ContentType);
    mediaElement.Volume = 1;
    mediaElement.IsMuted = false;

    var tcs = new TaskCompletionSource<bool>();                
    mediaElement.MediaEnded += (o, e) => { tcs.TrySetResult(true); };
    mediaElement.MediaFailed += (o, e) => { tcs.TrySetResult(true); };

    mediaElement.Play();                

    await tcs.Task;
}

I'm not sure which programming language you are using, but this might help. It is in C#, so it may point you in the right direction.

namespace Alexis
{
    public partial class frmMain : Form
    {

        SpeechRecognitionEngine _recognizer = new SpeechRecognitionEngine();
        SpeechSynthesizer Alexis = new SpeechSynthesizer();
        SpeechRecognitionEngine startlistening = new SpeechRecognitionEngine();
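        DateTime timenow = DateTime.Now;
        Random rnd = new Random(); // assumed field: rnd is used further down but is not declared in the original snippet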
    }
  
  
  // Other code, such as InitializeComponent, goes here.
  
       private void frmMain_Load(object sender, EventArgs e)
        {

            _recognizer.SetInputToDefaultAudioDevice();
            _recognizer.LoadGrammarAsync(new Grammar(new GrammarBuilder(new Choices(File.ReadAllLines(@"Default Commands.txt")))));
            _recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(Shell_SpeechRecognized);
            _recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(Social_SpeechRecognized);
            _recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(Web_SpeechRecognized);
            _recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(Default_SpeechRecognized);
            _recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(AlarmClock_SpeechRecognized);
            _recognizer.LoadGrammarAsync(new Grammar(new GrammarBuilder(new Choices(AlarmAM))));
            _recognizer.LoadGrammarAsync(new Grammar(new GrammarBuilder(new Choices(AlarmPM))));
            _recognizer.SpeechDetected += new EventHandler<SpeechDetectedEventArgs>(_recognizer_SpeechDetected);
            _recognizer.RecognizeAsync(RecognizeMode.Multiple);

            startlistening.SetInputToDefaultAudioDevice();
            startlistening.LoadGrammarAsync(new Grammar(new GrammarBuilder(new Choices("alexis"))));
            startlistening.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(startlistening_SpeechRecognized);

          
          // Other setup goes here. Once this is in place, you can add a handler method like the following:
          
          
          
           private void Default_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
              {
                  int ranNum;
                  string speech = e.Result.Text;
                  switch (speech)
                  {
                      #region Greetings
                      case "hello":
                      case "hello alexis":
                          timenow = DateTime.Now;
                          if (timenow.Hour >= 5 && timenow.Hour < 12)
                          { Alexis.SpeakAsync("Good morning " + Settings.Default.User); }
                          if (timenow.Hour >= 12 && timenow.Hour < 18)
                          { Alexis.SpeakAsync("Good afternoon " + Settings.Default.User); }
                          if (timenow.Hour >= 18 && timenow.Hour < 24)
                          { Alexis.SpeakAsync("Good evening " + Settings.Default.User); }
                          if (timenow.Hour < 5)
                          { Alexis.SpeakAsync("Hello " + Settings.Default.User + ", it's getting late"); }
                          break;

                      case "whats my name":
                      case "what is my name":
                          Alexis.SpeakAsync(Settings.Default.User);
                          break;

                      case "stop talking":
                      case "quit talking":
                          Alexis.SpeakAsyncCancelAll();
                          ranNum = rnd.Next(1, 3); // upper bound is exclusive, so (1, 2) would always return 1
                          if (ranNum == 2)
                          { Alexis.Speak("sorry " + Settings.Default.User); }
                          break;
                    }
                }
          
          

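Rather than hard-coding the commands in the code, I suggest you keep them in a text document. Once you have that, you can add your own commands to it and load it from the code. You will also need a reference to System.Speech.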

Hope this helps get you on the right track.

OK - I think I've managed to get this working... although I'm not sure why.

using (SpeechSynthesizer synth = new SpeechSynthesizer())
{
    var voiceStream = await synth.SynthesizeTextToStreamAsync(toSay);
    MediaElement mediaElement; 
    mediaElement = rootControl.Children.FirstOrDefault(a => a as MediaElement != null) as MediaElement;
    if (mediaElement == null)
    {
        mediaElement = new MediaElement();

        rootControl.Children.Add(mediaElement);
    }

    mediaElement.SetSource(voiceStream, voiceStream.ContentType);
    mediaElement.Volume = 1;
    mediaElement.IsMuted = false;

    var tcs = new TaskCompletionSource<bool>();                
    mediaElement.MediaEnded += (o, e) => { tcs.TrySetResult(true); };               

    mediaElement.Play();                

    await tcs.Task;

    // Removing the control seems to free up whatever is locking 
    rootControl.Children.Remove(mediaElement);

}
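
The workaround above cleans up after playback by removing the control. A related detail in the question's code is that the MediaElement is reused, so a new MediaEnded handler is attached on every call. The following is only a minimal sketch of one way to make the cleanup unconditional: wait for the event through a TaskCompletionSource and detach everything in a finally block. It is not a confirmed explanation of the hang. FakePlayer, Playback and Attached are hypothetical stand-ins (for MediaElement and rootControl.Children) so that the sketch runs as a plain console program outside WinRT.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for MediaElement so the sketch runs outside WinRT.
class FakePlayer
{
    public event EventHandler MediaEnded;
    public event EventHandler MediaFailed;

    // Simulates playback that finishes shortly after Play() is called.
    public void Play()
    {
        Task.Run(async () =>
        {
            await Task.Delay(50);
            MediaEnded?.Invoke(this, EventArgs.Empty);
        });
    }
}

static class Playback
{
    // Counts attached players; stands in for rootControl.Children (assumption).
    public static int Attached;

    // Completes with true when playback ends, false when it fails.
    public static async Task<bool> PlayToEndAsync(FakePlayer player)
    {
        var tcs = new TaskCompletionSource<bool>();
        EventHandler ended = (o, e) => tcs.TrySetResult(true);
        EventHandler failed = (o, e) => tcs.TrySetResult(false);

        Attached++;
        player.MediaEnded += ended;
        player.MediaFailed += failed;
        try
        {
            player.Play();
            return await tcs.Task;
        }
        finally
        {
            // Always detach the handlers and the player, even if playback
            // fails, so that a second run starts from a clean state.
            player.MediaEnded -= ended;
            player.MediaFailed -= failed;
            Attached--;
        }
    }
}

class Program
{
    static void Main()
    {
        var player = new FakePlayer();
        for (int i = 1; i <= 2; i++)
        {
            bool ok = Playback.PlayToEndAsync(player).GetAwaiter().GetResult();
            Console.WriteLine("Run {0}: completed={1}, attached={2}", i, ok, Playback.Attached);
        }
    }
}
```

In the real app, the finally block is where rootControl.Children.Remove(mediaElement) would go, and the method would be awaited from the UI thread rather than blocked on with GetResult().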