Erratic behaviour of voice synthesis in WinRT

Question

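I have a universal app that uses voice synthesis. Running under WP8.1 it works fine, but when I try it on Win8.1 I start getting strange behaviour. The voice seems to speak only once; on the second run (within the same app session), the following code hangs: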

string toSay = "hello";
System.Diagnostics.Debug.WriteLine("{0}: Speak {1}", DateTime.Now, toSay);
using (SpeechSynthesizer synth = new SpeechSynthesizer())
{
    System.Diagnostics.Debug.WriteLine("{0}: After synthesizer instantiated", DateTime.Now);

    var voiceStream = await synth.SynthesizeTextToStreamAsync(toSay);

    System.Diagnostics.Debug.WriteLine("{0}: After voice stream", DateTime.Now);
}

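The debug statements are there because the code seems to obey an uncertainty principle: observing it changes its behaviour. When I step through it in the debugger, the code executes and gets past the SynthesizeTextToStreamAsync statement. When I remove the breakpoints, however, I only get the debug statement before that call, never the one after it.

The best I can deduce is that something bad happens during the first run (it does claim to complete, and it actually speaks the first time), after which it gets stuck and cannot play again. The full code looks similar to: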

string toSay = "hello";
System.Diagnostics.Debug.WriteLine("{0}: Speak {1}", DateTime.Now, toSay);
using (SpeechSynthesizer synth = new SpeechSynthesizer())
{
    System.Diagnostics.Debug.WriteLine("{0}: After synthesizer instantiated", DateTime.Now);

    var voiceStream = await synth.SynthesizeTextToStreamAsync(toSay);

    System.Diagnostics.Debug.WriteLine("{0}: After voice stream", DateTime.Now);

    MediaElement mediaElement;

    mediaElement = rootControl.Children.FirstOrDefault(a => a as MediaElement != null) as MediaElement;
    if (mediaElement == null)
    {
        mediaElement = new MediaElement();

        rootControl.Children.Add(mediaElement);
    }

    mediaElement.SetSource(voiceStream, voiceStream.ContentType);
    mediaElement.Volume = 1;
    mediaElement.IsMuted = false;

    var tcs = new TaskCompletionSource<bool>();                
    mediaElement.MediaEnded += (o, e) => { tcs.TrySetResult(true); };
    mediaElement.MediaFailed += (o, e) => { tcs.TrySetResult(true); };

    mediaElement.Play();                

    await tcs.Task;
}

Answer 1

I'm not sure which programming language you are using, but this may help. It is in C#, so it may point you in the right direction.

using System;
using System.IO;
using System.Speech.Recognition;
using System.Speech.Synthesis;
using System.Windows.Forms;

namespace Alexis
{
    public partial class frmMain : Form
    {
        SpeechRecognitionEngine _recognizer = new SpeechRecognitionEngine();
        SpeechSynthesizer Alexis = new SpeechSynthesizer();
        SpeechRecognitionEngine startlistening = new SpeechRecognitionEngine();
        DateTime timenow = DateTime.Now;
        // Assumed declaration: rnd is used below but not declared in the excerpt.
        Random rnd = new Random();

        // Other code, such as InitializeComponent, goes here.
        // AlarmAM, AlarmPM, Settings and the other handlers are not shown in this excerpt.

        private void frmMain_Load(object sender, EventArgs e)
        {
            // Main recognizer: load the commands from a text file and attach the handlers.
            _recognizer.SetInputToDefaultAudioDevice();
            _recognizer.LoadGrammarAsync(new Grammar(new GrammarBuilder(new Choices(File.ReadAllLines(@"Default Commands.txt")))));
            _recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(Shell_SpeechRecognized);
            _recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(Social_SpeechRecognized);
            _recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(Web_SpeechRecognized);
            _recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(Default_SpeechRecognized);
            _recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(AlarmClock_SpeechRecognized);
            _recognizer.LoadGrammarAsync(new Grammar(new GrammarBuilder(new Choices(AlarmAM))));
            _recognizer.LoadGrammarAsync(new Grammar(new GrammarBuilder(new Choices(AlarmPM))));
            _recognizer.SpeechDetected += new EventHandler<SpeechDetectedEventArgs>(_recognizer_SpeechDetected);
            _recognizer.RecognizeAsync(RecognizeMode.Multiple);

            // Second recognizer: listens only for the wake word.
            startlistening.SetInputToDefaultAudioDevice();
            startlistening.LoadGrammarAsync(new Grammar(new GrammarBuilder(new Choices("alexis"))));
            startlistening.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(startlistening_SpeechRecognized);

            // Other stuff here.
        }

        // Once you have this, you can write a handler method with your code, as follows.

        private void Default_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
        {
            int ranNum;
            string speech = e.Result.Text;
            switch (speech)
            {
                #region Greetings
                case "hello":
                case "hello alexis":
                    timenow = DateTime.Now;
                    if (timenow.Hour >= 5 && timenow.Hour < 12)
                    { Alexis.SpeakAsync("Good morning " + Settings.Default.User); }
                    if (timenow.Hour >= 12 && timenow.Hour < 18)
                    { Alexis.SpeakAsync("Good afternoon " + Settings.Default.User); }
                    if (timenow.Hour >= 18 && timenow.Hour < 24)
                    { Alexis.SpeakAsync("Good evening " + Settings.Default.User); }
                    if (timenow.Hour < 5)
                    { Alexis.SpeakAsync("Hello " + Settings.Default.User + ", it's getting late"); }
                    break;

                case "whats my name":
                case "what is my name":
                    Alexis.SpeakAsync(Settings.Default.User);
                    break;
                #endregion

                case "stop talking":
                case "quit talking":
                    Alexis.SpeakAsyncCancelAll();
                    // The upper bound of Random.Next is exclusive, so (1, 3) yields 1 or 2.
                    ranNum = rnd.Next(1, 3);
                    if (ranNum == 2)
                    { Alexis.Speak("sorry " + Settings.Default.User); }
                    break;
            }
        }
    }
}

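Instead of hard-coding the commands in the code, I suggest you use a text document. Once you have it, you can add your own commands to it and then handle them in the code. You also need to add a reference to System.Speech.

Hope this helps you get on the right track.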

Answer 2

OK - I think I've managed to get this working... although I'm not sure why.

using (SpeechSynthesizer synth = new SpeechSynthesizer())
{
    var voiceStream = await synth.SynthesizeTextToStreamAsync(toSay);
    MediaElement mediaElement; 
    mediaElement = rootControl.Children.FirstOrDefault(a => a as MediaElement != null) as MediaElement;
    if (mediaElement == null)
    {
        mediaElement = new MediaElement();

        rootControl.Children.Add(mediaElement);
    }

    mediaElement.SetSource(voiceStream, voiceStream.ContentType);
    mediaElement.Volume = 1;
    mediaElement.IsMuted = false;

    var tcs = new TaskCompletionSource<bool>();                
    mediaElement.MediaEnded += (o, e) => { tcs.TrySetResult(true); };               

    mediaElement.Play();                

    await tcs.Task;

    // Removing the control seems to free up whatever is locking 
    rootControl.Children.Remove(mediaElement);

}
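
The wait-for-playback step above can be tried in isolation from the UI. The following is a minimal sketch, not the code from this answer: `FakePlayer` is a hypothetical stand-in for MediaElement so that the example runs as a plain console program, and `WaitForPlaybackAsync` is an illustrative helper name. It completes the task on either the ended or the failed event (the answer's code listens for MediaEnded only, so a failed playback would leave it waiting), and it unsubscribes both handlers afterwards so that a reused player does not accumulate them.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for MediaElement; only the two events matter here.
class FakePlayer
{
    public event EventHandler MediaEnded;
    public event EventHandler MediaFailed;

    // Number of handlers currently attached to MediaEnded.
    public int EndedHandlerCount => MediaEnded?.GetInvocationList().Length ?? 0;

    // Simulates playback finishing (or failing) a little later on another thread.
    public void Play(bool fail = false)
    {
        Task.Run(async () =>
        {
            await Task.Delay(50);
            if (fail) MediaFailed?.Invoke(this, EventArgs.Empty);
            else MediaEnded?.Invoke(this, EventArgs.Empty);
        });
    }
}

static class Program
{
    // Returns true if playback ended normally, false if it failed.
    static async Task<bool> WaitForPlaybackAsync(FakePlayer player, bool fail = false)
    {
        var tcs = new TaskCompletionSource<bool>();
        EventHandler onEnded = (o, e) => tcs.TrySetResult(true);
        EventHandler onFailed = (o, e) => tcs.TrySetResult(false);

        // Subscribe before starting playback so no event can be missed.
        player.MediaEnded += onEnded;
        player.MediaFailed += onFailed;
        try
        {
            player.Play(fail);
            return await tcs.Task;
        }
        finally
        {
            // Detach the handlers so repeated calls do not pile them up.
            player.MediaEnded -= onEnded;
            player.MediaFailed -= onFailed;
        }
    }

    static async Task Main()
    {
        var player = new FakePlayer();
        Console.WriteLine(await WaitForPlaybackAsync(player));             // True
        Console.WriteLine(await WaitForPlaybackAsync(player, fail: true)); // False
        Console.WriteLine(player.EndedHandlerCount);                       // 0
    }
}
```

With the real MediaElement the handler delegate types differ (MediaEnded uses RoutedEventHandler and MediaFailed uses ExceptionRoutedEventHandler), so the two handler variables would need those types. This sketch does not explain why removing the control unblocks the synthesizer.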

c#

voice

windows-runtime

windows-phone-8.1

win-universal-app