Does the MS Speech Platform 11 Recognizer support ARPA compiled grammars?
How do I use an ARPA file with MS Speech? The documentation for the Microsoft Speech Platform 11 Recognizer implies that you can compile a grammar from an ARPA file.
I was able to compile an ARPA file (for example, the small example provided by Microsoft) using the following command line:
CompileGrammar.exe -In stock.arpa -InFormat ARPA
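For reference, an ARPA file is a plain-text n-gram language model. The sketch below shows only the general shape of the format (log10 probability, n-gram, optional back-off weight); it is a hypothetical fragment, not the actual contents of Microsoft's stock.arpa:

```
\data\
ngram 1=6
ngram 2=5

\1-grams:
-1.0000 <s>	-0.3010
-1.0000 </s>
-1.0000 will	-0.3010
-1.0000 stock	-0.3010
-1.0000 go	-0.3010
-1.0000 up	-0.3010

\2-grams:
-0.3010 <s> will
-0.3010 will stock
-0.3010 stock go
-0.3010 go up
-0.3010 up </s>

\end\
```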
I was able to use the resulting CFG file in the following test:
using Microsoft.Speech.Recognition;
// ...
using (var engine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
{
engine.LoadGrammar(new Grammar("stock.cfg"));
var result = engine.EmulateRecognize("will stock go up");
Assert.That(result, Is.Not.Null);
}
This test passes, but note that it uses EmulateRecognize(). When I switch to using an actual audio file, like this:
using (var engine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
{
engine.LoadGrammar(new Grammar("stock.cfg"));
engine.SetInputToWaveFile("go-up.wav");
var result = engine.Recognize();
}
the result is always null and the test fails.
Microsoft states quite clearly that this is supported, but even very simple examples don't seem to work. What am I doing wrong?
To your question:
Does the MS Speech Platform 11 Recognizer support ARPA compiled grammars?
the answer is yes.
Here is working code from my side; I changed only three things: the Culture, the Grammar, and the WaveFile. I don't know your full code, but based on my test of the demo code, I would guess the root cause is that you need to handle the SpeechRecognized event, which you may not have done on your side.
static bool completed;
static void Main(string[] args)
{
// Initialize an in-process speech recognition engine.
using (SpeechRecognitionEngine recognizer =
new SpeechRecognitionEngine(new CultureInfo("en-us")))
{
// Create and load a grammar.
Grammar dictation = new Grammar("stock.cfg");
dictation.Name = "Dictation Grammar";
recognizer.LoadGrammar(dictation);
// Configure the input to the recognizer.
recognizer.SetInputToWaveFile("test.wav");
// Attach event handlers for the results of recognition.
recognizer.SpeechRecognized +=
new EventHandler<SpeechRecognizedEventArgs>(recognizer_SpeechRecognized);
recognizer.RecognizeCompleted +=
new EventHandler<RecognizeCompletedEventArgs>(recognizer_RecognizeCompleted);
// Perform recognition on the entire file.
Console.WriteLine("Starting asynchronous recognition...");
completed = false;
recognizer.RecognizeAsync();
// Keep the console window open.
while (!completed)
{
Console.ReadLine();
}
Console.WriteLine("Done.");
}
Console.WriteLine();
Console.WriteLine("Press any key to exit...");
Console.ReadKey();
}
// Handle the SpeechRecognized event.
static void recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
if (e.Result != null && e.Result.Text != null)
{
Console.WriteLine(" Recognized text = {0}", e.Result.Text);
}
else
{
Console.WriteLine(" Recognized text not available.");
}
}
// Handle the RecognizeCompleted event.
static void recognizer_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
{
if (e.Error != null)
{
Console.WriteLine(" Error encountered, {0}: {1}",
e.Error.GetType().Name, e.Error.Message);
}
if (e.Cancelled)
{
Console.WriteLine(" Operation cancelled.");
}
if (e.InputStreamEnded)
{
Console.WriteLine(" End of stream encountered.");
}
completed = true;
}
The content of the wav file is simply "will stock go up" (about 2 seconds long).
There are two different answers to this question, depending on which version of the Microsoft Speech SDK you are using. (See: What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?)
System.Speech (the desktop version)
In this case, see seiya1223's answer above; the sample code there works great.
Microsoft.Speech (the server version)
Probably because the server version does not include a dictation engine, the Microsoft.Speech library apparently never matches a CFG derived from an ARPA file. However, it still hypothesizes what was said via the SpeechRecognitionRejected event. Here are the necessary changes to seiya1223's desktop code:
- Naturally, change your using statements from System.Speech to Microsoft.Speech.
- Add an event handler for the SpeechRecognitionRejected event.
- In your event handler, check the e.Result.Text property for the final hypothesis.
The following snippet helps illustrate:
static string transcription;
static void Main(string[] args)
{
using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-us")))
{
recognizer.SpeechRecognitionRejected += SpeechRecognitionRejectedHandler;
// ...
}
}
static void SpeechRecognitionRejectedHandler(object sender, SpeechRecognitionRejectedEventArgs e)
{
if (e.Result != null && !string.IsNullOrEmpty(e.Result.Text))
transcription = e.Result.Text;
}
That handler is called once at the end of recognition. For example, here is the output from seiya1223's code, but with all of the available event handlers hooked up and a bunch of extra logging (emphasis mine):
Starting asynchronous recognition...
In SpeechDetectedHandler:
- AudioPosition = 00:00:01.2300000
In SpeechHypothesizedHandler:
- Grammar Name = Stock; Result Text = Go
In SpeechHypothesizedHandler:
- Grammar Name = Stock; Result Text = will
In SpeechHypothesizedHandler:
- Grammar Name = Stock; Result Text = will Stock
In SpeechHypothesizedHandler:
- Grammar Name = Stock; Result Text = will Stock Go
In SpeechHypothesizedHandler:
- Grammar Name = Stock; Result Text = will Stock Go Up
In SpeechRecognitionRejectedHandler:
- Grammar Name = Stock; Result Text = will Stock Go Up
In RecognizeCompletedHandler.
- AudioPosition = 00:00:03.2000000; InputStreamEnded = True
- No result.
Done.