Azure speech to text with numbers
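One use case in my application is converting speech (single-word utterances) to text. For this I need to use Azure speech to text. Sometimes the speech needs to be converted to an integer, for example when I need to submit the response as a quantity.
My question is: is there any way, via the REST API, to tell the speech-to-text service that I want a numeric result? Currently it returns things like 'one' instead of '1' and 'free' instead of '3'. From the documentation I don't think there is a way to do this, but I wanted to see whether anyone else has solved this problem before I come up with a workaround.
This is the code I am using in my proof-of-concept project: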
public static async Task SpeechToTextAsync(MemoryStream data, ISpeechResultCallback callBack)
{
    string accessToken = await Authentication.GetAccessToken();
    IToast toastWrapper = DependencyService.Get<IToast>();
    if (accessToken != null)
    {
        toastWrapper.Show("Acquired token");
        callBack.SpeechReturned("Acquired token");

        // format=detailed makes the service return the NBest list (Lexical, ITN, MaskedITN, Display).
        HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create("https://westus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-GB&format=detailed");
        request.SendChunked = true;
        request.Accept = @"application/json;text/xml";
        request.Method = "POST";
        request.ProtocolVersion = HttpVersion.Version11;
        request.Host = "westus.stt.speech.microsoft.com";
        request.ContentType = @"audio/wav; codecs=audio/pcm; samplerate=16000";
        // request.Headers["Ocp-Apim-Subscription-Key"] = Program.SubscriptionKey;
        request.Headers.Add("Authorization", "Bearer " + accessToken);
        request.AllowWriteStreamBuffering = false;

        // Stream the recorded audio to the service in chunks of at most 1 KB.
        data.Position = 0;
        byte[] buffer = null;
        int bytesRead = 0;
        using (Stream requestStream = request.GetRequestStream())
        {
            buffer = new Byte[checked((uint)Math.Min(1024, (int)data.Length))];
            while ((bytesRead = data.Read(buffer, 0, buffer.Length)) != 0)
            {
                requestStream.Write(buffer, 0, bytesRead);
            }
            // Flush
            requestStream.Flush();
        }

        try
        {
            string responseData = null;
            using (WebResponse response = request.GetResponse())
            {
                var encoding = Encoding.GetEncoding(((HttpWebResponse)response).CharacterSet);
                using (var responseStream = response.GetResponseStream())
                using (var reader = new StreamReader(responseStream, encoding))
                {
                    responseData = reader.ReadToEnd();
                    AzureSTTResults deserializedProduct = JsonConvert.DeserializeObject<AzureSTTResults>(responseData);
                    if (deserializedProduct == null || deserializedProduct.NBest == null || deserializedProduct.NBest.Length == 0)
                    {
                        toastWrapper.Show("No results");
                        callBack.SpeechReturned("No results");
                    }
                    else
                    {
                        // Report the ITN form of the top-ranked hypothesis.
                        toastWrapper.Show(deserializedProduct.NBest[0].ITN);
                        callBack.SpeechReturned(deserializedProduct.NBest[0].ITN);
                    }
                }
            }
        }
        catch (Exception ex)
        {
            toastWrapper.Show(ex.Message);
            callBack.SpeechReturned(ex.Message);
        }
    }
    else
    {
        // No access token was returned, so the request cannot be authorized.
        toastWrapper.Show("No token acquired");
        callBack.SpeechReturned("No token acquired");
    }
}
Below is an example of a result that I would like to come back as "1":
{
    "RecognitionStatus": "Success",
    "Offset": 0,
    "Duration": 22200000,
    "NBest": [
        {
            "Confidence": 0.43084684014320374,
            "Lexical": "one",
            "ITN": "One",
            "MaskedITN": "One",
            "Display": "One."
        }
    ]
}
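The AzureSTTResults type used in the code above is not shown in the post. A minimal sketch of a shape that would deserialize this response, assuming property names that simply mirror the JSON fields (NBestItem is a hypothetical name, and System.Text.Json is used here only to keep the sketch dependency-free; the same classes should also work with JsonConvert):

```csharp
using System.Text.Json;

// Hypothetical DTOs mirroring the "detailed" response shown above.
public sealed class AzureSTTResults
{
    public string RecognitionStatus { get; set; }
    public long Offset { get; set; }      // taken from the sample response
    public long Duration { get; set; }
    public NBestItem[] NBest { get; set; }

    // Convenience wrapper; property names match the JSON exactly,
    // so no custom naming options are needed.
    public static AzureSTTResults Parse(string json)
    {
        return JsonSerializer.Deserialize<AzureSTTResults>(json);
    }
}

public sealed class NBestItem
{
    public double Confidence { get; set; }
    public string Lexical { get; set; }   // lower-cased spoken form, e.g. "one"
    public string ITN { get; set; }
    public string MaskedITN { get; set; }
    public string Display { get; set; }   // capitalized and punctuated, e.g. "One."
}
```

For a single spoken word, Lexical is probably the most convenient field for any post-processing, since it carries no capitalization or trailing punctuation.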
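According to the official documentation for the Speech-to-text REST API, there is no option that converts number words into digits.
Since number words in English follow fairly regular syntactic patterns, a simple algorithm is enough to convert words to numbers. For reference, you can write your own in C# by following these:
- Converting words to numbers in C++
- Translate (Convert) Words to Numbers (SQL Server)
- Words in Numbers
Hope it helps.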
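As a rough illustration of the approach described above, here is a minimal sketch of an English word-to-number converter in C#. It is not taken from any of the linked posts; the NumberWords class and its TryParse method are hypothetical names, and it only handles non-negative whole numbers up to the billions.

```csharp
using System;
using System.Collections.Generic;

public static class NumberWords
{
    // Units, teens and tens: these are simply added to the running group.
    private static readonly Dictionary<string, long> Small =
        new Dictionary<string, long>(StringComparer.OrdinalIgnoreCase)
        {
            ["zero"] = 0, ["one"] = 1, ["two"] = 2, ["three"] = 3, ["four"] = 4,
            ["five"] = 5, ["six"] = 6, ["seven"] = 7, ["eight"] = 8, ["nine"] = 9,
            ["ten"] = 10, ["eleven"] = 11, ["twelve"] = 12, ["thirteen"] = 13,
            ["fourteen"] = 14, ["fifteen"] = 15, ["sixteen"] = 16,
            ["seventeen"] = 17, ["eighteen"] = 18, ["nineteen"] = 19,
            ["twenty"] = 20, ["thirty"] = 30, ["forty"] = 40, ["fifty"] = 50,
            ["sixty"] = 60, ["seventy"] = 70, ["eighty"] = 80, ["ninety"] = 90
        };

    // Scale words close the current group and add it to the total.
    private static readonly Dictionary<string, long> Scales =
        new Dictionary<string, long>(StringComparer.OrdinalIgnoreCase)
        {
            ["thousand"] = 1000, ["million"] = 1000000, ["billion"] = 1000000000
        };

    public static bool TryParse(string text, out long value)
    {
        value = 0;
        if (string.IsNullOrWhiteSpace(text)) return false;

        long total = 0, current = 0;
        bool sawNumber = false;

        // Drop trailing punctuation ("One.") and split on spaces and hyphens.
        string[] tokens = text.Trim().TrimEnd('.', '!', '?')
            .Split(new[] { ' ', '-' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string token in tokens)
        {
            if (token.Equals("and", StringComparison.OrdinalIgnoreCase)) continue;

            if (Small.TryGetValue(token, out long n))
            {
                current += n;
            }
            else if (token.Equals("hundred", StringComparison.OrdinalIgnoreCase))
            {
                current = (current == 0 ? 1 : current) * 100;
            }
            else if (Scales.TryGetValue(token, out long scale))
            {
                total += (current == 0 ? 1 : current) * scale;
                current = 0;
            }
            else
            {
                return false; // not a number word, e.g. a misrecognized "free"
            }
            sawNumber = true;
        }

        value = total + current;
        return sawNumber;
    }
}
```

Passing NBest[0].Lexical (or Display, since trailing punctuation is stripped) to NumberWords.TryParse would turn "one" into 1. A misrecognition such as 'free' for 'three' is rejected rather than corrected, so that case would still need separate handling.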
I suggest using this NuGet package from Microsoft. It works like a charm; here is an example:
NumberRecognizer.RecognizeNumber("I have two apples", Culture.English)
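The call above returns a list of results rather than a bare number. A short sketch of reading the numeric value out of it, assuming the package in question is Microsoft.Recognizers.Text.Number (the answer does not name it) and that each result exposes the number under the "value" key of its Resolution dictionary; RecognizerDemo and FirstNumber are hypothetical names:

```csharp
using System;
using Microsoft.Recognizers.Text;
using Microsoft.Recognizers.Text.Number;

public static class RecognizerDemo
{
    // Returns the first number recognized in the text as a string,
    // or null when no number is found.
    public static string FirstNumber(string text)
    {
        var results = NumberRecognizer.RecognizeNumber(text, Culture.English);
        foreach (var result in results)
        {
            // Assumption: the resolved number is stored under the "value" key.
            if (result.Resolution != null &&
                result.Resolution.TryGetValue("value", out var value))
            {
                return value?.ToString();
            }
        }
        return null;
    }
}
```

The returned string can then be passed to int.TryParse before it is submitted as a quantity.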