Make Sphinx4 recognize all the numbers using a custom .gram file
Description
I have a speech-recognition calculator written in Java that uses the Sphinx4 library.
Full code on GitHub: here
The .gram file I am using is the following (on GitHub):
#JSGF V1.0;
/**
* JSGF Grammar
*/
grammar grammar;
public <syntax> = (one | two | three| four| five | six | seven | eight | nine | ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen | twenty)
(plus | minus | multiply | division)
(one | two | three| four| five | six | seven | eight | nine | ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen | twenty);
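Question:
I want the program to be able to recognize the numbers from 0 to 1 million in English.
In its current state, as you can see, it can recognize the numbers (one | two | three | four | five | six | seven | eight | nine | ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen | twenty), because I have written them into the .gram file by hand.
In other words, I would have to write every number into the .gram file by hand (I could write a program to generate the file), but that again seems impossible: the file would run to gigabytes. There may be some pattern I could use instead.
Finally:
Is there any clever solution? Thanks for your effort :)
The new grammar after Nikolay's solution is: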
public <number> = (one | two | three | four | five | six | seven | eight | nine | ten
| eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen | twenty
| thirty | forty | fifty | sixty | seventy | eighty | ninety | hundred | thousand | million | billion)+;
public <syntax> = <number>{1} (plus | minus | multiply | division){1} <number>{1};
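The cleverest solution is to recognize the text string first. The grammar should not be complicated; it should simply list the words used in numbers: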
grammar number;
public <number> = (one | two | three | four | five | six | seven | eight |
nine | ten | eleven | twelve | thirteen | fourteen | fifteen |
sixteen | seventeen | eighteen | nineteen | twenty | thirty | forty |
fifty | sixty | seventy | eighty | ninety | hundred | thousand |
million | and )*;
Once the text is recognized, convert it to a number. See "How to convert words to a number?" for details.
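As a rough illustration of the conversion step, here is a minimal Java sketch. It assumes the recognizer returns a plain space-separated string of the words listed in the grammar above. The class name `WordsToNumber` and the method `parse` are hypothetical names chosen for this example; they are not part of Sphinx4. The sketch does not validate word order, so it trusts that the input is a well-formed English number.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Sphinx4: turns recognized number words into a long.
final class WordsToNumber {
    // Words that simply add their value to the group being built.
    private static final Map<String, Integer> SMALL = new HashMap<>();

    static {
        String[] units = {"zero", "one", "two", "three", "four", "five", "six", "seven",
                "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
                "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
        for (int i = 0; i < units.length; i++) {
            SMALL.put(units[i], i);
        }
        String[] tens = {"twenty", "thirty", "forty", "fifty", "sixty", "seventy",
                "eighty", "ninety"};
        for (int i = 0; i < tens.length; i++) {
            SMALL.put(tens[i], (i + 2) * 10);
        }
    }

    static long parse(String text) {
        long total = 0;   // sum of the finished thousand/million groups
        long current = 0; // value of the group still being built (0..999)
        for (String word : text.trim().toLowerCase().split("\\s+")) {
            if (word.isEmpty() || word.equals("and")) {
                continue; // "and" carries no numeric value
            }
            if (SMALL.containsKey(word)) {
                current += SMALL.get(word);
            } else if (word.equals("hundred")) {
                current = Math.max(current, 1) * 100; // a bare "hundred" counts as 100
            } else if (word.equals("thousand")) {
                total += Math.max(current, 1) * 1_000L;
                current = 0;
            } else if (word.equals("million")) {
                total += Math.max(current, 1) * 1_000_000L;
                current = 0;
            } else {
                throw new IllegalArgumentException("Unknown number word: " + word);
            }
        }
        return total + current;
    }

    public static void main(String[] args) {
        System.out.println(parse("one hundred and twenty three"));
    }
}
```

For the calculator, one possible design is to split the recognized sentence at the operator word (plus, minus, multiply, division) and pass each side to `parse` separately.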