How to get Google Cloud Speech (voice-to-text) to recognize letters and sounds

Question

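Is there a way to get the Google Cloud Speech API to recognize letters and letter sounds?

As an example use case, suppose I want to build a spelling game in which a voice says "Spell restaurant" and the recognizer listens for each letter and recognizes it as it is spoken.

Similarly, is there a way to recognize the sound of a particular letter, for example "oo", "ew", "k" (as in cat), or "s" (as in circle)?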

Answer 1

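At least in some cases, it already seems to do a reasonable job. For example, when I spell out "cee ay tee", it recognizes "c a t". You can also supply "word hints", as described in this post:

Google Cloud Speech API word Hints

Provide a list of single-letter "words" as the hints, i.e.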

phrases = ['a', 'b', 'c', 'd' ... ]

This seems to improve the results.
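
For illustration, here is a minimal sketch of how such a hint list might be built and attached to a recognition request. It uses only the Python standard library and produces a plain dictionary in the shape of the REST API's RecognitionConfig, where hints go under speechContexts. The encoding, sample rate, and language code are assumptions to adjust for your audio, and the helper names (build_letter_hints, build_recognition_config, matches_spelling) are hypothetical, not part of any Google library.

```python
import json
import string


def build_letter_hints():
    """Return the 26 lowercase letters as single-letter hint 'phrases'."""
    return list(string.ascii_lowercase)


def build_recognition_config(phrases, language_code="en-US"):
    """Build a RecognitionConfig-style dict carrying the phrase hints.

    The encoding and sample rate below are assumed values; match them
    to the audio you actually send.
    """
    return {
        "encoding": "LINEAR16",       # assumption: 16-bit PCM audio
        "sampleRateHertz": 16000,     # assumption: 16 kHz recording
        "languageCode": language_code,
        "speechContexts": [{"phrases": phrases}],
    }


def matches_spelling(transcript, target):
    """Return True if a transcript such as 'c a t' spells the target word."""
    letters = [token.lower() for token in transcript.split()]
    return "".join(letters) == target.lower()


if __name__ == "__main__":
    config = build_recognition_config(build_letter_hints())
    print(json.dumps(config, indent=2))
    print(matches_spelling("c a t", "cat"))
```

For the spelling game, the transcript returned by the recognizer could then be passed to matches_spelling along with the word being spelled. This only handles transcripts that come back as separate letters; spoken forms such as "cee" would need their own mapping.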
