Speech-to-Text Phrase Exceeds Character Limit

Question

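I am using the Google Speech-to-Text client library for Python to transcribe speech with speech adaptation. I want to boost phrases that fit a specific pattern. I used this documentation to create custom classes and phrase sets, and put them in a SpeechAdaptation object.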

movement_words = ["move", "go", "turn", "rotate"]
class_items = list(map(lambda word: CustomClass.ClassItem(value=word), movement_words))
movement_custom_class = CustomClass(name="Movement Words", custom_class_id="movement_words", items=class_items)

direction_words = ["forward","forwards","backward","backwards","back","left","right","clockwise","counterclockwise","to the left","to the right"]
class_items = list(map(lambda word: CustomClass.ClassItem(value=word), direction_words))
direction_custom_class = CustomClass(name="Direction Words", custom_class_id="directions",items=class_items)

unit_words = ["meter","meters","feet","foot","degrees","radians"]
class_items = list(map(lambda word: CustomClass.ClassItem(value=word), unit_words))
unit_custom_class = CustomClass(name="Unit Words", custom_class_id="units",items=class_items)

number_first_phrase = PhraseSet(name="number_first_phrase", phrases=[PhraseSet.Phrase(value="${movement_words} $OPERAND ${units} ${directions}")], boost=10)

speech_adaptation_object = SpeechAdaptation(
  phrase_sets = [number_first_phrase],
  phrase_set_references = [],
  custom_classes = [movement_custom_class, direction_custom_class, unit_custom_class]
)

I then use it in the RecognitionConfig below:

config = types.RecognitionConfig(
    encoding=types.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=RATE,
    language_code=language_code,
    enable_automatic_punctuation=True,
    adaptation=speech_adaptation_object,
)

streaming_config = types.StreamingRecognitionConfig(
    config=config,
    interim_results=True,
)

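However, I get the following error message: 400 Invalid recognition 'config': Context phrase with 152 characters found, but max is 100. I tried splitting the phrase in the PhraseSet in half, which made the error go away. But that makes me wonder why a context phrase of 152 characters was detected when "${movement_words} $OPERAND ${units} ${directions}" is nowhere near 100 characters long. I would really appreciate any guidance on how the character limit works here. Thanks!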

Answer 1

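Your phrase "${movement_words} $OPERAND ${units} ${directions}" contains variables that get expanded (anything inside {} refers to a variable).

So all the words in your arrays are expanded into the phrase, and the expanded phrase can easily exceed 100 characters: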

movement_words = ["move", "go", "turn", "rotate"]

direction_words = ["forward","forwards","backward","backwards","back","left","right","clockwise","counterclockwise","to the left","to the right"]

unit_words = ["meter","meters","feet","foot","degrees","radians"]
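To get a rough feel for why the expanded phrase is so much longer than the literal one, you can compare the length of the phrase string with the total length of the class items it refers to. This is only a sketch: the thread does not say exactly how the service counts characters, so the total below is an illustration and is not expected to reproduce the reported 152. The names literal_length and class_items_total are made up for this example.

```python
# Rough illustration only: the service's real counting rule is not stated in
# this thread, so these numbers are an estimate, not the value the API reports.
movement_words = ["move", "go", "turn", "rotate"]
direction_words = ["forward", "forwards", "backward", "backwards", "back",
                   "left", "right", "clockwise", "counterclockwise",
                   "to the left", "to the right"]
unit_words = ["meter", "meters", "feet", "foot", "degrees", "radians"]

phrase = "${movement_words} $OPERAND ${units} ${directions}"

# Length of the phrase exactly as written in the PhraseSet.
literal_length = len(phrase)

# Combined length of every item in the custom classes the phrase references.
class_items_total = sum(
    len(word)
    for words in (movement_words, direction_words, unit_words)
    for word in words
)

print("literal phrase length:", literal_length)
print("total length of class items:", class_items_total)
```

The literal phrase is well under 100 characters, while the class items it references add up to more than 100 on their own, which fits the idea that the limit is applied after the classes are expanded.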

python

google-cloud-speech