Speech-to-Text Phrase Exceeds Character Limit
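I am using the Google Speech-to-Text client library for Python to transcribe speech with speech adaptation. I want to boost phrases that fit a specific pattern. Following this documentation, I created custom classes and a phrase set and put them in a SpeechAdaptation object.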
movement_words = ["move", "go", "turn", "rotate"]
class_items = list(map(lambda word: CustomClass.ClassItem(value=word), movement_words))
movement_custom_class = CustomClass(name="Movement Words", custom_class_id="movement_words", items=class_items)
direction_words = ["forward","forwards","backward","backwards","back","left","right","clockwise","counterclockwise","to the left","to the right"]
class_items = list(map(lambda word: CustomClass.ClassItem(value=word), direction_words))
direction_custom_class = CustomClass(name="Direction Words", custom_class_id="directions",items=class_items)
unit_words = ["meter","meters","feet","foot","degrees","radians"]
class_items = list(map(lambda word: CustomClass.ClassItem(value=word), unit_words))
unit_custom_class = CustomClass(name="Unit Words", custom_class_id="units",items=class_items)
number_first_phrase = PhraseSet(name="number_first_phrase", phrases=[PhraseSet.Phrase(value="${movement_words} $OPERAND ${units} ${directions}")], boost=10)
speech_adaptation_object = SpeechAdaptation(
    phrase_sets=[number_first_phrase],
    phrase_set_references=[],
    custom_classes=[movement_custom_class, direction_custom_class, unit_custom_class],
)
I then use it in the RecognitionConfig below:
config = types.RecognitionConfig(
    encoding=types.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=RATE,
    language_code=language_code,
    enable_automatic_punctuation=True,
    adaptation=speech_adaptation_object,
)
streaming_config = types.StreamingRecognitionConfig(
    config=config,
    interim_results=True,
)
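However, I get the following error message: 400 Invalid recognition 'config': Context phrase with 152 characters found, but max is 100.
I tried splitting the phrase in the PhraseSet in half, which made the error go away. But that makes me wonder why a context phrase of 152 characters was detected when "${movement_words} $OPERAND ${units} ${directions}" is not even 100 characters long. I would really appreciate any guidance on how the character limit works here. Thanks!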
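For reference, the split-in-half workaround described above might be sketched like this. `split_pattern` is a hypothetical helper, not part of the client library: it only splits the pattern on spaces, on the assumption that each half would then be passed as its own PhraseSet.Phrase(value=...). Whether two shorter patterns are boosted the same way as the single long one is not something this sketch establishes.

```python
def split_pattern(pattern):
    """Split a space-separated phrase pattern into two halves (hypothetical helper)."""
    tokens = pattern.split()
    # The first half gets the extra token when the token count is odd.
    middle = (len(tokens) + 1) // 2
    return " ".join(tokens[:middle]), " ".join(tokens[middle:])


pattern = "${movement_words} $OPERAND ${units} ${directions}"
first_half, second_half = split_pattern(pattern)

print(first_half)
print(second_half)
```

Each half would then go into the phrases list of the PhraseSet in place of the original single pattern.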
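Your phrase "${movement_words} $OPERAND ${units} ${directions}" contains variables that get expanded (anything inside {} refers to a variable, here one of your custom classes). All the words in your arrays are expanded into the phrase, so it easily exceeds 100 characters: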
movement_words = ["move", "go", "turn", "rotate"]
direction_words = ["forward","forwards","backward","backwards","back","left","right","clockwise","counterclockwise","to the left","to the right"]
unit_words = ["meter","meters","feet","foot","degrees","radians"]
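A rough way to see the effect, using only the lists from the question: compare the length of the pattern as written with the number of characters the referenced class items contribute. The exact rule the service uses to count an expanded phrase is not stated in this thread, so the totals below are an illustration under that assumption and are not expected to reproduce the 152 figure exactly.

```python
import re

# Class items copied from the question, keyed by custom_class_id.
custom_classes = {
    "movement_words": ["move", "go", "turn", "rotate"],
    "directions": ["forward", "forwards", "backward", "backwards", "back",
                   "left", "right", "clockwise", "counterclockwise",
                   "to the left", "to the right"],
    "units": ["meter", "meters", "feet", "foot", "degrees", "radians"],
}

pattern = "${movement_words} $OPERAND ${units} ${directions}"

# Length of the pattern exactly as written in the PhraseSet.
literal_length = len(pattern)

# Every ${class_id} reference in the pattern, in order of appearance.
referenced_ids = re.findall(r"\$\{(\w+)\}", pattern)

# Characters contributed by the items of each referenced class.
item_chars = {
    class_id: sum(len(item) for item in custom_classes[class_id])
    for class_id in referenced_ids
}
total_item_chars = sum(item_chars.values())

print("literal pattern length:", literal_length)
print("characters per class:", item_chars)
print("total class item characters:", total_item_chars)
```

The pattern as written is well under 100 characters, while the class items alone add up to more than 100, which is consistent with the limit being applied after the classes are substituted.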