Can you pin the model version of Google Speech to Text?

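I want to use the Google Speech-to-Text (STT) API to transcribe audio, but I need the transcriptions to stay consistent over time. In other words, even if Google improves the STT model, is it possible to pin the version of the STT model I originally used so that the transcriptions stay the same? I am using the Google Speech Python client library.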

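Unfortunately, there is no way to select a specific version of an STT model. What I would suggest for consistency is to explicitly define the model to use in your STT RecognitionConfig(). The documentation describes the field as follows: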

model

Which model to select for the given request. Select the model best suited to your domain to get best results. If a model is not explicitly specified, then we auto-select a model based on the parameters in the RecognitionConfig.

Model | Description
command_and_search | Best for short queries such as voice commands or voice search.
phone_call | Best for audio that originated from a phone call (typically recorded at an 8 kHz sampling rate).
video | Best for audio that originated from video or includes multiple speakers. Ideally the audio is recorded at a 16 kHz or greater sampling rate. This is a premium model that costs more than the standard rate.
default | Best for audio that is not one of the specific audio models. For example, long-form audio. Ideally the audio is high-fidelity, recorded at a 16 kHz or greater sampling rate.
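As a rough sketch of what setting the model explicitly could look like with the Python client library: the helper names (`build_config_fields`, `transcribe`), the `en-US` language code, the 16000 Hz sample rate and the LINEAR16 encoding are all assumptions for illustration, not something from the question. Note that this only avoids the auto-selection described above; it does not pin a model version.

```python
# Model names taken from the table quoted above.
KNOWN_MODELS = {"command_and_search", "phone_call", "video", "default"}


def build_config_fields(model, language_code="en-US", sample_rate_hertz=16000):
    """Return RecognitionConfig fields with the model always set explicitly.

    Hypothetical helper: it rejects model names that are not in the table
    above, so a typo does not silently fall back to auto-selection.
    """
    if model not in KNOWN_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    return {
        "model": model,
        "language_code": language_code,          # assumed value
        "sample_rate_hertz": sample_rate_hertz,  # assumed value
    }


def transcribe(path, model="default"):
    """Transcribe a short local audio file using an explicitly chosen model.

    Assumes google-cloud-speech is installed, credentials are configured,
    and the file is LINEAR16 audio short enough for synchronous recognition.
    """
    # Imported lazily so the helper above works without the library installed.
    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        **build_config_fields(model),
    )
    with open(path, "rb") as f:
        audio = speech.RecognitionAudio(content=f.read())

    response = client.recognize(config=config, audio=audio)
    # Keep the top alternative for each recognized segment.
    return [result.alternatives[0].transcript for result in response.results]
```

Calling something like `transcribe("call.wav", model="phone_call")` would then always request the `phone_call` model instead of leaving the choice to the service. If you need transcripts that never change, storing the transcript the first time you get it is probably the only reliable option.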