MP3 AudioEncoding not working, am I currently running v1beta1?

Question

I'm trying to transcribe audio from a stream by following this tutorial (the section "Performing streaming speech recognition on a local file"): https://cloud.google.com/speech-to-text/docs/streaming-recognize

The file is an M3U file, so I tried to use the RecognitionConfig.AudioEncoding.MP3 option, but the MP3 attribute is rejected. When I try to autocomplete the option, MP3 does not show up either.

The documentation says the MP3 attribute is only available in v1beta1 (https://cloud.google.com/text-to-speech/docs/reference/rpc/google.cloud.texttospeech.v1beta1#google.cloud.texttospeech.v1beta1.AudioEncoding), and I have already run the pip upgrade.

Is there something else I need to do to install v1beta1?

Answer 1

Note that the second link you shared, the one about v1beta1, is for the Text-to-Speech API, whereas the sample you are following goes the other way around (the Speech-to-Text API).

In this case, to use RecognitionConfig.AudioEncoding.MP3 you need to switch to the v1p1beta1 version. The pip command (pip install --upgrade google-cloud-speech) stays the same, but you need to import the correct version (speech_v1p1beta1) in your Python code:

# [START speech_transcribe_streaming]
def transcribe_streaming(stream_file):
    """Streams transcription of the given audio file."""
    import io
    from google.cloud import speech_v1p1beta1
    from google.cloud.speech_v1p1beta1 import enums
    from google.cloud.speech_v1p1beta1 import types
    client = speech_v1p1beta1.SpeechClient()
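
If that import fails, it can help to check which versioned modules the installed package actually ships before digging further. A minimal sketch using only the standard library; the helper name `has_module` is hypothetical and not part of google-cloud-speech:

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the dotted module `name` can be located.

    `has_module` is a hypothetical helper for illustration only.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (for example `google.cloud`) raises
        # ModuleNotFoundError, which is a subclass of ImportError.
        return False


# Example check, assuming the package is expected to be installed:
# has_module("google.cloud.speech_v1p1beta1")
```

If the check returns False for `google.cloud.speech_v1p1beta1`, the installed google-cloud-speech release probably does not include that module, or it is installed in a different environment from the one running the script.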

Now you can use the MP3 encoding:

    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.MP3,
        sample_rate_hertz=16000,
        language_code='en-US')
    streaming_config = types.StreamingRecognitionConfig(config=config)
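
The streaming call that follows this configuration takes the audio as a sequence of requests rather than a single blob. A minimal sketch of the chunking step, using only the standard library: `iter_chunks` and the 4096-byte chunk size are assumptions for illustration, not part of the tutorial. In the sample's style, each chunk would then presumably be wrapped as `types.StreamingRecognizeRequest(audio_content=chunk)` and the resulting requests passed to `client.streaming_recognize(streaming_config, requests)`.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive byte chunks from a binary stream until it is exhausted.

    Hypothetical helper; the chunk size is an arbitrary choice.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means end of stream.
            break
        yield chunk


# Usage with an in-memory stream standing in for an opened MP3 file:
demo = list(iter_chunks(io.BytesIO(b"a" * 10), chunk_size=4))
```

With a real file, the stream would be opened in binary mode, for example `io.open(stream_file, 'rb')`.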

The full code is here, but it is just the basic sample with the changes above applied.

Tested with an MP3 sample:

$ python mp3.py sample.mp3
Finished: True
Stability: 0.0
Confidence: 0.9875912666320801
Transcript: I'm sorry Dave I'm afraid I can't do that
