How to use Google Cloud Speech (V1 API) for speech to text - need to be able to process over 3 hours audio files properly and efficiently

Question

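I have been looking for documentation and similar material, but I have not found a solution yet.

I have installed the NuGet package.

I have also generated an API key.

However, I cannot find proper documentation on how to use the API key.

In addition, I want to be able to upload very long audio files.

So what is the proper way to upload audio files up to 3 hours long and get the results?

I have a $300 budget, so that should be enough.

Here is my code so far.

This code currently fails because I have not set up the credentials correctly, and I do not know how to set them.

I also have a service account file that I can use.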

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();
    }

    private void Button_Click(object sender, RoutedEventArgs e)
    {
        var speech = SpeechClient.Create();           
        
        var config = new RecognitionConfig
        {               
            Encoding = RecognitionConfig.Types.AudioEncoding.Flac,
            SampleRateHertz = 48000,
            LanguageCode = LanguageCodes.English.UnitedStates
        };
        var audio = RecognitionAudio.FromStorageUri("1m.flac");

        var response = speech.Recognize(config, audio);

        foreach (var result in response.Results)
        {
            foreach (var alternative in result.Alternatives)
            {
                Debug.WriteLine(alternative.Transcript);
            }
        }
    }
}

I do not want to set an environment variable. I have an API key and a service account JSON file. How can I set the credentials manually?

Answer 1

You need to set the environment variable before creating the Speech instance:

 Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", "text-tospeech.json");

The second parameter (text-tospeech.json) is your file containing the credentials generated by the Google API.
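
For context, here is a minimal sketch of how that call might sit next to client creation. The class name, method name, and `keyFilePath` parameter are hypothetical and only for illustration; the sketch assumes the Google.Cloud.Speech.V1 NuGet package is referenced.

```csharp
using System;
using Google.Cloud.Speech.V1;

public static class EnvCredentialExample
{
    // keyFilePath is a hypothetical parameter: the path to your service account JSON file.
    public static SpeechClient CreateClient(string keyFilePath)
    {
        // This overload sets the variable for the current process only,
        // so it does not change the machine or user environment.
        Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", keyFilePath);

        // Create the client only after the variable has been set.
        return SpeechClient.Create();
    }
}
```

An absolute path is probably the safer choice here, since a relative path is resolved against the working directory of the process.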

Answer 2

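If you do not want to use an environment variable, you need to use SpeechClientBuilder to create a SpeechClient with custom credentials. Assuming you have a service account file somewhere, change this: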

var speech = SpeechClient.Create();

to this:

var speech = new SpeechClientBuilder
{
    CredentialsPath = "/path/to/your/file"
}.Build();

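Note that to perform a long-running recognition operation, you should also use the LongRunningRecognize method - I strongly suspect your current RPC will fail, either because it is trying to run on a file that is too large, or because it will time out.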
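
The two points above (custom credentials and LongRunningRecognize) can be combined roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: the class name, method name, and both parameters are hypothetical, and it assumes the audio has already been uploaded to a Cloud Storage bucket, because long audio is normally referenced by a `gs://` URI rather than a local file name such as "1m.flac".

```csharp
using System;
using Google.Cloud.Speech.V1;

public static class LongAudioTranscriber
{
    // Hypothetical helper: credentialsPath is the service account JSON file,
    // gcsUri is a placeholder such as "gs://my-bucket/recording.flac".
    public static void Transcribe(string credentialsPath, string gcsUri)
    {
        // Build the client from the service account file, no environment variable needed.
        var speech = new SpeechClientBuilder
        {
            CredentialsPath = credentialsPath
        }.Build();

        var config = new RecognitionConfig
        {
            Encoding = RecognitionConfig.Types.AudioEncoding.Flac,
            SampleRateHertz = 48000,
            LanguageCode = "en-US"
        };

        // Reference the uploaded audio by its Cloud Storage URI.
        var audio = RecognitionAudio.FromStorageUri(gcsUri);

        // Start the asynchronous operation, then block until the service reports completion.
        var operation = speech.LongRunningRecognize(config, audio);
        var completed = operation.PollUntilCompleted();

        foreach (var result in completed.Result.Results)
        {
            foreach (var alternative in result.Alternatives)
            {
                Console.WriteLine(alternative.Transcript);
            }
        }
    }
}
```

Calling this from a WPF button handler would block the UI thread while polling, so in that setting the asynchronous variants (LongRunningRecognizeAsync and PollUntilCompletedAsync) are likely the better fit.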


.net

c#

speech-to-text

google-speech-api

google-speech-to-text-api