How to send an Audio for Dialogflow using C# library - DetectIntent
I'm using the Dialogflow C# library Google.Cloud.Dialogflow.V2 to communicate with my Dialogflow agent.
Everything works fine when sending Text using DetectIntentAsync().
My problem is when sending AUDIO, more precisely in this format: .AAC
To be able to send audio with DetectIntentAsync(), we need to create a DetectIntentRequest as follows
DetectIntentRequest detectIntentRequest = new DetectIntentRequest
{
InputAudio = **HERE WHERE I HAVE AN ISSUE**,
QueryInput = queryInput,
Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
};
where QueryInput is configured with the AudioConfig as follows
QueryInput queryInput = new QueryInput
{
AudioConfig = audioConfig,
};
where the AudioConfig is configured as follows
var audioConfig = new InputAudioConfig
{
AudioEncoding = **HAVING ISSUE HERE ON HOW TO CHOOSE THE CORRECT ENCODING**,
LanguageCode = "en-US",
ModelVariant = SpeechModelVariant.Unspecified,
SampleRateHertz = **HAVING ISSUE HERE ON HOW TO CHOOSE THE CORRECT SAMPLE RATE HERTZ**,
};
Questions
- How do I determine what SampleRateHertz to choose?
- How do I determine what AudioEncoding to choose?
- How do I provide the correct Protobuf.ByteString to InputAudio?
- What if I want to use a format other than .AAC; how can I provide the required information automatically?
What I have tested
I got a byte[] from the URL
// THE AUDIO IS A .AAC FILE
string audio = "https://cdn.fbsbx.com/v/t59.3654-21/72342591_3243833722299817_3308062589669343232_n.aac/audioclip-1575911942672-2279.aac?_nc_cat=102&_nc_ohc=heP60KND_DMAQl5-tE77rKNtUzHw_aILXdKfPPejdr7YVqzbYLQRv9BWA&_nc_ht=cdn.fbsbx.com&oh=1c4dbf0a64e0d1fb057b79354c17ca1c&oe=5DF17429";
byte[] audioBytes;
using (var webClient = new WebClient())
{
audioBytes = webClient.DownloadData(audio);
}
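As a side note (my suggestion, not part of the original question), WebClient is marked obsolete in newer .NET versions; the same download can be done with HttpClient:

```csharp
// Sketch: downloading the audio bytes with HttpClient instead of WebClient.
// Assumes this runs inside an async method; "audio" is the URL string above.
using var httpClient = new HttpClient();
byte[] audioBytes = await httpClient.GetByteArrayAsync(audio);
```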
Then I added it to the DetectIntentRequest as follows
DetectIntentRequest detectIntentRequest = new DetectIntentRequest
{
InputAudio = Google.Protobuf.ByteString.CopyFrom(audioBytes),
QueryInput = queryInput,
Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
};
If I don't specify a SampleRateHertz, I get the following error:
Error: "{"Status(StatusCode=InvalidArgument, Detail=\"Input audio or config is invalid. Cannot compute the audio duration. Possibly no audio data was sent.\")"} "
I stopped getting errors once I specified a SampleRateHertz value, but this is the response I get no matter what values I use for AudioEncoding and SampleRateHertz:
Response: {{ "languageCode":"en" }}
Everything else in the DetectIntentResponse is empty.
Guidance/Help is appreciated.
Thanks
Dialogflow does not currently support the AAC codec. You can find the list of supported formats in the documentation. If you cannot change the input file, you'll have to transcode it (preferably using a library, e.g. NAudio). The correct value to enter for AudioEncoding can be found in the API reference.
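As an illustration (my sketch, not from the original answer), assuming the input were a mono 16 kHz, 16-bit linear PCM WAV file, the config could look like this; AudioEncoding.Linear16 is one of the enum values from Google.Cloud.Dialogflow.V2:

```csharp
// Hypothetical config for a 16 kHz, 16-bit linear PCM (WAV) input.
// The sample rate given here must match the actual audio file.
var audioConfig = new InputAudioConfig
{
    AudioEncoding = AudioEncoding.Linear16, // 16-bit linear PCM
    LanguageCode = "en-US",
    SampleRateHertz = 16000                 // must match the file's rate
};
```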
The same page also has information on the sample rates to use. These depend on the provided input: for standard wave container formats (FLAC and WAV) the rate is included in the file itself, so the field is optional. For other formats the value must match the sample rate of the audio inside, and it is usually contained in the file header. Again, reading it manually for every format is painful, so either use a library or make sure all input files have the same sample rate and hard-code it.
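For instance, in a canonical RIFF/WAVE header the sample rate is a little-endian 32-bit integer at byte offset 24. A minimal sketch (my example, assuming a standard PCM WAV whose "fmt " chunk immediately follows the RIFF header; a robust parser would walk the chunks instead):

```csharp
using System;

static class WavInfo
{
    // Reads the sample rate from a canonical RIFF/WAVE header.
    // Offset 24 = 12 (RIFF header) + 8 ("fmt " id + size)
    //           + 2 (audio format) + 2 (channel count).
    public static int GetSampleRate(byte[] wav)
    {
        return BitConverter.ToInt32(wav, 24);
    }
}
```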
For those facing the .AAC problem with Dialogflow, I managed to get it working like below:
DetectIntentResponse response = new DetectIntentResponse();
var queryAudio = new InputAudioConfig
{
LanguageCode = LanguageCode,
ModelVariant = SpeechModelVariant.Unspecified,
};
QueryInput queryInput = new QueryInput
{
AudioConfig = queryAudio,
};
var filename = "fileName.wav";
// userAudioInput is the .AAC string URL
// creating and saving the wav format from AAC
using (var reader = new MediaFoundationReader(userAudioInput))
{
Directory.CreateDirectory(path);
WaveFileWriter.CreateWaveFile(path + "/" + filename, reader);
}
// Reading the previously saved wav file
byte[] inputAudio = File.ReadAllBytes(path + "/" + filename);
DetectIntentRequest detectIntentRequest = new DetectIntentRequest
{
//InputAudio = Google.Protobuf.ByteString.CopyFrom(ReadFully(outputStreamMono)),
InputAudio = Google.Protobuf.ByteString.CopyFrom(inputAudio),
QueryInput = queryInput,
Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
};
// Make the request
response = await _sessionsClient.DetectIntentAsync(detectIntentRequest);
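As a possible refinement (my sketch, not part of the original answer), NAudio can also write the WAV data to a MemoryStream, which avoids the temporary file on disk:

```csharp
// Sketch: transcode the AAC URL to WAV entirely in memory.
// MediaFoundationReader and WaveFileWriter are NAudio types;
// userAudioInput is the .AAC URL, as in the answer above.
byte[] inputAudio;
using (var reader = new MediaFoundationReader(userAudioInput))
using (var ms = new MemoryStream())
{
    WaveFileWriter.WriteWavFileToStream(ms, reader);
    inputAudio = ms.ToArray();
}
```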