How to send an Audio for Dialogflow using C# library - DetectIntent
I'm using the Dialogflow C# library Google.Cloud.Dialogflow.V2 to communicate with my Dialogflow agent.
Everything works fine when sending Text using DetectIntentAsync().
My problem is when sending AUDIO, more precisely in this format: .AAC
To be able to send audio with DetectIntentAsync(), we need to create a DetectIntentRequest as follows
DetectIntentRequest detectIntentRequest = new DetectIntentRequest
{
InputAudio = **HERE WHERE I HAVE AN ISSUE**,
QueryInput = queryInput,
Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
};
where QueryInput is configured with the AudioConfig as follows
QueryInput queryInput = new QueryInput
{
AudioConfig = audioConfig,
};
where the AudioConfig is configured as follows
var audioConfig = new InputAudioConfig
{
AudioEncoding = **HAVING ISSUE HERE ON HOW TO CHOOSE THE CORRECT ENCODING**,
LanguageCode = "en-US",
ModelVariant = SpeechModelVariant.Unspecified,
SampleRateHertz = **HAVING ISSUE HERE ON HOW TO CHOOSE THE CORRECT SAMPLE RATE HERTZ**,
};
Questions
- How do I determine what SampleRateHertz to choose?
- How do I determine what AudioEncoding to choose?
- How do I provide the correct Protobuf.ByteString to InputAudio?
- What if I want to use a format other than .AAC; how can I provide the required information automatically?
What I have tested
I got a byte[] from the URL
// THE AUDIO IS A .AAC FILE
string audio = "https://cdn.fbsbx.com/v/t59.3654-21/72342591_3243833722299817_3308062589669343232_n.aac/audioclip-1575911942672-2279.aac?_nc_cat=102&_nc_ohc=heP60KND_DMAQl5-tE77rKNtUzHw_aILXdKfPPejdr7YVqzbYLQRv9BWA&_nc_ht=cdn.fbsbx.com&oh=1c4dbf0a64e0d1fb057b79354c17ca1c&oe=5DF17429";
byte[] audioBytes;
using (var webClient = new WebClient())
{
audioBytes = webClient.DownloadData(audio);
}
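As a side note (my suggestion, not part of the original question), WebClient is marked obsolete in newer .NET versions; the same download can be done with HttpClient:

```csharp
// Sketch: downloading the audio bytes with HttpClient instead of WebClient.
// Assumes this runs inside an async method; "audio" is the URL string above.
using var httpClient = new HttpClient();
byte[] audioBytes = await httpClient.GetByteArrayAsync(audio);
```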
Then I added it to the DetectIntentRequest as follows
DetectIntentRequest detectIntentRequest = new DetectIntentRequest
{
InputAudio = Google.Protobuf.ByteString.CopyFrom(audioBytes),
QueryInput = queryInput,
Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
};
If I don't specify a SampleRateHertz, I get the following error:
Error: "{"Status(StatusCode=InvalidArgument, Detail=\"Input audio or config is invalid. Cannot compute the audio duration. Possibly no audio data was sent.\")"} "
I stopped getting errors once I specified a SampleRateHertz value, but this is the response I get no matter what values I use for AudioEncoding and SampleRateHertz:
Response: {{ "languageCode":"en" }}
Everything else in the DetectIntentResponse is empty.
Guidance/Help is appreciated.
Thanks
Dialogflow does not currently support the AAC codec. You can find the list of supported formats in the documentation. If you cannot change the input file, you'll have to transcode it (preferably using a library, e.g. NAudio). The correct value to enter for AudioEncoding can be found in the API reference.
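As an illustration (my sketch, not from the original answer), assuming the input were a mono 16 kHz, 16-bit linear PCM WAV file, the config could look like this; AudioEncoding.Linear16 is one of the enum values from Google.Cloud.Dialogflow.V2:

```csharp
// Hypothetical config for a 16 kHz, 16-bit linear PCM (WAV) input.
// The sample rate given here must match the actual audio file.
var audioConfig = new InputAudioConfig
{
    AudioEncoding = AudioEncoding.Linear16, // 16-bit linear PCM
    LanguageCode = "en-US",
    SampleRateHertz = 16000                 // must match the file's rate
};
```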
The same page also has information on the sample rates to use. These depend on the provided input: for standard wave container formats (FLAC and WAV) the rate is included in the file itself, so the field is optional. For other formats the value must match the sample rate of the audio inside, and it is usually contained in the file header. Again, reading it manually for every format is painful, so either use a library or make sure all input files have the same sample rate and hard-code it.
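For instance, in a canonical RIFF/WAVE header the sample rate is a little-endian 32-bit integer at byte offset 24. A minimal sketch (my example, assuming a standard PCM WAV whose "fmt " chunk immediately follows the RIFF header; a robust parser would walk the chunks instead):

```csharp
using System;

static class WavInfo
{
    // Reads the sample rate from a canonical RIFF/WAVE header.
    // Offset 24 = 12 (RIFF header) + 8 ("fmt " id + size)
    //           + 2 (audio format) + 2 (channel count).
    public static int GetSampleRate(byte[] wav)
    {
        return BitConverter.ToInt32(wav, 24);
    }
}
```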
For those facing the .AAC problem with Dialogflow, I managed to get it working like below:
DetectIntentResponse response = new DetectIntentResponse();
var queryAudio = new InputAudioConfig
{
LanguageCode = LanguageCode,
ModelVariant = SpeechModelVariant.Unspecified,
};
QueryInput queryInput = new QueryInput
{
AudioConfig = queryAudio,
};
var filename = "fileName.wav";
// userAudioInput is the .AAC string URL
// creating and saving the wav format from AAC
using (var reader = new MediaFoundationReader(userAudioInput))
{
Directory.CreateDirectory(path);
WaveFileWriter.CreateWaveFile(path + "/" + filename, reader);
}
// Reading the previously saved wav file
byte[] inputAudio = File.ReadAllBytes(path + "/" + filename);
DetectIntentRequest detectIntentRequest = new DetectIntentRequest
{
//InputAudio = Google.Protobuf.ByteString.CopyFrom(ReadFully(outputStreamMono)),
InputAudio = Google.Protobuf.ByteString.CopyFrom(inputAudio),
QueryInput = queryInput,
Session = "projects/" + _sessionName.ProjectId + "/agent/sessions/" + _sessionName.SessionId
};
// Make the request
response = await _sessionsClient.DetectIntentAsync(detectIntentRequest);
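As a possible refinement (my sketch, not part of the original answer), NAudio can also write the WAV data to a MemoryStream, which avoids the temporary file on disk:

```csharp
// Sketch: transcode the AAC URL to WAV entirely in memory.
// MediaFoundationReader and WaveFileWriter are NAudio types;
// userAudioInput is the .AAC URL, as in the answer above.
byte[] inputAudio;
using (var reader = new MediaFoundationReader(userAudioInput))
using (var ms = new MemoryStream())
{
    WaveFileWriter.WriteWavFileToStream(ms, reader);
    inputAudio = ms.ToArray();
}
```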