Google Dialogflow CX | StreamingDetectIntent doesn't process audio after matching first intent

Question

Environment details

OS: Windows 10, 11; Debian 9 (stretch)
Node.js version: 12.18.3, 12.22.1
npm version: 7.19.0, 7.15.0
@google-cloud/dialogflow-cx version: 2.13.0

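Problem

StreamingDetectIntent doesn't process audio after matching the first intent. I can see the transcription, and the first intent is matched. After that, the audio keeps streaming, but I receive no further transcripts and the on('data') callback is never triggered. In short, nothing happens after the first intent is matched.

The one thing that fixes it is ending the detectStream and then reinitializing it. After that it works as expected.

Steps to reproduce

I tried both const {SessionsClient} = require("@google-cloud/dialogflow-cx"); and const {SessionsClient} = require("@google-cloud/dialogflow-cx").v3;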

// Create a stream for the streaming request.
const detectStream = client
    .streamingDetectIntent()
    .on('error', console.error)
    .on('end', () => {
        // The 'end' event carries no payload; it only signals that the stream has finished.
        console.log('streamingDetectIntent: -----End-----');
    })
    .on('data', data => {
        console.log(`streamingDetectIntent: Data: ----------`);
        if (data.recognitionResult) {
            console.log(`Intermediate Transcript: ${data.recognitionResult.transcript}`);
        } else {
            console.log('Detected Intent:');
            if(!data.detectIntentResponse) return
            const result = data.detectIntentResponse.queryResult;

            console.log(`User Query: ${result.transcript}`);
            for (const message of result.responseMessages) {
                if (message.text) {
                    console.log(`Agent Response: ${message.text.text}`);
                }
            }
            if (result.match.intent) {
                console.log(`Matched Intent: ${result.match.intent.displayName}`);
            }
            console.log(`Current Page: ${result.currentPage.displayName}`);
        }
    });

const initialStreamRequest = {
        session: sessionPath,
        queryInput: {
            audio: {
                config: {
                    audioEncoding: encoding,
                    sampleRateHertz: sampleRateHertz,
                    singleUtterance: true,
                },
            },
            languageCode: languageCode,
        }
    };
detectStream.write(initialStreamRequest);

I tried streaming audio both from a file (.wav) and from the microphone, but the result is the same.

await pump(
        recordingStream, // microphone stream <OR> fs.createReadStream(audioFileName),
        // Format the audio stream into the request format.
        new Transform({
            objectMode: true,
            transform: (obj, _, next) => {
                next(null, {queryInput: {audio: {audio: obj}}});
            },
        }),
        detectStream
    );

I also referred to this implementation and this rpc based doc, but couldn't find any reason why this isn't working.

Thanks!

Answer 1

According to the documentation, this appears to be the expected behavior:

When Dialogflow detects the audio's voice has stopped or paused, it ceases speech recognition and sends a StreamingDetectIntentResponse with a recognition result of END_OF_SINGLE_UTTERANCE to your client. Any audio sent to Dialogflow on the stream after receipt of END_OF_SINGLE_UTTERANCE is ignored by Dialogflow.

This appears to be why StreamingDetectIntent doesn't process audio after matching the first intent. According to the same documentation:

After closing a stream, your client should start a new request with a new stream as needed

You should start another stream. You can also look at the other GitHub issues on the same topic.
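
As a rough illustration of "start another stream", here is a minimal sketch that reopens the streaming call each time a detectIntentResponse arrives. It assumes client is the SessionsClient from the question and that initialRequest is the same config-only request (initialStreamRequest). The names createUtteranceLoop, writeAudio and stop are hypothetical helpers, not part of the library. Restarting on detectIntentResponse is my assumption about the right moment; it mirrors the workaround described in the question.

```javascript
// Sketch only: reopen the streaming call after each detected intent.
// `client` is expected to expose streamingDetectIntent(), like the
// SessionsClient in the question; all other names are hypothetical.
function createUtteranceLoop(client, initialRequest, onResponse) {
  let stream = null;
  let stopped = false;

  function open() {
    const current = client.streamingDetectIntent();
    stream = current;
    current.on('error', console.error);
    current.on('data', data => {
      onResponse(data);
      // Assumption: detectIntentResponse is the last useful message of an
      // utterance, so this stream is closed and a fresh one is opened.
      if (data.detectIntentResponse && current === stream && !stopped) {
        current.end();
        open();
      }
    });
    // Every new stream starts with the session and audio config request.
    current.write(initialRequest);
  }

  open();

  return {
    // Wrap a raw audio chunk in the same request shape the question uses.
    writeAudio(chunk) {
      if (!stopped) stream.write({queryInput: {audio: {audio: chunk}}});
    },
    // Close the current stream and stop opening new ones.
    stop() {
      if (stopped) return;
      stopped = true;
      stream.end();
    },
  };
}
```

With the question's variables this would be used as createUtteranceLoop(client, initialStreamRequest, handler), feeding each microphone or file chunk to writeAudio instead of piping straight into a single detectStream. Audio that arrives between the end of an utterance and the restart is still lost, so a real application may want to buffer it.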


