Azure "Text-To-Speech" returns "Invalid CID or language". What does it mean?

Question

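I am trying to post to the Azure Text-to-Speech service. I have already obtained an access token, and now I am trying to make the call that converts text to speech (using Best HTTP in Unity):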

        HTTPRequest request = new HTTPRequest(new Uri(APIEndpointURL), HTTPMethods.Post, _GotTextToSpeechResponse);

        request.AddHeader("Authorization", "Bearer " + accessToken);
        request.AddHeader("Content-Type", "application/ssml+xml");
        request.AddHeader("X-Microsoft-OutputFormat", "raw-16khz-16bit-mono-pcm");
        request.AddHeader("User-Agent", "My app name");

        request.RawData = Encoding.UTF8.GetBytes("Hello world!");
        request.Send();

This returns status code 400 with the following content:

{"Message":"Invalid CID or language"}

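The documentation says that if I don't define a language and just send text, it should use the default voice. Also, "User-Agent" should be the "Application name". The documentation doesn't say whether this has to be predefined somewhere or what it refers to.

What does the error mean and how do I fix it? Am I doing something wrong by posting the text as "raw data"? The documentation says I should post the text in the request body.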

Answer 1

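A few things are unclear in the documentation.

If you look closely at the sample provided here:

Endpoint

You want to perform text-to-speech (generate speech for your Hello world! text), but you are calling an stt (speech-to-text) recognition endpoint: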

https://westeurope.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1

To use tts, the endpoint should have the same format as in the sample:

https://westeurope.tts.speech.microsoft.com/cognitiveservices/v1
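
To make the stt/tts mix-up harder to repeat, the endpoint can be built from the region name instead of being pasted in by hand. This is only a minimal sketch: `SpeechEndpoints`, `TextToSpeech` and `IsTextToSpeech` are hypothetical helper names, not part of Best HTTP or of any Azure SDK, and the URL pattern is simply generalized from the westeurope example above.

```csharp
using System;

// Hypothetical helper (not part of any SDK): builds and checks Speech endpoints.
public static class SpeechEndpoints
{
    // Builds the text-to-speech endpoint for a region, assuming the
    // "<region>.tts.speech.microsoft.com/cognitiveservices/v1" pattern shown above.
    public static Uri TextToSpeech(string region)
    {
        if (string.IsNullOrWhiteSpace(region))
        {
            throw new ArgumentException("Region must not be empty.", "region");
        }
        return new Uri("https://" + region + ".tts.speech.microsoft.com/cognitiveservices/v1");
    }

    // Cheap sanity check: true only when the host is a tts host, false for stt hosts.
    public static bool IsTextToSpeech(Uri endpoint)
    {
        return endpoint.Host.EndsWith(".tts.speech.microsoft.com", StringComparison.OrdinalIgnoreCase);
    }
}
```

In the code from the question, `new Uri(APIEndpointURL)` could then be replaced with `SpeechEndpoints.TextToSpeech("westeurope")`, and `IsTextToSpeech` can guard against an stt URL before the request is sent.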

Request content

Regarding the fact that you don't want to use SSML, the documentation states:

Text is sent as the body of an HTTP POST request. It can be plain text (ASCII or UTF-8) or Speech Synthesis Markup Language (SSML) format (UTF-8). Plain text requests use the Speech Service's default voice and language. With SSML you can specify the voice and language.

So I tried the following: changing the content type from "application/ssml+xml" to "text/plain". But in that case, I got:

Error 400 Data at the root level is invalid. Line 1, position 1.

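This is a common error when parsing XML, which suggests the service still tries to read the body as XML. So there seems to be a mistake somewhere here, and I could not find an example in the documentation of using TTS without SSML.

Someone has posted an issue about this in the feedback section of the page (under Next steps, here).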
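
Since the plain-text route fails with an XML parsing error, one workaround is to keep "application/ssml+xml" and wrap the text in a small SSML document before sending it. The sketch below is an illustration under assumptions, not a confirmed fix: `SsmlBuilder`, `Wrap` and `ToRequestBody` are hypothetical names, the `speak`/`voice` layout follows the W3C SSML namespace, and the voice name has to be taken from the service's voice list for your region.

```csharp
using System.Security;
using System.Text;

// Hypothetical helper (not part of Best HTTP or any Azure SDK).
public static class SsmlBuilder
{
    // Wraps plain text in a minimal SSML document.
    // language is e.g. "en-US"; voiceName must be a voice the service offers (assumption: caller supplies it).
    public static string Wrap(string text, string language, string voiceName)
    {
        // SecurityElement.Escape replaces <, >, &, " and ' with XML entities,
        // so arbitrary input text cannot break the markup.
        return "<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='"
             + SecurityElement.Escape(language) + "'>"
             + "<voice name='" + SecurityElement.Escape(voiceName) + "'>"
             + SecurityElement.Escape(text)
             + "</voice></speak>";
    }

    // Encodes the SSML document as UTF-8 bytes, ready to be used as the request body.
    public static byte[] ToRequestBody(string text, string language, string voiceName)
    {
        return Encoding.UTF8.GetBytes(Wrap(text, language, voiceName));
    }
}
```

With this in place, the question's `request.RawData = Encoding.UTF8.GetBytes("Hello world!");` would become `request.RawData = SsmlBuilder.ToRequestBody("Hello world!", "en-US", voiceName);`, where `voiceName` is a placeholder for a real voice name, so that the body actually matches the declared content type.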
