Synthesize more than 1500 characters using AWS Polly?

Question

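My idea is to use AWS Polly to read some news aloud from an RSS feed. According to this link, I understood that Polly is very flexible about the number of characters it can convert, since one of the examples is "Adventures of Huckleberry Finn" by Mark Twain (~600k characters). The problem is that when I try to convert an article to speech, I get the following error: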

An error occurred (TextLengthExceededException) when calling the SynthesizeSpeech operation: Maximum text length has been exceeded

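The text I am trying to convert is about 5000 characters long.

Is there any way (with or without the API) to convert long strings of text with Polly without having to chop them into a million different pieces?

Any hint in the right direction would be greatly appreciated.

Thanks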

Answer 1

The size of the input text can be up to 1500 billed characters (3000 total characters). SSML tags are not counted as billed characters.

http://docs.aws.amazon.com/polly/latest/dg/limits.html

The pricing examples seem intended to give a sense of the relatively low cost of voicing a large work, but the work would actually need to be divided into groups of sentences and submitted to the API piece by piece. The API is the only interface: the SDKs and the CLI call the same SynthesizeSpeech API.
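Dividing the text into groups of sentences could look roughly like the sketch below. It is only an illustration: the function name `split_into_chunks`, the regex-based sentence splitting, and the default of 1500 characters are assumptions made here, not part of any AWS SDK, and plain character counts only approximate Polly's billed characters.

```python
import re


def split_into_chunks(text: str, max_chars: int = 1500) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Fallback: a single sentence longer than the limit is cut hard.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            # The sentence still fits into the chunk being built.
            current = candidate
        else:
            # Close the current chunk and start a new one with this sentence.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent to SynthesizeSpeech as a separate request. Splitting at sentence boundaries rather than at a fixed offset keeps the voice from being cut off mid-sentence.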

Answer 2

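I don't have a specific tip that avoids splitting the text into parts, but I wrote an article describing a way to do this in NodeJS. If you have no other option, feel free to take a look and leave a comment!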

How to handle more than 1500 characters with AWS Polly text-to-speech

Answer 3

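I'm sure you have already found an answer to this or have moved on by now, but I'd like to help anyone who runs into this problem in the future.

I ran into the same problem with AWS Polly not allowing me to send more than 1500 characters at a time. So I wrote some JavaScript that splits the text into 230-word chunks, sends them to the API one after another, and then stitches all the mp3 files together before buffering and playing the result.

Here is my GitHub: https://github.com/Aaronbest94/Polly-Character-Limitations

It is not the most elegant JavaScript, but it works, and I hope it helps anyone who reads this later.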
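The same chunk, synthesize, and stitch idea can be sketched in Python. This is not the code from the repository above. The helper names (`chunk_by_words`, `synthesize_long_text`, `polly_synthesize`) and the voice `Joanna` are assumptions chosen for illustration; the only AWS call used is boto3's `synthesize_speech`. Note that 230 words is a heuristic: with unusually long words a chunk could still exceed the character limit.

```python
from typing import Callable


def chunk_by_words(text: str, words_per_chunk: int = 230) -> list[str]:
    """Split text into chunks of at most words_per_chunk whitespace-separated words."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def synthesize_long_text(
    text: str,
    synthesize: Callable[[str], bytes],
    words_per_chunk: int = 230,
) -> bytes:
    """Synthesize each chunk in order and concatenate the resulting audio bytes."""
    return b"".join(
        synthesize(chunk) for chunk in chunk_by_words(text, words_per_chunk)
    )


def polly_synthesize(chunk: str, voice_id: str = "Joanna") -> bytes:
    """Send one chunk to Polly and return the MP3 bytes (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=chunk, OutputFormat="mp3", VoiceId=voice_id
    )
    return response["AudioStream"].read()
```

With credentials configured, `open("article.mp3", "wb").write(synthesize_long_text(article_text, polly_synthesize))` would write one combined file, where `article_text` is your own string. Simple byte concatenation of MP3 segments usually plays back fine, but that is an assumption worth checking in your target player.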

Answer 4

The documentation describes how to create long audio files: https://docs.aws.amazon.com/polly/latest/dg/longer-cli.html

An AWS CLI call might look like this:

aws polly start-speech-synthesis-task \
--region eu-central-1 \
--endpoint-url "https://polly.eu-central-1.amazonaws.com/" \
--output-format mp3 \
--output-s3-bucket-name your-bucket-name \
--output-s3-key-prefix optional/prefix/path/file \
--voice-id Hans \
--text-type ssml \
--text file://output.xml \
--speech-mark-types='["sentence", "word", "ssml"]'

As you can see, you will need an S3 bucket for (temporary) storage.
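For anyone calling this from Python rather than the CLI, a rough boto3 equivalent might look like the following sketch. The function name `synthesize_to_s3`, its parameters, and the polling interval are assumptions made for illustration; the boto3 calls used are `start_speech_synthesis_task` and `get_speech_synthesis_task`. It sends plain text as mp3 and leaves out the SSML and speech-mark options from the command above.

```python
import time


def synthesize_to_s3(
    polly,
    text: str,
    bucket: str,
    voice_id: str = "Hans",
    key_prefix: str = "",
    poll_seconds: float = 5.0,
) -> str:
    """Start an asynchronous synthesis task, wait for it, and return its output URI."""
    params = {
        "OutputFormat": "mp3",
        "OutputS3BucketName": bucket,
        "Text": text,
        "VoiceId": voice_id,
    }
    if key_prefix:
        params["OutputS3KeyPrefix"] = key_prefix
    task = polly.start_speech_synthesis_task(**params)["SynthesisTask"]
    # Poll until the task leaves the 'scheduled' / 'inProgress' states.
    while task["TaskStatus"] in ("scheduled", "inProgress"):
        time.sleep(poll_seconds)
        task = polly.get_speech_synthesis_task(TaskId=task["TaskId"])["SynthesisTask"]
    if task["TaskStatus"] != "completed":
        reason = task.get("TaskStatusReason", "unknown reason")
        raise RuntimeError(f"Synthesis task failed: {reason}")
    return task["OutputUri"]
```

A call could look like `synthesize_to_s3(boto3.client("polly", region_name="eu-central-1"), article_text, "your-bucket-name")`, assuming `boto3` is imported, credentials allow Polly to write to that bucket, and `article_text` is your own string. Passing the client in as a parameter keeps the function easy to exercise with a stub.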

amazon-web-services

amazon-polly