Only getting the audio of the last request when doing multiple requests at once using Google's Text to Speech API

Question

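When doing multiple requests at once using Promise.all, I seem to only get the audioContent of the last resolved request.

I'm synthesizing large texts and need to split them up because of the API's character limit.

I had this working before, so I know it should work, but it recently stopped working.

I'm doing exactly the same thing with Amazon's Polly, and there it works. It is the same code, only with a different client and different request options.

So that made me think it might be a library thing? Or a Google service issue?

I'm using the latest version of: https://github.com/googleapis/nodejs-text-to-speech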

export const googleSsmlToSpeech = async (
  index: number,
  ssmlPart: string,
  type: SynthesizerType,
  identifier: string,
  synthesizerOptions: GoogleSynthesizerOptions,
  storageUploadPath: string
) => {
  let extension = 'mp3';

  if (synthesizerOptions.audioConfig.audioEncoding === 'OGG_OPUS') {
    extension = 'opus';
  }

  if (synthesizerOptions.audioConfig.audioEncoding === 'LINEAR16') {
    extension = 'wav';
  }

  synthesizerOptions.input.ssml = ssmlPart;

  const tempLocalAudiofilePath = `${appRootPath}/temp/${storageUploadPath}-${index}.${extension}`;

  try {
    // Make sure the path exists, if not, we create it
    await fsExtra.ensureFile(tempLocalAudiofilePath);

    // Performs the Text-to-Speech request
    const [response] = await client.synthesizeSpeech(synthesizerOptions);

    // Write the binary audio content to a local file
    await fsExtra.writeFile(tempLocalAudiofilePath, response.audioContent, 'binary');

    return tempLocalAudiofilePath;
  } catch (err) {
    throw err;
  }
};

/**
 * Synthesizes the SSML parts into separate audio files
 */
export const googleSsmlPartsToSpeech = async (
  ssmlParts: string[],
  type: SynthesizerType,
  identifier: string,
  synthesizerOptions: GoogleSynthesizerOptions,
  storageUploadPath: string
) => {
  const promises: Promise<string>[] = [];

  ssmlParts.forEach((ssmlPart: string, index: number) => {
    promises.push(googleSsmlToSpeech(index, ssmlPart, type, identifier, synthesizerOptions, storageUploadPath));
  });

  const tempAudioFiles = await Promise.all(promises);

  // No sort needed: Promise.all resolves to results in the same order as the input promises.
  // (The previous comparator, b - a, yields NaN for string paths and therefore never reordered anything.)

  return tempAudioFiles;
};

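The code above creates multiple files with the correct naming and index number. However, they all contain the same speech: the audio response that resolved the fastest.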

824163ed-b4d9-4830-99da-6e6f985727e2-0.mp3
824163ed-b4d9-4830-99da-6e6f985727e2-1.mp3
824163ed-b4d9-4830-99da-6e6f985727e2-2.mp3

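Replacing the Promise.all with a simple for loop makes it work. But this takes much longer, because it waits for each request to resolve before starting the next one. I know Promise.all can work, because I had it working before, and I would like to see it working again.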

  const tempAudioFiles = [];
  for (var i = 0; i < ssmlParts.length; i++) {
    tempAudioFiles[i] = await googleSsmlToSpeech(i, ssmlParts[i], type, identifier, synthesizerOptions, storageUploadPath);
  }

I just can't seem to get it to work with Promise.all anymore.

Answer 1

Got it working. It seems the library does things differently than I thought. Creating a copy of synthesizerOptions using Object.assign did the trick.

Working code: https://github.com/googleapis/nodejs-text-to-speech/issues/210#issuecomment-487832411

ssmlParts.forEach((ssmlPart: string, index: number) => {
  const synthesizerOptionsCopy = Object.assign({}, synthesizerOptions);
  promises.push(googleSsmlToSpeech(index, ssmlPart, type, identifier, synthesizerOptionsCopy, storageUploadPath));
});

// Inside googleSsmlToSpeech()
const ssmlPartSynthesizerOptions = Object.assign(synthesizerOptions, {
  input: {
    ssml: ssmlPart
  }
});
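
A possible explanation for why the copy helps, based only on the code in the question: an async function runs synchronously up to its first await. With Promise.all, every call to googleSsmlToSpeech therefore assigns synthesizerOptions.input.ssml on the same shared object before any of them reaches client.synthesizeSpeech, so each request ends up carrying the last part. The sequential for loop avoids this because each call finishes before the next one starts. The sketch below reproduces the effect without the Google client. The names SynthOptions, fakeSynthesize, synthesizeShared and synthesizeCopied are hypothetical stand-ins for illustration, not part of any library.

```typescript
// Hypothetical stand-in for the request options used in the question.
interface SynthOptions {
  input: { ssml: string };
  audioConfig: { audioEncoding: string };
}

// Hypothetical stand-in for client.synthesizeSpeech():
// after a short delay it echoes back whatever SSML the request holds.
const fakeSynthesize = async (request: SynthOptions): Promise<string> => {
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  return `audio(${request.input.ssml})`;
};

// Mirrors the question: mutate the shared options, await something, then send.
const synthesizeShared = async (part: string, options: SynthOptions): Promise<string> => {
  options.input.ssml = part; // runs synchronously for every call before any request is sent
  await Promise.resolve(); // stands in for fsExtra.ensureFile()
  return fakeSynthesize(options);
};

// Mirrors the fix: build a fresh request object with its own input per part.
const synthesizeCopied = async (part: string, options: SynthOptions): Promise<string> => {
  const request: SynthOptions = { ...options, input: { ssml: part } };
  await Promise.resolve();
  return fakeSynthesize(request);
};

const demo = async (): Promise<{ shared: string[]; copied: string[] }> => {
  const parts = ['a', 'b', 'c'];
  const base: SynthOptions = { input: { ssml: '' }, audioConfig: { audioEncoding: 'MP3' } };

  const shared = await Promise.all(parts.map((part) => synthesizeShared(part, base)));
  const copied = await Promise.all(parts.map((part) => synthesizeCopied(part, base)));

  return { shared, copied };
};

demo().then(({ shared, copied }) => {
  console.log(shared); // [ 'audio(c)', 'audio(c)', 'audio(c)' ]
  console.log(copied); // [ 'audio(a)', 'audio(b)', 'audio(c)' ]
});
```

Note that a shallow copy alone (Object.assign({}, synthesizerOptions)) still shares the nested input object. The working code above is safe because it also replaces input with a new object on each copy.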

