IBM Watson Text to Speech audio file cannot be played after synthesizing

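What I'm doing is writing the audio output to a file, waiting until the file exists and its size is not 0, and then playing it. I have tried many different libraries (subprocess, playsound, pygame, vlc, etc.) and many different file types (mp3, wav, etc.), but for some reason I get an error saying that the file was not closed or is corrupted. Occasionally it plays once, but as soon as another Watson-generated mp3 is played, it errors out again. Does anyone know a solution?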

...
from ibm_watson import TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
...
authenticator = IAMAuthenticator(ibmApiKey);
textToSpeech = TextToSpeechV1(authenticator = authenticator);
textToSpeech.set_service_url(ibmServiceUrl);
...
    file = str(int(random.random() * 100000)) + ".mp3";
    with open(file, "wb") as audioFile:
        audioFile.write(textToSpeech.synthesize(text, voice = "en-GB_JamesV3Voice", accept = "audio/mp3").get_result().content);

    fileExists = False;

    while (fileExists == False):
        if (os.path.isfile(file)):
            fileExists = os.stat(file).st_size != 0;
            playsound(file);
            os.remove(file);

    Error 263 for command:
        open temp/77451.mp3
    The specified device is not open or is not recognized by MCI.

    Error 263 for command:
        close temp/77451.mp3
    The specified device is not open or is not recognized by MCI.
Failed to close the file: temp/77451.mp3
Traceback (most recent call last):
  File "main.py", line 457, in <module>
    runMain(name, config.get("main", "callName"), voice);
  File "main.py", line 156, in runMain
    speak("The time is: " + datetime.now().strptime(datetime.now().time().strftime("%H:%M"), "%H:%M").strftime("%I:%M %p"), voice);
  File "main.py", line 123, in speak
    playsound(file);
  File "C:\Users\turtsis\AppData\Local\Programs\Python\Python35-32\lib\site-packages\playsound.py", line 72, in _playsoundWin
    winCommand(u'open {}'.format(sound))
  File "C:\Users\turtsis\AppData\Local\Programs\Python\Python35-32\lib\site-packages\playsound.py", line 64, in winCommand
    raise PlaysoundException(exceptionMessage)
playsound.PlaysoundException:
    Error 263 for command:
        open temp/77451.mp3
    The specified device is not open or is not recognized by MCI.

The error could be in several places.

First, I would try this:

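from ibm_watson import ApiException

# textToSpeech and text are the objects from the question's code.
try:
    file = str(int(random.random() * 100000)) + ".mp3"
    # The with block closes the file as soon as the write finishes.
    with open(file, "wb") as audioFile:
        audioFile.write(textToSpeech.synthesize(text, voice="en-GB_JamesV3Voice", accept="audio/mp3").get_result().content)
except ApiException as ex:
    # A failed Watson call surfaces here instead of ending the program.
    print("Method failed with status code " + str(ex.code) + ": " + ex.message)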

If the call to Watson returns an error, the uncaught exception may be what is kicking you out of the runtime.
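To help tell a bad download apart from a playback problem, it may be worth looking at the bytes Watson returned before writing them to disk. The following is only a rough sketch: looks_like_mp3 is a hypothetical helper, not part of the Watson SDK, and it checks just the first few bytes (an ID3v2 tag or an MPEG frame sync), so it cannot prove that a file is fully valid.

```python
def looks_like_mp3(data):
    """Rough check that `data` starts like an MP3 stream (not a full validation)."""
    if len(data) < 3:
        return False
    # Many MP3 files begin with an ID3v2 tag.
    if data[:3] == b"ID3":
        return True
    # Otherwise expect an MPEG audio frame sync: 0xFF followed by a byte
    # whose top three bits are all set.
    return data[0] == 0xFF and (data[1] & 0xE0) == 0xE0
```

Calling it on the value of .get_result().content would show whether the payload at least resembles MP3 audio. If it does not (for example, if it is empty or starts with "{" like a JSON error body), the problem is probably on the synthesis side rather than in the player.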

However, if the problem is with playsound, I would suggest this route:

import os
import random

import pyttsx3
from ibm_watson import ApiException

engine = pyttsx3.init()
try:
    file = str(int(random.random() * 100000)) + ".mp3"
    # The with block closes the file before anything tries to read it.
    with open(file, "wb") as audioFile:
        audioFile.write(textToSpeech.synthesize(text, voice="en-GB_JamesV3Voice", accept="audio/mp3").get_result().content)

    # Wait until the file exists and is non-empty.
    fileExists = False
    while not fileExists:
        if os.path.isfile(file):
            fileExists = os.stat(file).st_size != 0

    # Note: pyttsx3's say() speaks the string it is given; it does not play
    # the audio stored at that path.
    engine.say(file)
    engine.runAndWait()
    os.remove(file)
except ApiException as ex:
    print("Method failed with status code " + str(ex.code) + ": " + ex.message)
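Whichever player ends up being used, the file name is one more thing worth ruling out. The traceback in the question shows a relative path (temp/77451.mp3), and a name derived from random.random() can in principle repeat while an earlier file is still in use. Below is a minimal sketch, assuming that an absolute and unique path is acceptable for your setup; new_audio_path is a hypothetical helper, and whether this has any effect on the MCI error is unverified.

```python
import os
import uuid


def new_audio_path(directory, extension=".mp3"):
    """Return an absolute path with a unique file name inside `directory`."""
    # abspath removes any dependence on the current working directory.
    directory = os.path.abspath(directory)
    # uuid4 makes a name collision between two syntheses practically impossible.
    return os.path.join(directory, uuid.uuid4().hex + extension)
```

The result could be used in place of the random-number file name, for example file = new_audio_path("temp"), provided the directory already exists.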

If none of that works, I would try using curl to see whether you can reproduce your scenario:

Replace {apikey} and {url} with your API key and URL.


curl -X POST -u "apikey:{apikey}" --header "Content-Type: application/json" --data "{\"text\":\"hello world\"}" --output hello_world.ogg "{url}/v1/synthesize?voice=en-US_AllisonV3Voice"
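If curl is not available, roughly the same request can be assembled from Python's standard library, which also takes the SDK out of the picture. This is a sketch only: build_synthesize_request and synthesize_to_file are hypothetical helper names, and the url and apikey arguments stand in for the same {url} and {apikey} placeholders as in the curl command.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_synthesize_request(url, apikey, text, voice="en-US_AllisonV3Voice"):
    """Build (but do not send) the same POST that the curl command issues."""
    query = urllib.parse.urlencode({"voice": voice})
    endpoint = url.rstrip("/") + "/v1/synthesize?" + query
    # curl's -u "apikey:{apikey}" is HTTP Basic auth with the user name "apikey".
    token = base64.b64encode(("apikey:" + apikey).encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Basic " + token,
    }
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


def synthesize_to_file(request, out_path):
    """Send the request and save the response body, like curl's --output."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

For example, synthesize_to_file(build_synthesize_request(ibmServiceUrl, ibmApiKey, "hello world"), "hello_world.ogg") would mirror the curl call. If the resulting file plays in an ordinary media player, the service side is probably fine and the playback code is the more likely culprit.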

Good luck.