Bing text to speech not working in Android
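I cloned the Cognitive-Speech-TTS sample and tested the Android TTS, but it did not work: I heard no result/voice at all. I have already completed the necessary requirements, such as obtaining an API subscription key and managing it. Here is the Logcat output: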
com.microsoft.sdksample W/ResourceType: Found multiple library tables, ignoring...
com.microsoft.sdksample W/ResourceType: Found multiple library tables, ignoring...
com.microsoft.sdksample W/ResourceType: Found multiple library tables, ignoring...
com.microsoft.sdksample D/Authentication: new Access Token: ******************
com.microsoft.sdksample D/OpenGLRenderer: Use EGL_SWAP_BEHAVIOR_PRESERVED: true
com.microsoft.sdksample D/Atlas: Validating map...
com.microsoft.sdksample I/Adreno-EGL: <qeglDrvAPI_eglInitialize:410>: EGL 1.4 QUALCOMM build: AU_LINUX_ANDROID_LA.BF.1.1.1_RB1.05.01.00.042.030_msm8226_LA.BF.1.1.1_RB1__release_AU ()
OpenGL ES Shader Compiler Version: E031.25.03.06
Build Date: 06/10/15 Wed
Local Branch:
Remote Branch: quic/LA.BF.1.1.1_rb1.24
Local Patches: NONE
Reconstruct Branch: AU_LINUX_ANDROID_LA.BF.1.1.1_RB1.05.01.00.042.030 + 6151be1 + NOTHING
com.microsoft.sdksample I/OpenGLRenderer: Initialized EGL, version 1.4
com.microsoft.sdksample D/OpenGLRenderer: Enabling debug mode 0
com.microsoft.sdksample I/Timeline: Timeline: Activity_idle id: android.os.BinderProxy@166d0b8f time:2397803
I solved this problem by using the XmlDom class to build the SSML:
String body = XmlDom.createDom(deviceLanguage, genderName, voiceName, "Your text here");
byte[] xmlBytes = body.getBytes();
urlConnection.setRequestProperty("content-length", String.valueOf(xmlBytes.length));
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmlDom {
    // Builds the SSML document: <speak><voice ...>text</voice></speak>.
    public static String createDom(String locale, String genderName, String voiceName, String textToSynthesize) {
        Document doc = null;
        Element speak, voice;
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = dbf.newDocumentBuilder();
            doc = builder.newDocument();
            if (doc != null) {
                speak = doc.createElement("speak");
                speak.setAttribute("version", "1.0");
                speak.setAttribute("xml:lang", "en-us");
                voice = doc.createElement("voice");
                // The voice element carries the language, gender and voice name.
                voice.setAttribute("xml:lang", locale);
                voice.setAttribute("xml:gender", genderName);
                voice.setAttribute("name", voiceName);
                voice.appendChild(doc.createTextNode(textToSynthesize));
                speak.appendChild(voice);
                doc.appendChild(speak);
            }
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
        return transformDom(doc);
    }

    // Serializes the DOM to a single-line string without the XML declaration.
    private static String transformDom(Document doc) {
        StringWriter writer = new StringWriter();
        try {
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer transformer = tf.newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
        } catch (TransformerException e) {
            e.printStackTrace();
        }
        return writer.getBuffer().toString().replaceAll("\n|\r", "");
    }
}
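The content-length header in the snippet above is taken from the byte array rather than from the string, which matters as soon as the text to synthesize contains non-ASCII characters. Below is a minimal sketch of that point, assuming the body is sent as UTF-8. The original code calls getBytes() with the platform default charset, which as far as I know is UTF-8 on Android. SsmlBody and its methods are hypothetical names and are not part of the Cognitive-Speech-TTS sample.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Cognitive-Speech-TTS sample.
final class SsmlBody {
    private SsmlBody() {}

    // Encode the SSML explicitly as UTF-8 instead of relying on the default charset.
    static byte[] toUtf8(String ssml) {
        return ssml.getBytes(StandardCharsets.UTF_8);
    }

    // Value for the content-length header: the number of encoded bytes, not chars.
    static String contentLength(String ssml) {
        return String.valueOf(toUtf8(ssml).length);
    }

    public static void main(String[] args) {
        // "\u00e9" is a single char but two bytes in UTF-8.
        String ssml = "<speak version='1.0' xml:lang='fr-FR'>caf\u00e9</speak>";
        System.out.println("chars: " + ssml.length());
        System.out.println("bytes: " + contentLength(ssml));
    }
}
```

With this helper the request would use urlConnection.setRequestProperty("content-length", SsmlBody.contentLength(body)) and write SsmlBody.toUtf8(body) to the output stream, so the declared length and the bytes actually sent always agree.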
Update:
After getting the SSML from the XmlDom class, I found that the voice tag in the SSML needs to specify xml:lang='YOUR_LANGUAGE_HERE'. For example:
<speak version='1.0' xml:lang='en-US'><voice xml:lang='en-US' xml:gender='Female' name='Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)'>This is a demo of Microsoft Cognitive Services Text to Speech API.</voice></speak>
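Since a missing xml:lang on the voice tag produces no audible error, it can help to check the SSML before sending it. The following is only a sketch of such a check, not something the sample provides: SsmlCheck and everyVoiceHasLang are hypothetical names, and the rule it enforces (every voice element has a non-empty xml:lang) is taken from the observation above rather than from the service documentation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical pre-flight check, not part of the Cognitive-Speech-TTS sample.
final class SsmlCheck {
    private SsmlCheck() {}

    // Returns true only if the SSML parses, contains at least one <voice>
    // element, and every <voice> element has a non-empty xml:lang attribute.
    static boolean everyVoiceHasLang(String ssml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(ssml)));
            NodeList voices = doc.getElementsByTagName("voice");
            if (voices.getLength() == 0) {
                return false;
            }
            for (int i = 0; i < voices.getLength(); i++) {
                Element voice = (Element) voices.item(i);
                // getAttribute returns an empty string when the attribute is absent.
                if (voice.getAttribute("xml:lang").isEmpty()) {
                    return false;
                }
            }
            return true;
        } catch (Exception e) {
            // Malformed XML or parser setup failure: treat as invalid.
            return false;
        }
    }

    public static void main(String[] args) {
        String ssml = "<speak version='1.0' xml:lang='en-US'>"
                + "<voice xml:lang='en-US' xml:gender='Female' name='example'>Hello</voice>"
                + "</speak>";
        System.out.println(everyVoiceHasLang(ssml));
    }
}
```

Calling SsmlCheck.everyVoiceHasLang(body) right after XmlDom.createDom(...) and logging a warning when it returns false makes this failure visible in Logcat instead of silent.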