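Speech Synthesis REST API 1.0

Last updated: 2018-12-10 10:59:22

This document covers Intelligent Speech Interaction 1.0. New users should use Intelligent Speech Interaction 2.0.

1. Introduction

The speech synthesis REST API accepts the text to synthesize, encoded as UTF-8, through an HTTP POST request. The server can return the generated speech as 16-bit PCM or A-law encoded audio, or as a 16-bit WAV/PCM file. The request URL of the speech synthesis service is: https://nlsapi.aliyun.com/speak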

2. Request format

  • Example speech synthesis request

    POST http://nlsapi.aliyun.com/speak?encode_type=pcm&voice_name=xiaoyun&volume=50
    Date: Tue, 21 Mar 2017 11:42:00 GMT
    Content-type: text/plain
    Authorization: Dataplus *****
    Accept: audio/pcm, application/json
    Content-Length: 36

    Text to be converted

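2.1 Parameters

When handling a request, the speech synthesis service checks the URL for the following parameters. Any parameter that is not set falls back to its default value.

Name                     Type    Required  Default  Description
encode_type              String  Optional  pcm      Encoding format of the synthesized speech: pcm, wav, mp3, or alaw
voice_name               String  Optional  xiaoyun  Voice: xiaogang (male) or xiaoyun (female)
volume                   int     Optional  50       Volume, 0 to 100
sample_rate              int     Optional  16000    Sampling rate, 8000 or 16000
speech_rate              int     Optional  0        Speech rate, -500 to 500
pitch_rate               int     Optional  0        Pitch, -500 to 500
tts_nus                  int     Optional  1        0: synthesize speech from parameters; 1: concatenate original recordings
background_music_id      int     Optional  -        Optional background music played with the speech, 0 or 1
background_music_offset  int     Optional  0        Offset into the background music, in milliseconds; takes effect only when background music is enabled
background_music_volume  int     Optional  50       Background music volume, 0 to 100; takes effect only when background music is enabled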
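
To illustrate how these parameters end up in the request URL, the sketch below assembles a query string from a map of parameter names and values. `TtsUrlBuilder` and `buildUrl` are hypothetical helper names, not part of any official SDK; the endpoint is the address given in the Introduction. Parameters that are left out of the map simply rely on the server-side defaults listed above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of any official SDK.
class TtsUrlBuilder {
    static final String ENDPOINT = "https://nlsapi.aliyun.com/speak";

    // Appends each entry as name=value, URL-encoding both sides.
    static String buildUrl(Map<String, String> params) {
        StringBuilder sb = new StringBuilder(ENDPOINT);
        char separator = '?';
        try {
            for (Map.Entry<String, String> entry : params.entrySet()) {
                sb.append(separator)
                  .append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(entry.getValue(), "UTF-8"));
                separator = '&';
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("encode_type", "wav");
        params.put("voice_name", "xiaoyun");
        params.put("volume", "50");
        System.out.println(buildUrl(params));
    }
}
```

A `LinkedHashMap` keeps the parameters in insertion order, which makes the resulting URL predictable when logging or comparing requests.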

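2.2 Text to synthesize

Encode the text to synthesize as UTF-8 and upload it in the POST body. A single request is limited to 300 UTF-8 characters; each Chinese character, digit, or letter counts as one character.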
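
Because of the 300-character limit, longer texts have to be sent as several requests. The sketch below shows one possible way to split text on the client side. It counts Unicode code points, on the assumption that this is what the service means by "UTF-8 characters"; `TextChunker` and `split` are hypothetical names, and a real client would probably prefer to break at sentence boundaries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of any official SDK.
class TextChunker {
    static final int MAX_CHARS = 300;

    // Splits text into pieces of at most maxChars code points each.
    static List<String> split(String text, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int remaining = text.codePointCount(start, text.length());
            int take = Math.min(maxChars, remaining);
            // offsetByCodePoints never cuts a surrogate pair in half.
            int end = text.offsetByCodePoints(start, take);
            chunks.add(text.substring(start, end));
            start = end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 650; i++) {
            sb.append('字');
        }
        for (String chunk : split(sb.toString(), MAX_CHARS)) {
            System.out.println(chunk.codePointCount(0, chunk.length()));
        }
    }
}
```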

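2.3 HTTP Header

Name           Type    Required  Description
Authorization  String  Required  Authentication; see 2.3.1
Content-type   String  Required  text/plain
Accept         String  Required  Fixed as audio/*, application/json; replace the asterisk with the desired encoding format: pcm, wav, or alaw
Date           String  Required  Used for authentication; GMT time as specified by HTTP 1.1, for example: Wed, 05 Sep 2012 23:00:00 GMT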
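
The Date value has to be in the HTTP 1.1 GMT format, and the same string is used again when the signature is computed. A minimal sketch of producing it with `java.time` follows; `HttpDate` is a hypothetical class name. The pattern is fixed to `Locale.US` so that day and month abbreviations do not depend on the machine's locale.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for illustration; not part of any official SDK.
class HttpDate {
    // Two-digit day, English abbreviations, literal "GMT" suffix.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                             .withZone(ZoneOffset.UTC);

    static String format(Instant instant) {
        return FORMAT.format(instant);
    }

    public static void main(String[] args) {
        // prints "Tue, 21 Mar 2017 11:42:00 GMT"
        System.out.println(format(Instant.parse("2017-03-21T11:42:00Z")));
    }
}
```

Format the date once per request and reuse that exact string for both the Date header and the signature, since any difference between the two would make the signature check fail.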

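2.3.1 Authorization Header

Every call to the Alibaba Intelligent Speech Interaction platform must pass strict authentication. Before processing a request, the server validates the Authorization header to make sure the request has not been maliciously tampered with or replaced in transit.

    Authorization: Dataplus access_id:signature

The Authorization value starts with the fixed string Dataplus, followed by the access_id obtained from Alibaba Cloud and the computed signature, separated by a colon (:). As the demo code in 4.3 shows, the signature is Base64-encoded before it is added to the header, while the access_id is sent as is.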

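2.3.1.1 Computing the signature

The procedure differs slightly from the standard Alibaba Cloud signature specification. First compute the MD5 digest of the body and Base64-encode it. Then join the method, Accept, the encoded body digest, Content-Type, and Date with newline characters to form the feature string. Finally, sign the feature string with HMAC-SHA1, using the access key secret obtained from Alibaba Cloud (access_secret below), to produce the signature. The main difference from the standard method is that the URL path is not included in the feature string.

    // 1. MD5 digest of the body, then Base64-encode it
    String bodyMd5 = MD5Base64(body);
    // 2. Build the feature string
    String feature = method + "\n" + accept + "\n" + bodyMd5 + "\n" + content_type + "\n" + date;
    // 3. Sign the feature string with HMAC-SHA1
    String signature = HMACSha1(feature, access_secret);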

2.3.1.2 Computing MD5 + Base64

    public static String MD5Base64(String s) throws UnsupportedEncodingException {
        if (s == null)
            return null;
        String encodeStr = "";
        // The string must be encoded as UTF-8
        byte[] utfBytes = s.getBytes("UTF-8");
        MessageDigest mdTemp;
        try {
            mdTemp = MessageDigest.getInstance("MD5");
            mdTemp.update(utfBytes);
            byte[] md5Bytes = mdTemp.digest();
            // java.util.Base64 (Java 8+) replaces the internal sun.misc.BASE64Encoder
            encodeStr = java.util.Base64.getEncoder().encodeToString(md5Bytes);
        } catch (Exception e) {
            throw new Error("Failed to generate MD5 : " + e.getMessage());
        }
        return encodeStr;
    }

2.3.1.3 Computing HMAC-SHA1

    public static String HMACSha1(String data, String key) {
        String result;
        try {
            // Use an explicit charset instead of the platform default
            SecretKeySpec signingKey = new SecretKeySpec(key.getBytes("UTF-8"), "HmacSHA1");
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(signingKey);
            byte[] rawHmac = mac.doFinal(data.getBytes("UTF-8"));
            // java.util.Base64 (Java 8+) replaces the internal sun.misc.BASE64Encoder
            result = java.util.Base64.getEncoder().encodeToString(rawHmac);
        } catch (Exception e) {
            throw new Error("Failed to generate HMAC : " + e.getMessage());
        }
        return result;
    }
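
The steps in 2.3.1.1 to 2.3.1.3 can be combined into one self-contained sketch that produces the complete Authorization header value. `DataplusSigner` and its method names are hypothetical, and the credentials in `main` are placeholders. The Accept, Content-Type, and Date strings passed in are assumed to be exactly the ones sent as headers; the demo in 4.3, for instance, uses `audio/wav,application/json` without a space after the comma.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for illustration; not part of any official SDK.
class DataplusSigner {
    // Step 1: MD5 digest of the body, Base64-encoded.
    static String md5Base64(byte[] body) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(body);
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Step 3: HMAC-SHA1 over the feature string, Base64-encoded.
    static String hmacSha1Base64(String data, String secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] raw = mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Step 2 plus header assembly: note that no URL path is part of the feature string.
    static String authorization(String accessId, String accessSecret, String method,
                                String accept, String contentType, String date, byte[] body) {
        String feature = method + "\n" + accept + "\n" + md5Base64(body) + "\n"
                + contentType + "\n" + date;
        return "Dataplus " + accessId + ":" + hmacSha1Base64(feature, accessSecret);
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
        // Placeholder credentials; substitute real values when calling the service.
        System.out.println(authorization("my_access_id", "my_access_secret", "POST",
                "audio/pcm,application/json", "text/plain",
                "Tue, 21 Mar 2017 11:42:00 GMT", body));
    }
}
```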

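2.3.2 Content-Type Header

The Content-Type header must be text/plain.

2.3.3 Accept Header

The Accept header must allow both audio and JSON responses. When speech synthesis succeeds (including cases where it fails partway through), the server returns the audio stream in the format requested. When the request fails, the server returns a JSON string.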

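3. Speech synthesis response

When a speech synthesis request succeeds, the server returns the speech as 16-bit PCM or A-law encoded audio, or as a 16-bit WAV/PCM file. If an error occurs during synthesis, the remaining audio is not transmitted and no further error is returned.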
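
Raw PCM output has no header, so most players cannot open it directly. As a rough sketch, the code below wraps 16-bit PCM samples in a minimal 44-byte WAV header. `PcmToWav` is a hypothetical helper, and it assumes the returned samples are little-endian and mono, which this document does not state; requesting `encode_type=wav` avoids the need for this step altogether.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any official SDK.
class PcmToWav {
    // Prepends a canonical 44-byte RIFF/WAVE header to 16-bit PCM data.
    static byte[] wrap(byte[] pcm, int sampleRate, int channels) {
        int bitsPerSample = 16;
        int blockAlign = channels * bitsPerSample / 8;
        int byteRate = sampleRate * blockAlign;
        ByteBuffer buf = ByteBuffer.allocate(44 + pcm.length).order(ByteOrder.LITTLE_ENDIAN);
        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + pcm.length);           // size of everything after this field
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                        // size of the fmt chunk
        buf.putShort((short) 1);               // audio format 1 = linear PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRate);
        buf.putInt(byteRate);
        buf.putShort((short) blockAlign);
        buf.putShort((short) bitsPerSample);
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(pcm.length);
        buf.put(pcm);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] silence = new byte[32000];      // one second of 16 kHz mono silence
        System.out.println(wrap(silence, 16000, 1).length);
    }
}
```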

If the speech synthesis request fails, the server returns a JSON document in the response body:

    {
        "request_id":"6262b55940044d95afc11d02ddcea377",
        "error_code":80103,
        "error_message":"authorization failed!"
    }
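
A client usually wants the error code and message as separate values. The sketch below pulls the three fields out of such a response with regular expressions so that it needs no third-party library. `TtsError` is a hypothetical class; the approach assumes the flat structure shown above and does not handle escaped quotes inside strings, so a proper JSON parser is the better choice in production code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for the fields of the error document; not part of any official SDK.
class TtsError {
    final String requestId;
    final int errorCode;
    final String errorMessage;

    TtsError(String requestId, int errorCode, String errorMessage) {
        this.requestId = requestId;
        this.errorCode = errorCode;
        this.errorMessage = errorMessage;
    }

    // Extracts a string-valued field; returns null when the field is absent.
    private static String stringField(String json, String name) {
        Matcher m = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"")
                           .matcher(json);
        return m.find() ? m.group(1) : null;
    }

    // Extracts an integer-valued field; returns -1 when the field is absent.
    private static int intField(String json, String name) {
        Matcher m = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*(-?\\d+)")
                           .matcher(json);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    static TtsError parse(String json) {
        return new TtsError(stringField(json, "request_id"),
                            intField(json, "error_code"),
                            stringField(json, "error_message"));
    }

    public static void main(String[] args) {
        String json = "{\"request_id\":\"6262b55940044d95afc11d02ddcea377\","
                + "\"error_code\":80103,\"error_message\":\"authorization failed!\"}";
        TtsError error = parse(json);
        System.out.println(error.errorCode + ": " + error.errorMessage);
    }
}
```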

4. Code examples

4.1 Request demo

    package com.alibaba.idst.nls;

    import java.io.IOException;
    import java.util.UUID;

    import com.alibaba.idst.nls.request.TtsRequest;
    import com.alibaba.idst.nls.response.HttpResponse;
    import com.alibaba.idst.nls.utils.HttpUtil;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class HttpTtsDemo {
        private static Logger logger = LoggerFactory.getLogger(HttpTtsDemo.class);
        private String url = "http://nlsapi.aliyun.com/speak?";
        private static String tts_text = "薄雾浓云愁永昼。瑞脑消金兽。佳节又重阳,玉枕纱厨,半夜凉初透。东篱把酒黄昏后。有暗香盈袖。莫道不消魂,帘卷西风,人比黄花瘦。";

        public static void main(String[] args) throws IOException {
            // Use the Access Key information obtained from https://ak-console.aliyun.com/
            // Activate the Intelligent Speech service beforehand (https://data.aliyun.com/product/nls)
            String ak_id = args[0];
            String ak_secret = args[1];
            // Set the TTS parameters; see section 2.1 for details
            HttpTtsDemo ttsDemo = new HttpTtsDemo();
            TtsRequest ttsRequest = new TtsRequest();
            ttsRequest.setEncodeType("wav");
            ttsRequest.setVoiceName("xiaoyun");
            ttsRequest.setVolume(50);
            ttsRequest.setSampleRate(16000);
            ttsRequest.setSpeechRate(0);
            ttsRequest.setPitchRate(0);
            ttsRequest.setTtsNus(1);
            ttsRequest.setBackgroundMusicId(0);
            ttsRequest.setBackgroundMusicOffset(0);
            ttsRequest.setBackgroundMusicVolume(100);
            String url = ttsDemo.url + "encode_type=" + ttsRequest.getEncodeType()
                    + "&voice_name=" + ttsRequest.getVoiceName()
                    + "&volume=" + ttsRequest.getVolume()
                    + "&sample_rate=" + ttsRequest.getSampleRate()
                    + "&speech_rate=" + ttsRequest.getSpeechRate()
                    + "&pitch_rate=" + ttsRequest.getPitchRate()
                    + "&tts_nus=" + ttsRequest.getTtsNus()
                    + "&background_music_id=" + ttsRequest.getBackgroundMusicId()
                    + "&background_music_offset=" + ttsRequest.getBackgroundMusicOffset()
                    + "&background_music_volume=" + ttsRequest.getBackgroundMusicVolume();
            logger.info("TTS request is: {}", url);
            String fileName = UUID.randomUUID().toString().replace("-", "");
            // The TTS demo writes the generated audio file to the project root directory
            HttpResponse response = HttpUtil.sendTtsPost(tts_text, ttsRequest.getEncodeType(), fileName, url, ak_id, ak_secret);
        }
    }

4.2 TTS parameter class

    package com.alibaba.idst.nls.request;

    public class TtsRequest {
        private String encode_type;
        private String voice_name;
        private int volume;
        private int sample_rate;
        private int speech_rate;
        private int pitch_rate;
        private int tts_nus;
        private int background_music_id;
        private int background_music_offset;
        private int background_music_volume;

        public String getEncodeType() {
            return encode_type;
        }

        public void setEncodeType(String encode_type) {
            this.encode_type = encode_type;
        }

        public String getVoiceName() {
            return voice_name;
        }

        public void setVoiceName(String voice_name) {
            this.voice_name = voice_name;
        }

        public int getVolume() {
            return volume;
        }

        public void setVolume(int volume) {
            this.volume = volume;
        }

        public int getSampleRate() {
            return sample_rate;
        }

        public void setSampleRate(int sample_rate) {
            this.sample_rate = sample_rate;
        }

        public int getSpeechRate() {
            return speech_rate;
        }

        public void setSpeechRate(int speech_rate) {
            this.speech_rate = speech_rate;
        }

        public int getPitchRate() {
            return pitch_rate;
        }

        public void setPitchRate(int pitch_rate) {
            this.pitch_rate = pitch_rate;
        }

        public int getTtsNus() {
            return tts_nus;
        }

        public void setTtsNus(int tts_nus) {
            this.tts_nus = tts_nus;
        }

        public int getBackgroundMusicId() {
            return background_music_id;
        }

        public void setBackgroundMusicId(int background_music_id) {
            this.background_music_id = background_music_id;
        }

        public int getBackgroundMusicOffset() {
            return background_music_offset;
        }

        public void setBackgroundMusicOffset(int background_music_offset) {
            this.background_music_offset = background_music_offset;
        }

        public int getBackgroundMusicVolume() {
            return background_music_volume;
        }

        public void setBackgroundMusicVolume(int background_music_volume) {
            this.background_music_volume = background_music_volume;
        }
    }

4.3 HttpUtil class for calling the service

    package com.alibaba.idst.nls.utils;

    import java.io.*;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.security.MessageDigest;
    import java.text.SimpleDateFormat;
    import java.util.*;
    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;
    import com.alibaba.idst.nls.response.HttpResponse;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class HttpUtil {
        static Logger logger = LoggerFactory.getLogger(HttpUtil.class);

        /*
         * Compute MD5 + Base64
         */
        public static String MD5Base64(byte[] s) throws UnsupportedEncodingException {
            if (s == null) {
                return null;
            }
            String encodeStr = "";
            // For text bodies, the input must be the UTF-8 bytes of the string
            MessageDigest mdTemp;
            try {
                mdTemp = MessageDigest.getInstance("MD5");
                mdTemp.update(s);
                byte[] md5Bytes = mdTemp.digest();
                // java.util.Base64 (Java 8+) replaces the internal sun.misc.BASE64Encoder
                encodeStr = Base64.getEncoder().encodeToString(md5Bytes);
            } catch (Exception e) {
                throw new Error("Failed to generate MD5 : " + e.getMessage());
            }
            return encodeStr;
        }

        /*
         * Compute HMAC-SHA1
         */
        public static String HMACSha1(String data, String key) {
            String result;
            try {
                // Use an explicit charset instead of the platform default
                SecretKeySpec signingKey = new SecretKeySpec(key.getBytes("UTF-8"), "HmacSHA1");
                Mac mac = Mac.getInstance("HmacSHA1");
                mac.init(signingKey);
                byte[] rawHmac = mac.doFinal(data.getBytes("UTF-8"));
                // java.util.Base64 (Java 8+) replaces the internal sun.misc.BASE64Encoder
                result = Base64.getEncoder().encodeToString(rawHmac);
            } catch (Exception e) {
                throw new Error("Failed to generate HMAC : " + e.getMessage());
            }
            return result;
        }

        /*
         * Equivalent to new Date().toUTCString() in JavaScript
         */
        public static String toGMTString(Date date) {
            // Locale.US keeps the month abbreviations stable ("Sep"); newer JDKs render
            // September as "Sept" for Locale.UK, which is not a valid HTTP date
            SimpleDateFormat df = new SimpleDateFormat("E, dd MMM yyyy HH:mm:ss z", Locale.US);
            df.setTimeZone(new java.util.SimpleTimeZone(0, "GMT"));
            return df.format(date);
        }

        /*
         * Send a POST request (speech recognition)
         */
        public static HttpResponse sendAsrPost(byte[] audioData, String audioFormat, int sampleRate, String url,
                String ak_id, String ak_secret) {
            BufferedReader in = null;
            String result = "";
            HttpResponse response = new HttpResponse();
            try {
                URL realUrl = new URL(url);
                /*
                 * HTTP header parameters
                 */
                String method = "POST";
                String accept = "application/json";
                String content_type = "audio/" + audioFormat + ";samplerate=" + sampleRate;
                int length = audioData.length;
                String date = toGMTString(new Date());
                // 1. MD5 + Base64 of the body
                String bodyMd5 = MD5Base64(audioData);
                String md52 = MD5Base64(bodyMd5.getBytes());
                String stringToSign = method + "\n" + accept + "\n" + md52 + "\n" + content_type + "\n" + date;
                // 2. Compute HMAC-SHA1
                String signature = HMACSha1(stringToSign, ak_secret);
                // 3. Build the Authorization header
                String authHeader = "Dataplus " + ak_id + ":" + signature;
                // Open the connection to the URL
                HttpURLConnection conn = (HttpURLConnection) realUrl.openConnection();
                // Set the common request properties
                conn.setRequestProperty("accept", accept);
                conn.setRequestProperty("content-type", content_type);
                conn.setRequestProperty("date", date);
                conn.setRequestProperty("Authorization", authHeader);
                conn.setRequestProperty("Content-Length", String.valueOf(length));
                // The following two lines are required to send a POST request
                conn.setDoOutput(true);
                conn.setDoInput(true);
                // Get the output stream of the connection
                OutputStream stream = conn.getOutputStream();
                // Send the request body
                stream.write(audioData);
                // Flush the output stream buffer
                stream.flush();
                stream.close();
                response.setStatus(conn.getResponseCode());
                // Use a BufferedReader to read the response from the URL
                if (response.getStatus() == 200) {
                    in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
                } else {
                    in = new BufferedReader(new InputStreamReader(conn.getErrorStream()));
                }
                String line;
                while ((line = in.readLine()) != null) {
                    result += line;
                }
                if (response.getStatus() == 200) {
                    response.setResult(result);
                    response.setMessage("OK");
                } else {
                    response.setMessage(result);
                }
                System.out.println("post response status code: [" + response.getStatus() + "], response message : ["
                        + response.getMessage() + "] ,result :[" + response.getResult() + "]");
            } catch (Exception e) {
                System.out.println("Exception while sending the POST request! " + e);
                e.printStackTrace();
            }
            // Close the input stream in the finally block
            finally {
                try {
                    if (in != null) {
                        in.close();
                    }
                } catch (IOException ex) {
                    ex.printStackTrace();
                }
            }
            return response;
        }

        /*
         * Send a POST request (speech synthesis)
         */
        public static HttpResponse sendTtsPost(String textData, String audioType, String audioName, String url,
                String ak_id, String ak_secret) {
            BufferedReader in = null;
            FileOutputStream fileOutputStream = null;
            String result = "";
            HttpResponse response = new HttpResponse();
            try {
                URL realUrl = new URL(url);
                /*
                 * HTTP header parameters
                 */
                String method = "POST";
                String content_type = "text/plain";
                String accept = "audio/" + audioType + ",application/json";
                // Content-Length is the byte length of the UTF-8 body, not the character count
                byte[] body = textData.getBytes("UTF-8");
                int length = body.length;
                String date = toGMTString(new Date());
                // 1. MD5 + Base64 of the body
                String bodyMd5 = MD5Base64(body);
                String stringToSign = method + "\n" + accept + "\n" + bodyMd5 + "\n" + content_type + "\n" + date;
                // 2. Compute HMAC-SHA1
                String signature = HMACSha1(stringToSign, ak_secret);
                // 3. Build the Authorization header
                String authHeader = "Dataplus " + ak_id + ":" + signature;
                // Open the connection to the URL
                HttpURLConnection conn = (HttpURLConnection) realUrl.openConnection();
                // Set the common request properties
                conn.setRequestProperty("accept", accept);
                conn.setRequestProperty("content-type", content_type);
                conn.setRequestProperty("date", date);
                conn.setRequestProperty("Authorization", authHeader);
                conn.setRequestProperty("Content-Length", String.valueOf(length));
                // The following two lines are required to send a POST request
                conn.setDoOutput(true);
                conn.setDoInput(true);
                // Get the output stream of the connection
                OutputStream stream = conn.getOutputStream();
                // Send the request body
                stream.write(body);
                // Flush the output stream buffer
                stream.flush();
                stream.close();
                response.setStatus(conn.getResponseCode());
                // Read the response: the audio stream on success, the error text otherwise
                InputStream is = null;
                String line = null;
                if (response.getStatus() == 200) {
                    is = conn.getInputStream();
                } else {
                    in = new BufferedReader(new InputStreamReader(conn.getErrorStream()));
                    while ((line = in.readLine()) != null) {
                        result += line;
                    }
                }
                File ttsFile = new File(audioName + "." + audioType);
                fileOutputStream = new FileOutputStream(ttsFile);
                byte[] b = new byte[1024];
                int len = 0;
                while (is != null && (len = is.read(b)) != -1) { // read into the buffer, then write to the file
                    fileOutputStream.write(b, 0, len);
                }
                if (response.getStatus() == 200) {
                    response.setResult(result);
                    response.setMessage("OK");
                    System.out.println("post response status code: [" + response.getStatus()
                            + "], generate tts audio file :" + audioName + "." + audioType);
                } else {
                    response.setMessage(result);
                    System.out.println("post response status code: [" + response.getStatus() + "], response message : ["
                            + response.getMessage() + "]");
                }
            } catch (Exception e) {
                System.out.println("Exception while sending the POST request! " + e);
                e.printStackTrace();
            }
            // Close the output and input streams in the finally block
            finally {
                try {
                    if (fileOutputStream != null) {
                        fileOutputStream.close();
                    }
                    if (in != null) {
                        in.close();
                    }
                } catch (IOException ex) {
                    ex.printStackTrace();
                }
            }
            return response;
        }
    }

4.4 Response class

    package com.alibaba.idst.nls.response;

    public class HttpResponse {
        private int status;
        private String result;
        private String message;

        public int getStatus() {
            return status;
        }

        public void setStatus(int status) {
            this.status = status;
        }

        public String getResult() {
            return result;
        }

        public void setResult(String result) {
            this.result = result;
        }

        public String getMessage() {
            return message;
        }

        public void setMessage(String message) {
            this.message = message;
        }
    }