
User-Defined Hot Words

Last updated: 2017-08-07 10:59:39

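Note: The API definitions in this document omit the common fields defined in the REST interface calling conventions. The curl commands in the examples only illustrate the request parameters; real requests must carry Dataplus (数加) authentication information.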

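1. Overview

When you use the Intelligent Speech Interaction service, certain keywords in specific domains may be recognized inaccurately. In that case, you can submit custom hot words to improve recognition accuracy.

The Alibaba Cloud intelligent speech recognition service provides a RESTful interface for uploading custom hot words.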

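2. Create a Vocabulary

Note: Each user can create at most 100 custom vocabularies.

Endpoint

POST /asr/custom/vocabs

Request

Parameter    Parameter type    Value type    Description
body         Body              Object        Vocabulary object

Request: Vocabulary object

Parameter        Type        Description
global_weight    Integer     Global weight for the entire vocabulary; default is 1
words            String[]    Words to customize, at most 128; at least one of words and word_weights must be present
word_weights     Map         Sets a different weight for each word; at least one of words and word_weights must be present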
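
To illustrate the constraints in the table above, here is a minimal sketch of a helper that assembles a Vocabulary request body and checks it before it is sent. The class name `VocabularyBody` and its methods are hypothetical and not part of any Alibaba Cloud SDK. The sketch assumes the 128-word limit applies to `words` only, and it writes JSON by hand to stay free of dependencies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of any SDK): builds the JSON body for a Vocabulary object.
class VocabularyBody {
    static final int MAX_WORDS = 128; // limit stated for `words` in the table above

    private Integer globalWeight; // left out of the JSON when null
    private final List<String> words = new ArrayList<>();
    private final Map<String, Integer> wordWeights = new LinkedHashMap<>();

    VocabularyBody globalWeight(int weight) { this.globalWeight = weight; return this; }
    VocabularyBody word(String word) { words.add(word); return this; }
    VocabularyBody wordWeight(String word, int weight) { wordWeights.put(word, weight); return this; }

    String toJson() {
        // At least one of words / word_weights must be present.
        if (words.isEmpty() && wordWeights.isEmpty()) {
            throw new IllegalStateException("at least one of words or word_weights is required");
        }
        // Assumption: the 128 limit is checked against `words` only.
        if (words.size() > MAX_WORDS) {
            throw new IllegalStateException("words allows at most " + MAX_WORDS + " entries");
        }
        List<String> fields = new ArrayList<>();
        if (globalWeight != null) {
            fields.add("\"global_weight\":" + globalWeight);
        }
        if (!words.isEmpty()) {
            List<String> quoted = new ArrayList<>();
            for (String w : words) {
                quoted.add(quote(w));
            }
            fields.add("\"words\":[" + String.join(",", quoted) + "]");
        }
        if (!wordWeights.isEmpty()) {
            List<String> pairs = new ArrayList<>();
            for (Map.Entry<String, Integer> e : wordWeights.entrySet()) {
                pairs.add(quote(e.getKey()) + ":" + e.getValue());
            }
            fields.add("\"word_weights\":{" + String.join(",", pairs) + "}");
        }
        return "{" + String.join(",", fields) + "}";
    }

    // Minimal JSON string escaping: backslash, double quote, and control characters.
    private static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```

For example, `new VocabularyBody().word("apple").wordWeight("melon", 2).toJson()` produces a body that carries both `words` and `word_weights`, while calling `toJson()` on an empty builder is rejected locally instead of being sent to the service.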

Response

Parameter        Type      Description
vocabulary_id    String    ID of the created custom vocabulary
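
The returned `vocabulary_id` is needed by every later query, update, and delete call, so it usually has to be pulled out of the response text. Below is a minimal sketch that does this with a regular expression, assuming the response is a flat JSON object such as `{"request_id":"...","vocabulary_id":"..."}` (the shape shown in the comments of the Java sample on this page). `VocabularyIdExtractor` is a hypothetical name; for anything beyond a quick check, a real JSON parser is the safer choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the vocabulary_id string out of a create response.
class VocabularyIdExtractor {
    // Matches "vocabulary_id" : "<value>", tolerating whitespace around the colon.
    private static final Pattern ID_FIELD =
            Pattern.compile("\"vocabulary_id\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the ID, or null when the field is absent (for example in an error response).
    static String extract(String responseJson) {
        if (responseJson == null) {
            return null;
        }
        Matcher m = ID_FIELD.matcher(responseJson);
        return m.find() ? m.group(1) : null;
    }
}
```

Returning `null` rather than throwing lets the caller decide how to treat a response that carries no ID.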

Example: request

    curl -X POST --data "{\"words\":[\"苹果\",\"西瓜\",\"桃子\"]}" \
    "https://nlsapi.aliyun.com/asr/custom/vocabs"

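Specifying a global weight: use this when the custom vocabulary as a whole is too weak (custom words that should be recognized are missed) or too strong (other words are misrecognized as custom words).

    {
      "global_weight": 2,
      "words": [
        "苹果",
        "西瓜",
        "桃子"
      ]
    }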

Specifying weights for particular words: use this when the vocabulary as a whole works well but individual words are too weak or too strong.

    {
      "words": [
        "苹果",
        "西瓜",
        "桃子",
        "香瓜"
      ],
      "word_weights": {
        "香瓜": 2
      }
    }

Using a different weight for each word: with enough experience, you can set a separate weight for every word.

    {
      "word_weights": {
        "苹果": 1,
        "西瓜": 1,
        "桃子": 2,
        "香瓜": 2
      }
    }

3. Query a Vocabulary

Endpoint

GET /asr/custom/vocabs/{vocabulary_id}

Request

Parameter        Parameter type    Value type    Description
vocabulary_id    Path              String        Custom vocabulary ID

Response

Vocabulary object

Example: request

    curl -X GET "https://nlsapi.aliyun.com/asr/custom/vocabs/be053bf9af0e406dafa8249631372d53"

Example: response

    {
      "global_weight": 2,
      "words": [
        "苹果",
        "西瓜",
        "桃子"
      ]
    }

4. Update a Vocabulary

Endpoint

PUT /asr/custom/vocabs/{vocabulary_id}

Request

Parameter        Parameter type    Value type    Description
vocabulary_id    Path              String        Custom vocabulary ID

Response

Example: request

    curl -X PUT --data "{\"words\":[\"苹果\",\"西瓜\",\"桃子\"]}" \
    "https://nlsapi.aliyun.com/asr/custom/vocabs/be053bf9af0e406dafa8249631372d53"

5. Delete a Vocabulary

Endpoint

DELETE /asr/custom/vocabs/{vocabulary_id}

Request

Parameter        Parameter type    Value type    Description
vocabulary_id    Path              String        Custom vocabulary ID

Response

Example: request

    curl -X DELETE "https://nlsapi.aliyun.com/asr/custom/vocabs/be053bf9af0e406dafa8249631372d53"
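
The four operations described so far differ only in HTTP method and in whether the path ends with a vocabulary ID. A minimal sketch of that mapping follows; the enum `VocabOperation` and its members are hypothetical names introduced for illustration, while the methods and paths are the ones listed in the endpoint definitions above.

```java
// Hypothetical summary of the four vocabulary operations and their request paths.
enum VocabOperation {
    CREATE("POST", false),
    QUERY("GET", true),
    UPDATE("PUT", true),
    DELETE("DELETE", true);

    static final String BASE_PATH = "/asr/custom/vocabs";

    final String httpMethod;
    final boolean needsId;

    VocabOperation(String httpMethod, boolean needsId) {
        this.httpMethod = httpMethod;
        this.needsId = needsId;
    }

    // Builds the request path; the ID is ignored for CREATE and required otherwise.
    String path(String vocabularyId) {
        if (!needsId) {
            return BASE_PATH;
        }
        if (vocabularyId == null || vocabularyId.isEmpty()) {
            throw new IllegalArgumentException(name() + " requires a vocabulary_id");
        }
        return BASE_PATH + "/" + vocabularyId;
    }
}
```

Keeping the mapping in one place makes it harder to issue, say, a DELETE without an ID by accident.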

6. Java Sample Code

    package com.alibaba.idst.nls.utils;

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.io.UnsupportedEncodingException;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.text.SimpleDateFormat;
    import java.util.Base64;
    import java.util.Date;
    import java.util.Locale;
    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;
    import com.alibaba.fastjson.JSONPath;

    /**
     * Created by zhishen on 2017/7/10.
     */
    public class HttpUtil {

        /*
         * Compute MD5, then Base64-encode the digest.
         */
        public static String MD5Base64(String s) throws UnsupportedEncodingException {
            if (s == null) {
                return null;
            }
            // The string must be encoded as UTF-8
            byte[] utfBytes = s.getBytes("UTF-8");
            try {
                MessageDigest md5 = MessageDigest.getInstance("MD5");
                return Base64.getEncoder().encodeToString(md5.digest(utfBytes));
            } catch (Exception e) {
                throw new Error("Failed to generate MD5 : " + e.getMessage());
            }
        }

        /*
         * Compute HMAC-SHA1, then Base64-encode the result.
         */
        public static String HMACSha1(String data, String key) {
            try {
                SecretKeySpec signingKey =
                        new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), "HmacSHA1");
                Mac mac = Mac.getInstance("HmacSHA1");
                mac.init(signingKey);
                byte[] rawHmac = mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
                return Base64.getEncoder().encodeToString(rawHmac);
            } catch (Exception e) {
                throw new Error("Failed to generate HMAC : " + e.getMessage());
            }
        }

        /*
         * Equivalent to new Date().toUTCString() in JavaScript.
         */
        public static String toGMTString(Date date) {
            // Locale.US keeps English day and month abbreviations stable across JDK versions
            SimpleDateFormat df = new SimpleDateFormat("E, dd MMM yyyy HH:mm:ss z", Locale.US);
            df.setTimeZone(new java.util.SimpleTimeZone(0, "GMT"));
            return df.format(date);
        }

        /*
         * Shared request logic. When ak_secret is null the request is sent unsigned;
         * when body is null no request body is written and the MD5 field of the
         * string to sign is left empty.
         */
        private static String send(String method, String url, String body,
                                   String ak_id, String ak_secret) {
            HttpURLConnection conn = null;
            OutputStream out = null;
            InputStream in = null;
            String result = "";
            try {
                URL realUrl = new URL(url);
                // HTTP header parameters
                String accept = "application/json";
                String content_type = "application/json";
                String date = toGMTString(new Date());
                // Open the connection to the URL
                conn = (HttpURLConnection) realUrl.openConnection();
                // Set the common request properties
                conn.setRequestMethod(method);
                conn.setRequestProperty("accept", accept);
                conn.setRequestProperty("content-type", content_type);
                conn.setRequestProperty("Accept-Charset", "utf-8");
                conn.setRequestProperty("date", date);
                if (ak_secret != null) {
                    // 1. MD5 + Base64 of the body (empty string when there is no body)
                    String bodyMd5 = (body == null) ? "" : MD5Base64(body);
                    String stringToSign = method + "\n" + accept + "\n" + bodyMd5 + "\n"
                            + content_type + "\n" + date;
                    // 2. Compute HMAC-SHA1
                    String signature = HMACSha1(stringToSign, ak_secret);
                    // 3. Build the Authorization header
                    conn.setRequestProperty("Authorization", "Dataplus " + ak_id + ":" + signature);
                }
                if (body != null) {
                    // Required in order to send a request body
                    conn.setDoOutput(true);
                    out = conn.getOutputStream();
                    // Write the body as UTF-8 so it matches the bytes used for the MD5
                    out.write(body.getBytes("UTF-8"));
                    out.flush();
                }
                int rc = conn.getResponseCode();
                in = (rc == 200) ? conn.getInputStream() : conn.getErrorStream();
                result = changeInputStream(in, "UTF-8");
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                // Close the output and input streams, then release the connection
                try {
                    if (out != null) {
                        out.close();
                    }
                    if (in != null) {
                        in.close();
                    }
                } catch (IOException ex) {
                    ex.printStackTrace();
                }
                if (conn != null) {
                    conn.disconnect();
                }
            }
            return result;
        }

        /*
         * Send a signed POST request.
         */
        public static String sendPost(String url, String body, String ak_id, String ak_secret) {
            return send("POST", url, body, ak_id, ak_secret);
        }

        /*
         * Send a signed PUT request.
         */
        public static String sendPut(String url, String body, String ak_id, String ak_secret) {
            return send("PUT", url, body, ak_id, ak_secret);
        }

        /*
         * Send a signed DELETE request.
         */
        public static String sendDelete(String url, String ak_id, String ak_secret) {
            return send("DELETE", url, null, ak_id, ak_secret);
        }

        /*
         * Send a signed GET request.
         */
        public static String sendGet(String url, String ak_id, String ak_secret) {
            return send("GET", url, null, ak_id, ak_secret);
        }

        /*
         * Send a signed GET request for url + "/" + task_id.
         */
        public static String sendGet(String url, String task_id, String ak_id, String ak_secret) {
            return send("GET", url + "/" + task_id, null, ak_id, ak_secret);
        }

        /*
         * Send a plain POST request without an Authorization header.
         */
        public static String sendNomalPost(String url, String body) {
            return send("POST", url, body, null, null);
        }

        /*
         * Read an input stream fully and decode it with the given charset.
         */
        public static String changeInputStream(InputStream inputStream, String encode) {
            // ByteArrayOutputStream acts as an in-memory buffer
            ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
            byte[] data = new byte[1024];
            int len;
            String result = "";
            if (inputStream != null) {
                try {
                    while ((len = inputStream.read(data)) != -1) {
                        byteArrayOutputStream.write(data, 0, len);
                    }
                    result = new String(byteArrayOutputStream.toByteArray(), encode);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            return result;
        }

        public static void main(String[] args) {
            String akId = "your-access-key-id";
            String akSecret = "your-access-key-secret";
            String url = "https://nlsapi.aliyun.com/asr/custom/vocabs";
            String body = "{\n"
                    + " \"global_weight\": 1,\n"
                    + " \"words\": [\n"
                    + " \"猕猴桃\",\n"
                    + " \"橘子\",\n"
                    + " \"石榴\"\n"
                    + " ],\n"
                    + " \"word_weights\": {\n"
                    + " \"橘子\": 2\n"
                    + " }\n"
                    + " }";
            String body2 = "{\n"
                    + " \"global_weight\": 3,\n"
                    + " \"words\": [\n"
                    + " \"猕猴桃\",\n"
                    + " \"橘子\",\n"
                    + " \"葡萄\",\n"
                    + " \"石榴\"\n"
                    + " ],\n"
                    + " \"word_weights\": {\n"
                    + " \"橘子\": 2\n"
                    + " }\n"
                    + " }";
            // create
            String result = HttpUtil.sendPost(url, body, akId, akSecret);
            System.out.println("create result:" + result); // create result:{"request_id":"***","vocabulary_id":"###"}
            String vocabId = (String) JSONPath.read(result, "vocabulary_id");
            // update
            result = HttpUtil.sendPut(url + "/" + vocabId, body2, akId, akSecret);
            System.out.println("update result:" + result); // update result:{"request_id":"***"}
            // get
            result = HttpUtil.sendGet(url + "/" + vocabId, akId, akSecret);
            System.out.println("get result:" + result); // get result:{"request_id":"***","global_weight":3,"words":["猕猴桃","橘子","葡萄","石榴"],"word_weights":{"橘子":2}}
            // delete
            result = HttpUtil.sendDelete(url + "/" + vocabId, akId, akSecret);
            System.out.println("delete result:" + result); // delete result:{"request_id":"***"}
        }
    }

Appendix: Using a Vocabulary

    NlsRequest req = new NlsRequest();
    ...
    // To use hot words, just specify the hot word vocabulary ID
    req.setAsrVocabularyId("your-vocabulary-id");
    ...