Paraformer语音识别热词定制与管理
支持的领域 / 任务:audio(音频) / asr(语音识别)
在语音识别服务中,如果您的业务领域有部分词汇默认识别效果不够好,可以考虑使用热词功能,将这些词添加到词表从而改善识别结果。
前提条件
已开通服务并获得API-KEY:API-KEY的获取与配置。
已安装最新版SDK:安装DashScope SDK。
热词
热词通过热词列表的形式在SDK中使用,热词列表是一个以热词文本为Key,热词权重为Value的字典。热词列表最大支持设置500个热词,热词文本规则如下:纯中文热词不超过10个汉字,纯英文或者中英文混合热词,按空格分词后,不超过5个词;对于热词权重规则如下:有效的热词权重取值范围为[1, 5]和[-6, -1]区间内的整数值。如果想提高某个热词的识别概率,则可以设置[1, 5]范围内的权重,权重越大概率越高;如果想降低某个热词的识别概率,则可以设置[-6, -1]范围内的权重,权重越小概率越低。
热词管理
在Java和Python中使用AsrPhraseManager类来管理热词的创建,更新,删除,查询等功能。
导入
from dashscope.audio.asr import AsrPhraseManager
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
创建热词
同步调用的形式提交一个创建热词请求。
接口
def create_phrases(cls,
model: str,
phrases: Dict[str, Any],
training_type: str = 'compile_asr_phrase',
**kwargs)
AsrPhraseStatusResult CreatePhrases(AsrPhraseParam param)
throws ApiException, NoApiKeyException, InputRequiredException
参数说明
对于Java SDK,将使用一个AsrPhraseParam对象作为参数,其方法和参数如下:
参数 | 类型 | 说明 |
param | AsrPhraseParam | 创建热词的配置参数,见上文关于AsrPhraseParam类型的描述,CreatePhrases调用不需要填写pageNo和pageSize字段,但是要求添加model,phraseList字段。 |
对于Python SDK,其参数说明如下:
参数 | 类型 | 说明 |
model | str | 指定的Paraformer模型名,关于如何进行模型选择,请参考:模型概览。 |
phrases | Dict[str, Any] | 热词列表。 |
training_type | str | 固定为compile_asr_phrase。 |
模型概览
模型名 | 模型简介 |
paraformer-realtime-v1 | Paraformer中文实时语音识别模型,支持视频直播、会议等实时场景下的语音识别。仅支持16kHz采样率的音频。 |
paraformer-realtime-8k-v1 | Paraformer中文实时语音识别模型,支持8kHz电话客服等场景下的实时语音识别。 |
Paraformer中英文语音识别模型,支持16kHz及以上采样率的音频或视频语音识别。 | |
paraformer-8k-v1 | Paraformer中文语音识别模型,支持8kHz电话语音识别。 |
paraformer-mtl-v1 | Paraformer多语言语音识别模型,支持16kHz及以上采样率的音频或视频语音识别。 支持的语种/方言包括:中文普通话、中文方言(粤语、吴语、闽南语、东北话、甘肃话、贵州话、河南话、湖北话、湖南话、宁夏话、山西话、陕西话、山东话、四川话、天津话)、英语、日语、韩语、西班牙语、印尼语、法语、德语、意大利语、马来语。 |
返回示例
对于Java SDK,将返回一个AsrPhraseStatusResult对象,对于Python SDK,将返回一个Dict,AsrPhraseStatusResult成员通过对应get方法获取,成员名称和Python SDK基本一致,仅命名方式不同(Java为驼峰式)。
{
"status_code": 200,
"request_id": "2b815cfe-793f-9f3c-b528-5ade0a2d498e",
"code": null,
"message": "",
"output": {
"job_id": "ft-202309261539-2af1",
"status": "SUCCEEDED",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"training_type": "compile_asr_phrase",
"create_time": "2023-09-26 15:39:07"
},
"usage": null,
"job_id": "ft-202309261539-2af1",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"finetuned_outputs": null,
"training_type": null,
"create_time": "2023-09-26 15:39:07",
"output_type": null,
"model": null
}
调用范例
# coding=utf-8
import dashscope
from dashscope.audio.asr import AsrPhraseManager
dashscope.api_key='your-dashscope-api-key'
phrases = {'通义千问': 5}
result = AsrPhraseManager.create_phrases(model='paraformer-realtime-v1',
phrases=phrases)
if result.output is not None and result.output['finetuned_output'] is not None:
print('job_id:%s, finetuned_output:%s' %
(result.output['job_id'], result.output['finetuned_output']))
else:
print('Error: ', str(result))
package com.alibaba.test;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseParam;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseStatusResult;
import com.alibaba.dashscope.audio.asr.recognition.Recognition;
import com.alibaba.dashscope.audio.asr.recognition.RecognitionParam;
import com.alibaba.dashscope.utils.Constants;
import java.util.Collections;
class Test {
public static void main(String[] args) {
AsrPhraseParam param = AsrPhraseParam.builder()
.model("your-model")
.phraseList(Collections.singletonMap("通义千问", 5))
.apiKey("your-dashscope-api-key")
.build();
AsrPhraseStatusResult createResult = null;
try {
createResult = AsrPhraseManager.CreatePhrases(param);
if (createResult.getOutput() != null && createResult.getOutput().getFineTunedOutput() != null) {
System.out.println("job_id: " + createResult.getOutput().getJobId() + ", finetuned_output: " + createResult.getOutput().getFineTunedOutput());
} else {
System.out.println("Error: " + createResult);
}
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
查询热词
该接口将以同步调用的形式提交一个查询热词请求。
接口
def query_phrases(cls, phrase_id: str, **kwargs)
AsrPhraseStatusResult QueryPhrase(AsrPhraseParam param, String phraseId)
throws ApiException, NoApiKeyException, InputRequiredException
参数配置
对于Java SDK,将使用一个AsrPhraseParam对象作为参数,其方法和参数如下:
参数 | 类型 | 说明 |
param | AsrPhraseParam | 创建热词的配置参数,见上文关于AsrPhraseParam类型的描述,对于QueryPhrase调用,不需要填phraseLIst及pageNo和pageSize字段。 |
phraseId | String | 调用CreatePhrases,UpdatePhrases, QueryPhrase等接口返回的AsrPhraseStatusOutput对象后,通过该对象的getFineTunedOutput返回的热词ID,String类型。调用ListPhrases时,使用AsrPhraseStatusOutput对象的getFinetunedOutputs接口将返回所有热词信息的列表,然后使用AsrPhraseInfo的getFineTunedOutput即可获取对应热词的热词ID。 |
对于Python SDK,参数说明如下:
参数 | 类型 | 说明 |
phrase_id | str | 调用create_phrases,update_phrases, query_phrases等接口返回的Dict对象,通过finetuned_output访问对应phrase_id |
返回示例
{
"status_code": 200,
"request_id": "19ee9c5f-173b-9fed-8e61-40bc53f1eea7",
"code": null,
"message": "",
"output": {
"create_time": "2023-09-26 15:39:08",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"job_id": "ft-202309261539-2af1",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
},
"usage": null,
"job_id": "ft-202309261539-2af1",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"finetuned_outputs": null,
"training_type": null,
"create_time": "2023-09-26 15:39:08",
"output_type": "custom_resource",
"model": "paraformer-realtime-v1"
}
调用示例
# coding=utf-8
import dashscope
from dashscope.audio.asr import AsrPhraseManager
dashscope.api_key='your-dashscope-api-key'
result = AsrPhraseManager.query_phrases(phrase_id='phrase-id')
if result.output is not None:
print('query phrases: ', result.output)
else:
print('Error: ', str(result))
package com.alibaba.test;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseParam;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseStatusResult;
import com.alibaba.dashscope.audio.asr.recognition.Recognition;
import com.alibaba.dashscope.audio.asr.recognition.RecognitionParam;
import com.alibaba.dashscope.utils.Constants;
class Test {
public static void main(String[] args) {
AsrPhraseParam param = AsrPhraseParam.builder()
.model("your-model")
.apiKey("your-dashscope-api-key")
.build();
AsrPhraseStatusResult result = null;
try {
result = AsrPhraseManager.QueryPhrase(param, "phrase-id");
if (result.getOutput() != null) {
System.out.println("query phrases: " + result.getOutput());
} else {
System.out.println("Error: " + result);
}
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
更新热词
该接口将以同步调用的形式提交一个更新热词请求。
接口
def update_phrases(cls,
model: str,
phrase_id: str,
phrases: Dict[str, Any],
training_type: str = 'compile_asr_phrase',
**kwargs)
AsrPhraseStatusResult UpdatePhrases(AsrPhraseParam param, String phraseId)
throws ApiException, NoApiKeyException, InputRequiredException
参数配置
对于Java SDK,将使用一个AsrPhraseParam对象作为参数,其方法和参数如下:
参数 | 类型 | 说明 |
param | AsrPhraseParam | 创建热词的配置参数,见上文关于AsrPhraseParam类型的描述,对于UpdatePhrase调用,不需要填pageNo和pageSize字段。 |
phraseId | String | 调用CreatePhrases,UpdatePhrases, QueryPhrase等接口返回的AsrPhraseStatusOutput对象后,通过该对象的getFineTunedOutput返回的热词ID,String类型。调用ListPhrases时,使用AsrPhraseStatusOutput对象的getFinetunedOutputs接口将返回所有热词信息的列表,然后使用AsrPhraseInfo的getFineTunedOutput即可获取对应热词的热词ID。 |
对于Python SDK,其参数如下:
参数 | 类型 | 默认值 | 说明 |
model | str | - | 指定用于音视频文件转写的Paraformer模型名,关于如何进行模型选择,请参考:模型概览。 |
phrase_id | str | - | 调用create_phrases,update_phrases, query_phrases等接口返回的Dict对象,通过finetuned_output访问对应phrase_id。 |
phrases | Dict[str, Any] | - | 热词列表,是一个Dict类型对象,其中键为热词文本,值为热词对应权重。对于热词要求请参考下方重要一栏。 |
training_type | str | compile_asr_phrase | 固定为compile_asr_phrase |
返回示例
{
"status_code": 200,
"request_id": "8c8d64e3-5198-9624-99cd-e9dcf7eb22f6",
"code": null,
"message": "",
"output": {
"job_id": "ft-202309261543-b0ae",
"status": "SUCCEEDED",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"training_type": "compile_asr_phrase",
"create_time": "2023-09-26 15:43:09"
},
"usage": null,
"job_id": "ft-202309261543-b0ae",
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"finetuned_outputs": null,
"training_type": null,
"create_time": "2023-09-26 15:43:09",
"output_type": null,
"model": null
}
调用示例
# coding=utf-8
import dashscope
from dashscope.audio.asr import AsrPhraseManager
dashscope.api_key='your-dashscope-api-key'
phrases = {'通义千问': 2}
result = AsrPhraseManager.update_phrases(model='paraformer-realtime-v1',
phrase_id='phrase-id',
phrases=phrases)
if result.output is not None:
print('update phrases: ', result.output)
else:
print('Error: ', str(result))
package com.alibaba.test;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseParam;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseStatusResult;
import com.alibaba.dashscope.audio.asr.recognition.Recognition;
import com.alibaba.dashscope.audio.asr.recognition.RecognitionParam;
import com.alibaba.dashscope.utils.Constants;
import import java.util.Collections;
class Test {
public static void main(String[] args) {
AsrPhraseParam param = AsrPhraseParam.builder()
.model("your-model")
.phraseList(Collections.singletonMap("通义千问", 2))
.apiKey("your-dashscope-api-key")
.build();
AsrPhraseStatusResult result = null;
try {
result = AsrPhraseManager.UpdatePhrases(param, "phrase-id");
if (result.getOutput() != null) {
System.out.println("update phrases: " + result.getOutput());
} else {
System.out.println("err: " + result);
}
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
删除热词
该接口将以同步调用的形式提交一个热词删除请求。
接口
def delete_phrases(cls, phrase_id: str,
**kwargs)
AsrPhraseStatusResult DeletePhrase(AsrPhraseParam param, String phraseId)
throws ApiException, NoApiKeyException, InputRequiredException
参数配置
对于Java SDK,将使用一个AsrPhraseParam对象作为参数,其方法和参数如下:
参数 | 类型 | 说明 |
param | AsrPhraseParam | 创建热词的配置参数,见上文关于AsrPhraseParam类型的描述,对于DeletePhrase调用,不需要填phraseList, pageNo和pageSize字段。 |
phraseId | String | 调用CreatePhrases,UpdatePhrases, QueryPhrase等接口返回的AsrPhraseStatusOutput对象后,通过该对象的getFineTunedOutput返回的热词ID,String类型。调用ListPhrases时,使用AsrPhraseStatusOutput对象的getFinetunedOutputs接口将返回所有热词信息的列表,然后使用AsrPhraseInfo的getFineTunedOutput即可获取对应热词的热词ID。 |
对于Python SDK,其参数如下:
参数 | 类型 | 默认值 | 说明 |
phrase_id | str | - | 调用create_phrases,update_phrases, query_phrases等接口返回的Dict对象,通过finetuned_output访问对应phrase_id。 |
返回示例
{
"status_code": 200,
"request_id": "00bb0287-2593-94a3-8e21-93c90f5e9dd8",
"code": null,
"message": "",
"output": {
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1"
},
"usage": null,
"job_id": null,
"finetuned_output": "paraformer-realtime-v1-ft-202309261539-2af1",
"finetuned_outputs": null,
"training_type": null,
"create_time": null,
"output_type": null,
"model": null
}
调用示例
# coding=utf-8
import dashscope
from dashscope.audio.asr import AsrPhraseManager
dashscope.api_key='your-dashscope-api-key'
result = AsrPhraseManager.delete_phrases(phrase_id='phrase-id')
if result.output is not None:
print('delete phrases: ', result.output)
else:
print('Error: ', str(result))
package com.alibaba.test;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseParam;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseStatusResult;
import com.alibaba.dashscope.audio.asr.recognition.Recognition;
import com.alibaba.dashscope.audio.asr.recognition.RecognitionParam;
import com.alibaba.dashscope.utils.Constants;
class Test {
public static void main(String[] args) {
AsrPhraseParam param = AsrPhraseParam.builder()
.model("your-model")
.apiKey("your-dashscope-api-key")
.build();
AsrPhraseStatusResult result = null;
try {
result = AsrPhraseManager.DeletePhrase(param, "phrase-id");
if (result.getOutput() != null) {
System.out.println("delete phrases: " + result.getOutput());
} else {
System.out.println("Error: " + result);
}
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
列表形式返回所有热词
该接口将以同步调用的形式提交一个返回所有热词的请求。
接口
def list_phrases(cls,
page: int = 1,
page_size: int = 10,
**kwargs)
AsrPhraseStatusResult ListPhrases(AsrPhraseParam param)
throws ApiException, NoApiKeyException, InputRequiredException
参数配置
对于Java SDK,将使用一个AsrPhraseParam对象作为参数,其方法和参数如下:
参数 | 类型 | 说明 |
param | AsrPhraseParam | 创建热词的配置参数,见上文关于AsrPhraseParam类型的描述,对于ListPhrases调用,不需要填phraseList字段。 |
phraseId | String | 调用CreatePhrases,UpdatePhrases, QueryPhrase等接口返回的AsrPhraseStatusOutput对象后,通过该对象的getFineTunedOutput返回的热词ID,String类型。调用ListPhrases时,使用AsrPhraseStatusOutput对象的getFinetunedOutputs接口将返回所有热词信息的列表,然后使用AsrPhraseInfo的getFineTunedOutput即可获取对应热词的热词ID。 |
对于Python SDK,其参数如下:
参数 | 类型 | 默认值 | 说明 |
page | int | 1 | 当请求是list_phrases有效,用于查询第几页列表,默认1 |
page_size | int | 10 | 当请求是list_phrases有效,用于设置分页大小,默认10 |
返回示例
{
"status_code": 200,
"request_id": "95f969ef-bcbc-9bb5-b05d-4caed9326409",
"code": null,
"message": "",
"output": {
"page_no": 1,
"page_size": 5,
"total": 61,
"finetuned_outputs": [{
"create_time": "2023-09-26 15:32:20",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-93bf",
"job_id": "ft-202309261532-93bf",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:18",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-7b51",
"job_id": "ft-202309261532-7b51",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:17",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-8bbc",
"job_id": "ft-202309261532-cef0",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:16",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-fc6a",
"job_id": "ft-202309261532-fc6a",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:31:56",
"finetuned_output": "paraformer-realtime-v1-ft-202309261531-e92d",
"job_id": "ft-202309261531-e92d",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}]
},
"usage": null,
"job_id": null,
"finetuned_output": null,
"finetuned_outputs": [{
"create_time": "2023-09-26 15:32:20",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-93bf",
"job_id": "ft-202309261532-93bf",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:18",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-7b51",
"job_id": "ft-202309261532-7b51",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:17",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-8bbc",
"job_id": "ft-202309261532-cef0",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:32:16",
"finetuned_output": "paraformer-realtime-v1-ft-202309261532-fc6a",
"job_id": "ft-202309261532-fc6a",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}, {
"create_time": "2023-09-26 15:31:56",
"finetuned_output": "paraformer-realtime-v1-ft-202309261531-e92d",
"job_id": "ft-202309261531-e92d",
"model": "paraformer-realtime-v1",
"output_type": "custom_resource"
}],
"training_type": null,
"create_time": null,
"output_type": null,
"model": null
}
调用示例
# coding=utf-8
import dashscope
from dashscope.audio.asr import AsrPhraseManager
dashscope.api_key='your-dashscope-api-key'
result = AsrPhraseManager.list_phrases()
if result.output is not None:
print('list phrases: ', result.output)
else:
print('Error: ', str(result))
package com.alibaba.test;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseManager;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseParam;
import com.alibaba.dashscope.audio.asr.phrase.AsrPhraseStatusResult;
import com.alibaba.dashscope.audio.asr.recognition.Recognition;
import com.alibaba.dashscope.audio.asr.recognition.RecognitionParam;
import com.alibaba.dashscope.utils.Constants;
class Test {
public static void main(String[] args) {
AsrPhraseParam param = AsrPhraseParam.builder()
.model("your-model")
.apiKey("your-dashscope-api-key")
.build();
AsrPhraseStatusResult result = null;
try {
result = AsrPhraseManager.ListPhrases(param);
if (result.getOutput() != null) {
System.out.println("list phrases: " + result.getOutput());
} else {
System.out.println("Error: " + result);
}
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}