Custom linguistic models improve ASR accuracy for domain-specific terminology. Upload training corpora to teach the ASR engine vocabulary unique to your industry, such as medical terms, financial jargon, or product names.
The API lifecycle includes creating training datasets, building custom linguistic models, and deploying them in your speech recognition projects.
Prerequisites
Before you begin, ensure that you have:
- An activated Intelligent Speech Interaction service (see Activate Intelligent Speech Interaction).
- An AccessKey ID and AccessKey secret for authentication.
- A project configured in the Intelligent Speech Interaction console with the ASR model set to the base model you plan to use for training. On the Project Settings page, set the ASR model to the same base model that you specify in the BaseId parameter when you create the custom linguistic model.
API endpoint
| Property | Value |
|---|---|
| Region ID | ap-southeast-1 |
| Domain name | nls-slp.ap-southeast-1.aliyuncs.com |
| Protocol | HTTPS |
| API version | 2018-11-20 |
Limits
| Resource | Limit |
|---|---|
| Custom linguistic models per account | 10 (free of charge) |
| Training datasets per account | 100 |
| Maximum size per training dataset | 10 MB |
| Training datasets per model | 10 |
Training time estimates:
- Total dataset size <= 10 MB: approximately 5 minutes
- Total dataset size > 10 MB: approximately 30 minutes
For corpus format requirements and optimization tips, refer to the "Notes on training corpora" section in Overview.
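The limits above can be checked on the client side before datasets are attached and training starts. The following is a minimal sketch, not part of the service API: the class `TrainingLimits` and its methods are hypothetical, and the code assumes that 1 MB means 1,048,576 bytes, which the limits table does not specify.

```java
import java.util.List;

/** Hypothetical helper that mirrors the documented limits; not part of any SDK. */
final class TrainingLimits {
    // Assumption: 1 MB = 1,048,576 bytes. The limits table does not define the unit precisely.
    static final long MAX_DATASET_BYTES = 10L * 1024 * 1024;
    static final int MAX_DATASETS_PER_MODEL = 10;

    private TrainingLimits() {}

    /** Throws IllegalArgumentException if the datasets exceed a documented limit. */
    static void validate(List<Long> datasetSizesInBytes) {
        if (datasetSizesInBytes.size() > MAX_DATASETS_PER_MODEL) {
            throw new IllegalArgumentException("A model accepts at most "
                    + MAX_DATASETS_PER_MODEL + " training datasets");
        }
        for (long size : datasetSizesInBytes) {
            if (size > MAX_DATASET_BYTES) {
                throw new IllegalArgumentException("Dataset of " + size + " bytes exceeds 10 MB");
            }
        }
    }

    /** Returns the approximate training time in minutes, based on the documented estimates. */
    static int estimateTrainingMinutes(List<Long> datasetSizesInBytes) {
        long total = 0;
        for (long size : datasetSizesInBytes) {
            total += size;
        }
        return total <= MAX_DATASET_BYTES ? 5 : 30;
    }
}
```

The Size field returned by GetAsrLmData is reported in bytes, so its values can be passed to these methods directly.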
Workflow
To create and deploy a custom linguistic model, follow these steps:
- Prepare training data. Upload your training corpus to an accessible URL (HTTP or HTTPS URL of an OSS file), then call CreateAsrLmData to create a training dataset. Poll the dataset status until it reaches Ready.
- Create a model. Call CreateAsrLmModel with a base model ID. Then call AddDataToAsrLmModel to attach one or more training datasets.
- Train and deploy. Call TrainAsrLmModel to start training. After training completes, the system automatically deploys the model: the status changes to Deployed without a separate deploy call.
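Both the dataset import and the training step are asynchronous, so each workflow step ends with a polling loop. The following is a minimal sketch of such a loop with a timeout. The class `StatusPoller` and the method `pollUntil` are hypothetical names introduced for illustration; the status supplier stands in for a call to GetAsrLmData or GetAsrLmModel.

```java
import java.time.Duration;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical helper: polls a status source until a terminal status or a timeout. */
final class StatusPoller {
    private StatusPoller() {}

    /**
     * @param fetchStatus supplies the current status, e.g. by calling GetAsrLmData or GetAsrLmModel
     * @param terminal    statuses that end the wait (for a dataset: Ready, FetchingFailed)
     * @param interval    pause between two polls
     * @param timeout     maximum total wait
     * @return the terminal status that was reached
     */
    static String pollUntil(Supplier<String> fetchStatus, Set<String> terminal,
                            Duration interval, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            String status = fetchStatus.get();
            if (status != null && terminal.contains(status)) {
                return status;
            }
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException("Timed out; last status: " + status);
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while polling", e);
            }
        }
    }
}
```

For a model, the terminal set would typically be Deployed and TrainingFailed, with an interval of a few seconds and a timeout chosen from the training time estimates in the Limits section.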
Training dataset statuses
| Status | Description |
|---|---|
| Fetching | Corpus data is being imported from the specified URL |
| FetchingFailed | Import failed. Verify that the URL is a valid HTTP or HTTPS URL of an OSS file |
| Ready | Corpus data imported successfully |
Model statuses
| Status | Description |
|---|---|
| Empty | Model created, not yet trained |
| Training | Training in progress |
| TrainingFailed | Training failed |
| Ready | Trained but not deployed |
| Deploying | Deployment or undeployment in progress |
| Deployed | Deployed and active |
API reference
Training dataset operations
CreateAsrLmData
Creates a training dataset by importing corpus data from the specified URL.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| Name | String | Yes | Name of the training dataset |
| Url | String | Yes | URL of the training corpus file. Only HTTP URLs and HTTPS URLs of OSS files are supported |
| Description | String | No | Description of the training dataset |
Response example:
{
"RequestId": "C71B7CAA-18D6-4012-AC3D-425BA1CB****",
"DataId": "9934e10f19044282825508cbc7c8****"
}
Response parameters:
| Parameter | Type | Description |
|---|---|---|
| RequestId | String | Request ID |
| DataId | String | ID of the created training dataset. Use this ID in subsequent operations such as AddDataToAsrLmModel |
After calling this operation, poll the dataset status with GetAsrLmData until the status changes to Ready. A Ready status indicates successful corpus import.
GetAsrLmData
Queries the details of a training dataset.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| DataId | String | Yes | ID of the training dataset to query |
Response example:
{
"Data": {
"Name": "TestTrainingDataset",
"Status": "Ready",
"Md5": "38fc072ac60796a84ce1a0b13f78****",
"Description": "The training dataset is created by using the API.",
"Url": "https://aliyun-nls.oss-ap-southeast-1.aliyuncs.com/asr/fileASR/SLP/SLPTest.txt",
"CreateTime": "2019-02-11 14:40:35",
"UpdateTime": "2019-02-11 14:40:35",
"Id": "9934e10f19044282825508cbc7c8****",
"Size": 5991
},
"RequestId": "C88130E6-F3B5-4F3E-9BF5-9C617DDD****"
}
Response parameters:
| Parameter | Type | Description |
|---|---|---|
| RequestId | String | Request ID |
| Data | Object | Training dataset details (see below) |
Data object:
| Parameter | Type | Description |
|---|---|---|
| Id | String | Dataset ID (same as DataId returned by CreateAsrLmData) |
| Name | String | Dataset name |
| Description | String | Dataset description |
| Size | Integer | Dataset size in bytes |
| Md5 | String | MD5 hash of the dataset |
| Url | String | URL of the corpus file specified during creation |
| Status | String | Dataset status. Valid values: Fetching, FetchingFailed, Ready |
| CreateTime | String | Creation time |
| UpdateTime | String | Last update time |
| ErrorMessage | String | Error message (returned only on failure) |
DeleteAsrLmData
Deletes a training dataset.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| DataId | String | Yes | ID of the training dataset to delete |
Response example:
{
"RequestId": "7130914d32a3441db06747523675d9ff"
}
Only datasets in the Ready state can be deleted.
ListAsrLmData
Lists training datasets with pagination.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| PageNumber | Int | No | Page number, starting from 1. Default: 1 |
| PageSize | Int | No | Entries per page. Valid values: 10 to 100. Default: 10 |
| ModelId | String | No | Filter by model ID to list only datasets attached to the specified model. If omitted, all accessible datasets are returned |
Response example:
{
"RequestId": "7130914d32a3441db06747523675d9ff",
"Page": {
"Content": [{
"Id": "1b64bee9994749f2a67eadac6379****",
"Name": "SampleTrainingDataset",
"Description": "This is a sample training dataset.",
"Size": 7777404,
"Md5": "39326cf690e384735355a385ec1e****",
"Url": "slp/tmp/demo-data-lm.txt",
"Status": "Ready",
"CreateTime": "2018-10-31 17:20:39",
"UpdateTime": "2018-10-31 17:20:39"
}],
"TotalPages": 1,
"TotalItems": 1,
"PageNumber": 1,
"PageSize": 10
}
}
Response parameters:
| Parameter | Type | Description |
|---|---|---|
| RequestId | String | Request ID |
| Page | Object | Paginated result (see below) |
Page object:
| Parameter | Type | Description |
|---|---|---|
| Content | List | List of training datasets. Each entry follows the Data object structure from GetAsrLmData |
| TotalPages | Integer | Total number of pages |
| TotalItems | Integer | Total number of datasets |
| PageNumber | Integer | Current page number |
| PageSize | Integer | Page size |
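Because the list operations return one page at a time, collecting every entry means requesting pages until PageNumber reaches TotalPages. The following is a minimal sketch of that loop. The names `PageWalker`, `Page`, and `fetchAll` are hypothetical; the `Page` class keeps only the two fields the loop needs, and the page-fetching function stands in for a ListAsrLmData or ListAsrLmModel call. Treating a null page as a failed call is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical helper that walks through all pages of a paginated list operation. */
final class PageWalker {
    private PageWalker() {}

    /** Minimal stand-in for the Page object: only the fields the loop needs. */
    static final class Page<T> {
        final List<T> content;
        final int totalPages;

        Page(List<T> content, int totalPages) {
            this.content = content;
            this.totalPages = totalPages;
        }
    }

    /**
     * Requests pages starting from 1 and collects their entries.
     *
     * @param fetchPage returns the page for a given page number, or null if the call failed
     */
    static <T> List<T> fetchAll(IntFunction<Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        int pageNumber = 1; // Page numbers start from 1.
        while (true) {
            Page<T> page = fetchPage.apply(pageNumber);
            if (page == null) {
                break; // Assumption: null signals a failed call; return what was collected so far.
            }
            all.addAll(page.content);
            if (pageNumber >= page.totalPages) {
                break; // Last page reached (also covers an empty result with TotalPages = 0).
            }
            pageNumber++;
        }
        return all;
    }
}
```

A loop like this is useful wherever membership must be checked across all entries, for example when verifying that a dataset is attached to a model that has more datasets than fit on one page.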
Model operations
CreateAsrLmModel
Creates a custom linguistic model based on a specified base model.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| Name | String | Yes | Model name |
| BaseId | String | Yes | Base model ID. Cannot be changed after creation. Set the project ASR model to the same base model in the Intelligent Speech Interaction console |
| Description | String | No | Model description |
Available base models:
| Model name | BaseId |
|---|---|
| Universal Chinese Language Recognition Model (Mandarin, 16,000 Hz) | universal |
| Chinese Telephone Customer Service Recognition Model (Mandarin, 8,000 Hz) | customer_service_8k |
| Cantonese Telephone Customer Service Recognition Model (Cantonese, 8,000 Hz) | cantonese_customer_service_8k |
| English speech recognition model (English, 16,000 Hz) | english |
For a full list of base models, see the Select ASR Model page in the Intelligent Speech Interaction console.
Response example:
{
"ModelId": "dbb6b71ff3e54b45a600ee5157a2****",
"RequestId": "945C59DF-B3D9-4F22-808E-76752FF3****"
}
Response parameters:
| Parameter | Type | Description |
|---|---|---|
| RequestId | String | Request ID |
| ModelId | String | ID of the created model. Use this ID in subsequent operations such as TrainAsrLmModel |
A newly created model starts in the Empty state.
GetAsrLmModel
Queries the details of a custom linguistic model.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| ModelId | String | Yes | ID of the model to query |
Response example:
{
"Model": {
"Name": "TestLinguisticModel",
"Status": "Empty",
"Description": "This is a sample description.",
"CreateTime": "2019-02-12 10:11:57",
"UpdateTime": "2019-02-12 10:11:57",
"Id": "dbb6b71ff3e54b45a600ee5157a2****",
"BaseId": "common",
"Size": 0
},
"RequestId": "6CE24FF7-B7C8-4B9F-B0EB-FE4AF20B****"
}
Response parameters:
| Parameter | Type | Description |
|---|---|---|
| RequestId | String | Request ID |
| Model | Object | Model details (see below) |
Model object:
| Parameter | Type | Description |
|---|---|---|
| Id | String | Model ID (same as ModelId returned by CreateAsrLmModel) |
| Name | String | Model name |
| Description | String | Model description |
| BaseId | String | Base model ID |
| Size | Integer | Model size |
| Status | String | Model status. Valid values: Empty, Training, TrainingFailed, Ready, Deploying, Deployed |
| CreateTime | String | Creation time |
| UpdateTime | String | Last update time |
| ErrorMessage | String | Error message (returned only on failure) |
DeleteAsrLmModel
Deletes a custom linguistic model.
Before deleting a model, make sure it is not in use by any application. A deleted model immediately becomes inactive.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| ModelId | String | Yes | ID of the model to delete |
Response example:
{
"RequestId": "7130914d32a3441db06747523675d9ff"
}
Models in the Training or Deploying state cannot be deleted.
ListAsrLmModel
Lists custom linguistic models with pagination.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| PageNumber | Int | No | Page number, starting from 1. Default: 1 |
| PageSize | Int | No | Entries per page. Valid values: 10 to 100. Default: 10 |
| DataId | String | No | Filter by dataset ID to list only models that use the specified dataset. If omitted, all accessible models are returned |
Response example:
{
"RequestId": "7130914d32a3441db06747523675****",
"Page": {
"Content": [{
"Id": "demo-model",
"Name": "SampleLinguisticModel",
"Description": "This is a test model.",
"Size": 0,
"Status": "Empty",
"CreateTime": "2018-11-01 17:05:21",
"UpdateTime": "2018-11-01 17:05:21",
"BaseId": "common"
}],
"TotalPages": 1,
"TotalItems": 1,
"PageNumber": 1,
"PageSize": 10
}
}
Response parameters:
| Parameter | Type | Description |
|---|---|---|
| RequestId | String | Request ID |
| Page | Object | Paginated result (see below) |
Page object:
| Parameter | Type | Description |
|---|---|---|
| Content | List | List of models. Each entry follows the Model object structure from GetAsrLmModel |
| TotalPages | Integer | Total number of pages |
| TotalItems | Integer | Total number of models |
| PageNumber | Integer | Current page number |
| PageSize | Integer | Page size |
Training and deployment operations
AddDataToAsrLmModel
Attaches a training dataset to a custom linguistic model.
The same dataset cannot be added to the same model twice.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| ModelId | String | Yes | ID of the target model |
| DataId | String | Yes | ID of the dataset to attach |
Response example:
{
"RequestId": "9B232563-12C0-4242-AA27-C250E1BB****"
}
The dataset must be in the Ready state. The model cannot be in the Training or Deploying state.
RemoveDataFromAsrLmModel
Detaches a training dataset from a custom linguistic model.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| ModelId | String | Yes | ID of the model |
| DataId | String | Yes | ID of the dataset to detach |
Response example:
{
"RequestId": "7130914d32a3441db06747523675****"
}
Datasets cannot be removed from a model in the Training or Deploying state.
TrainAsrLmModel
Starts training a custom linguistic model. After training completes, the system automatically deploys the model.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| ModelId | String | Yes | ID of the model to train |
Response example:
{
"RequestId": "3D922A91-68AA-4260-AFE4-C429832F****"
}
Models in the Deploying state cannot be trained.
After training completes successfully, the model status transitions to Deployed automatically. A separate DeployAsrLmModel call is not required.
DeployAsrLmModel
Deploys (publishes) a custom linguistic model. Use this operation to redeploy a model that was previously undeployed.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| ModelId | String | Yes | ID of the model to deploy |
Response example:
{
"RequestId": "D9DDA978-5D68-45A4-B840-E4BC45C7****"
}
Only models in the Ready state can be deployed.
UndeployAsrLmModel
Undeploys (unpublishes) a custom linguistic model.
Before undeploying a model, make sure it is not in use by any application. An undeployed model immediately becomes inactive.
Request parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| ModelId | String | Yes | ID of the model to undeploy |
Response example:
{
"RequestId": "8417BA8E-2428-41D2-A849-396A0897****"
}
Only models in the Deployed state can be undeployed.
Allowed operations by state
A dash (-) means the operation is allowed. NO means the operation is blocked in that state.
| Operation | Fetching | FetchingFailed | Ready (Dataset) | Empty | Training | TrainingFailed | Ready (Model) | Deploying | Deployed |
|---|---|---|---|---|---|---|---|---|---|
| CreateAsrLmData | - | - | - | - | - | - | - | - | - |
| ListAsrLmData | - | - | - | - | - | - | - | - | - |
| GetAsrLmData | - | - | - | - | - | - | - | - | - |
| DeleteAsrLmData | NO | NO | - | - | NO | - | - | - | - |
| CreateAsrLmModel | - | - | - | - | - | - | - | - | - |
| ListAsrLmModel | - | - | - | - | - | - | - | - | - |
| GetAsrLmModel | - | - | - | - | - | - | - | - | - |
| DeleteAsrLmModel | - | - | - | - | NO | - | - | NO | - |
| AddDataToAsrLmModel | NO | NO | - | - | NO | - | - | NO | - |
| RemoveDataFromAsrLmModel | - | - | - | - | NO | - | - | - | - |
| TrainAsrLmModel | - | - | - | - | - | - | - | NO | - |
| DeployAsrLmModel | - | - | - | NO | NO | NO | - | NO | NO |
| UndeployAsrLmModel | - | - | - | NO | NO | NO | NO | NO | - |
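The model-state columns of the table above can be encoded as a lookup so that a client rejects a call locally instead of waiting for the service to return an error. The following is a minimal sketch: the class `ModelStateRules` and the method `isAllowed` are hypothetical names, and only the model states (Empty through Deployed) are covered, not the dataset states.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical lookup of the model states in which each operation is blocked. */
final class ModelStateRules {
    private ModelStateRules() {}

    // Each entry lists the model states marked NO for that operation in the table.
    // Operations that are allowed in every model state are omitted.
    private static final Map<String, Set<String>> BLOCKED = Map.of(
            "DeleteAsrLmData", Set.of("Training"),
            "DeleteAsrLmModel", Set.of("Training", "Deploying"),
            "AddDataToAsrLmModel", Set.of("Training", "Deploying"),
            "RemoveDataFromAsrLmModel", Set.of("Training"),
            "TrainAsrLmModel", Set.of("Deploying"),
            "DeployAsrLmModel",
            Set.of("Empty", "Training", "TrainingFailed", "Deploying", "Deployed"),
            "UndeployAsrLmModel",
            Set.of("Empty", "Training", "TrainingFailed", "Ready", "Deploying"));

    /** Returns true if the table allows the operation while the model is in the given state. */
    static boolean isAllowed(String operation, String modelStatus) {
        Objects.requireNonNull(operation, "operation");
        Objects.requireNonNull(modelStatus, "modelStatus");
        return !BLOCKED.getOrDefault(operation, Set.of()).contains(modelStatus);
    }
}
```

A local check like this is only a convenience. The model state can change between the check and the call, so the service response remains authoritative.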
Error codes
When an API call fails, the response body contains an error code and message.
Error response example:
{
"RequestId": "E70F51F6-23E3-4681-B954-ABF32B89****",
"HostId": "nls-slp.ap-southeast-1.aliyuncs.com",
"Code": "SLP.NOT_FOUND",
"Message": "Model not found!"
}
Error response parameters:
| Parameter | Type | Description |
|---|---|---|
| RequestId | String | Request ID |
| HostId | String | Endpoint of the self-learning platform server |
| Code | String | Error code |
| Message | String | Error message |
Error codes:
| Error code | Description | Troubleshooting |
|---|---|---|
| SLP.ASR_MODEL_ERROR | An error related to the custom linguistic model occurred | Check the model status and retry the operation |
| SLP.NOT_FOUND | The specified resource ID is invalid | Verify the DataId or ModelId value. Call ListAsrLmData or ListAsrLmModel to get valid IDs |
| SLP.PARAMETER_ERROR | Invalid parameter values | Check the request parameters against the API documentation |
| SLP.EXCEED_LIMIT | Resource count reached the upper limit | Check your current resource count with ListAsrLmData or ListAsrLmModel. Delete unused resources to free capacity |
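The error codes in the table can be mapped to their troubleshooting hints in client code, so that a failed call logs an actionable message. The following is a minimal sketch: the enum `SlpErrorCode` and the method `fromCode` are hypothetical names, and the value passed in is assumed to be the Code field of the error response.

```java
import java.util.Arrays;
import java.util.Optional;

/** Hypothetical enum that pairs each documented error code with its troubleshooting hint. */
enum SlpErrorCode {
    ASR_MODEL_ERROR("SLP.ASR_MODEL_ERROR", "Check the model status and retry the operation"),
    NOT_FOUND("SLP.NOT_FOUND", "Verify the DataId or ModelId value"),
    PARAMETER_ERROR("SLP.PARAMETER_ERROR", "Check the request parameters against the API documentation"),
    EXCEED_LIMIT("SLP.EXCEED_LIMIT", "Delete unused resources to free capacity");

    final String code;
    final String hint;

    SlpErrorCode(String code, String hint) {
        this.code = code;
        this.hint = hint;
    }

    /** Looks up an error code by the Code field of an error response. */
    static Optional<SlpErrorCode> fromCode(String code) {
        return Arrays.stream(values())
                .filter(e -> e.code.equals(code))
                .findFirst();
    }
}
```

An unknown code yields an empty Optional, which lets the caller fall back to printing the raw Code and Message fields.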
Sample code
The following Java sample demonstrates the complete lifecycle: creating a training dataset, building a model, training, deploying, and cleaning up.
All examples use the CommonRequest method from Alibaba Cloud SDK for Java. Authentication requires your AccessKey ID and AccessKey secret.
Dependencies
Add the following Maven dependencies. The Alibaba Cloud SDK for Java core library must be version 3.5.0 through 3.7.x.
<dependency>
<groupId>com.aliyun</groupId>
<artifactId>aliyun-java-sdk-core</artifactId>
<version>3.7.1</version>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.83</version>
</dependency>
Class overview
The sample includes three classes:
| Class | Purpose |
|---|---|
| AsrLmModelPopApiDemo | Main class. Orchestrates the full lifecycle and polls resource statuses during API calls |
| AsrLmData | Encapsulates training dataset data objects and API operations |
| AsrLmModel | Encapsulates custom linguistic model data objects and API operations |
AsrLmModelPopApiDemo
import com.alibaba.fastjson.JSONObject;
import com.aliyuncs.DefaultAcsClient;
import com.aliyuncs.IAcsClient;
import com.aliyuncs.profile.DefaultProfile;
public class AsrLmModelPopApiDemo {
private static String REGION = "ap-southeast-1";
private static final String STATUS_FETCHING = "Fetching";
private static final String STATUS_FETCHINGFAILED = "FetchingFailed";
private static final String STATUS_READY = "Ready";
private static final String STATUS_EMPTY = "Empty";
private static final String STATUS_TRAINING = "Training";
private static final String STATUS_TRAININGFAILED = "TrainingFailed";
private static final String STATUS_DEPLOYING = "Deploying";
private static final String STATUS_DEPLOYED = "Deployed";
private static IAcsClient client;
private AsrLmData asrLmData;
private AsrLmModel asrLmModel;
public AsrLmModelPopApiDemo(String akId, String akSecret) {
DefaultProfile profile = DefaultProfile.getProfile(REGION, akId, akSecret);
client = new DefaultAcsClient(profile);
asrLmData = new AsrLmData(client);
asrLmModel = new AsrLmModel(client);
}
/******************************* Manage training datasets *******************************/
// Create a training dataset.
public String createAsrLmData(String name, String fileUrl, String description) {
String dataId = asrLmData.createAsrLmData(name, fileUrl, description);
if (null == dataId) {
return dataId;
}
// Poll until the dataset reaches Ready status.
while (true) {
AsrLmData.LmData data = asrLmData.getAsrLmData(dataId);
if (null == data) {
dataId = null;
break;
}
if (data.Status.equals(STATUS_FETCHING)) {
System.out.println("Importing corpus data. Dataset ID: " + dataId);
try {
Thread.sleep(100);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
else if (data.Status.equals(STATUS_FETCHINGFAILED)) {
System.out.println("Corpus import failed. Dataset ID: " + dataId);
asrLmData.deleteAsrLmData(dataId);
dataId = null;
break;
}
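else if (data.Status.equals(STATUS_READY)) {
System.out.println("Corpus imported. Dataset ID: " + dataId);
break;
}
else {
// Stop on an unrecognized status instead of looping forever without a pause.
System.out.println("Unexpected dataset state: " + data.Status);
dataId = null;
break;
}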
}
return dataId;
}
// Query a training dataset.
public AsrLmData.LmData getAsrLmData(String dataId) {
return asrLmData.getAsrLmData(dataId);
}
// Delete a training dataset.
public boolean deleteAsrLmData(String dataId) {
AsrLmData.LmData data = asrLmData.getAsrLmData(dataId);
if (null == data) {
return false;
}
if (!data.Status.equals(STATUS_READY)) {
System.out.println("Cannot delete dataset in current state: " + data.Status);
return false;
}
return asrLmData.deleteAsrLmData(dataId);
}
// List training datasets.
public AsrLmData.LmDataPage listAsrLmData() {
return asrLmData.listAsrLmData();
}
/******************************* Manage custom linguistic models *******************************/
// Create a custom linguistic model.
public String createAsrLmModel(String name, String baseId, String description) {
String modelId = asrLmModel.createAsrLmModel(name, baseId, description);
if (null == modelId) {
return modelId;
}
// Verify that the model enters the Empty state.
while (true) {
AsrLmModel.LmModel model = asrLmModel.getAsrLmModel(modelId);
if (null == model) {
modelId = null;
break;
}
if (model.Status.equals(STATUS_EMPTY)) {
break;
}
else {
System.out.println("Model creation failed. Model ID: " + modelId);
asrLmModel.deleteAsrLmModel(modelId);
modelId = null;
break;
}
}
return modelId;
}
// Query a custom linguistic model.
public AsrLmModel.LmModel getAsrLmModel(String modelId) {
return asrLmModel.getAsrLmModel(modelId);
}
// Delete a custom linguistic model.
public boolean deleteAsrLmModel(String modelId) {
AsrLmModel.LmModel model = asrLmModel.getAsrLmModel(modelId);
if (null == model) {
return false;
}
if (model.Status.equals(STATUS_TRAINING) || model.Status.equals(STATUS_DEPLOYING)) {
System.out.println("Cannot delete model in current state: " + model.Status);
return false;
}
return asrLmModel.deleteAsrLmModel(modelId);
}
// List custom linguistic models.
public AsrLmModel.LmModelPage listAsrLmModel() {
return asrLmModel.listAsrLmModel();
}
/**************************** Train and deploy ***************************/
// Add a training dataset to a custom linguistic model.
public boolean addDataToAsrLmModel(String dataId, String modelId) {
AsrLmData.LmData data = asrLmData.getAsrLmData(dataId);
if (null == data) {
return false;
}
if (!data.Status.equals(STATUS_READY)) {
System.out.println("Dataset not ready: " + data.Status);
return false;
}
AsrLmModel.LmModel model = asrLmModel.getAsrLmModel(modelId);
if (null == model) {
return false;
}
if (model.Status.equals(STATUS_TRAINING) || model.Status.equals(STATUS_DEPLOYING)) {
System.out.println("Cannot add dataset while model is in state: " + model.Status);
return false;
}
return asrLmModel.addDataToAsrLmModel(dataId, modelId);
}
// Remove a training dataset from a custom linguistic model.
public boolean removeDataFromAsrLmModel(String dataId, String modelId) {
// Verify that the dataset is attached to this model.
boolean isAdded = false;
AsrLmData.LmDataPage page = asrLmData.listAsrLmData(1, 10, modelId);
if (page != null && page.Content.size() > 0) {
for (int i = 0; i < page.Content.size(); i++) {
if (dataId.equals(page.Content.get(i).Id)) {
isAdded = true;
break;
}
}
}
if (!isAdded) {
System.out.println("Dataset is not attached to this model.");
return false;
}
// Check model state.
AsrLmModel.LmModel model = asrLmModel.getAsrLmModel(modelId);
if (null == model) {
return false;
}
if (model.Status.equals(STATUS_TRAINING)) {
System.out.println("Cannot remove dataset while model is training.");
return false;
}
return asrLmModel.removeDataFromAsrLmModel(dataId, modelId);
}
// Train a custom linguistic model.
public boolean trainAsrLmModel(String modelId) {
AsrLmModel.LmModel model = asrLmModel.getAsrLmModel(modelId);
if (null == model) {
return false;
}
if (model.Status.equals(STATUS_DEPLOYING)) {
System.out.println("Cannot train model in current state: " + model.Status);
return false;
}
boolean isTrain = asrLmModel.trainAsrLmModel(modelId);
if (!isTrain) {
return isTrain;
}
// Poll until the model reaches the Deployed state.
while (true) {
model = asrLmModel.getAsrLmModel(modelId);
if (null == model) {
isTrain = false;
break;
}
if (model.Status.equals(STATUS_TRAINING) || model.Status.equals(STATUS_DEPLOYING)) {
if (model.Status.equals(STATUS_TRAINING)) {
System.out.println("Training in progress. Model ID: " + modelId);
}
else {
System.out.println("Deploying. Model ID: " + modelId);
}
try {
Thread.sleep(5000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
else if (model.Status.equals(STATUS_TRAININGFAILED)) {
System.out.println("Training failed. Model ID: " + modelId);
isTrain = false;
break;
}
else if (model.Status.equals(STATUS_DEPLOYED)) {
System.out.println("Training complete, model deployed. Model ID: " + modelId);
isTrain = true;
break;
}
else {
System.out.println("Unexpected model state: " + model.Status);
isTrain = false;
break;
}
}
return isTrain;
}
// Deploy a custom linguistic model.
public boolean deployAsrLmModel(String modelId) {
AsrLmModel.LmModel model = asrLmModel.getAsrLmModel(modelId);
if (null == model) {
return false;
}
if (!model.Status.equals(STATUS_READY)) {
System.out.println("Cannot deploy model in current state: " + model.Status);
return false;
}
boolean isDeployed = asrLmModel.deployAsrLmModel(modelId);
if (!isDeployed) {
return isDeployed;
}
// Poll until the model reaches the Deployed state.
while (true) {
model = asrLmModel.getAsrLmModel(modelId);
if (null == model) {
isDeployed = false;
break;
}
if (model.Status.equals(STATUS_DEPLOYING)) {
System.out.println("Deploying. Model ID: " + modelId);
try {
Thread.sleep(100);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
else if (model.Status.equals(STATUS_DEPLOYED)) {
System.out.println("Model deployed. Model ID: " + modelId);
isDeployed = true;
break;
}
else {
System.out.println("Cannot deploy model in current state: " + model.Status);
isDeployed = false;
break;
}
}
return isDeployed;
}
// Undeploy a custom linguistic model.
public boolean undeployAsrLmModel(String modelId) {
AsrLmModel.LmModel model = asrLmModel.getAsrLmModel(modelId);
if (null == model) {
return false;
}
if (!model.Status.equals(STATUS_DEPLOYED)) {
System.out.println("Cannot undeploy model in current state: " + model.Status);
return false;
}
boolean isUnDeployed = asrLmModel.undeployAsrLmModel(modelId);
if (!isUnDeployed) {
return isUnDeployed;
}
// Poll until the model reaches the Ready state.
while (true) {
model = asrLmModel.getAsrLmModel(modelId);
if (null == model) {
isUnDeployed = false;
break;
}
if (model.Status.equals(STATUS_DEPLOYING)) {
System.out.println("Undeploying. Model ID: " + modelId);
try {
Thread.sleep(100);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
else if (model.Status.equals(STATUS_READY)) {
System.out.println("Model undeployed. Model ID: " + modelId);
isUnDeployed = true;
break;
}
else {
System.out.println("Cannot undeploy model in current state: " + model.Status);
isUnDeployed = false;
break;
}
}
return isUnDeployed;
}
public static void main(String[] args) {
if (args.length < 2) {
System.err.println("Usage: AsrLmModelPopApiDemo <AccessKey ID> <AccessKey secret>");
return;
}
String accessKeyId = args[0];
String accessKeySecret = args[1];
AsrLmModelPopApiDemo demo = new AsrLmModelPopApiDemo(accessKeyId, accessKeySecret);
/******************************* Manage training datasets *******************************/
String dataId;
// Create a training dataset.
String name = "TestTrainingDataset";
String fileUrl = "https://aliyun-nls.oss-cn-hangzhou.aliyuncs.com/asr/fileASR/SLP/SLPTest.txt";
String description = "The training dataset is created by using the API.";
dataId = demo.createAsrLmData(name, fileUrl, description);
if (dataId != null) {
System.out.println("Dataset created. ID: " + dataId);
}
else {
System.out.println("Failed to create dataset.");
// Nothing has been created yet, so stop instead of continuing with a null dataset ID.
return;
}
// Query a training dataset.
AsrLmData.LmData data = demo.getAsrLmData(dataId);
if (data != null) {
System.out.println("Dataset info: " + JSONObject.toJSONString(data));
}
else {
System.out.println("Failed to query dataset.");
}
// List training datasets.
AsrLmData.LmDataPage page = demo.listAsrLmData();
if (page != null) {
System.out.println("Datasets: " + JSONObject.toJSONString(page));
}
else {
System.out.println("Failed to list datasets.");
return;
}
/************************** Manage custom linguistic models *********************************/
String modelId;
// Create a custom linguistic model using the universal Chinese base model.
String modelName = "TestLinguisticModel";
String baseId = "universal";
String modelDescription = "This is a sample description.";
modelId = demo.createAsrLmModel(modelName, baseId, modelDescription);
if (modelId != null) {
System.out.println("Model created. ID: " + modelId);
}
else {
System.out.println("Failed to create model.");
}
// Query a custom linguistic model.
AsrLmModel.LmModel model = demo.getAsrLmModel(modelId);
if (model != null) {
System.out.println("Model info: " + JSONObject.toJSONString(model));
}
else {
System.out.println("Failed to query model.");
}
// List custom linguistic models.
AsrLmModel.LmModelPage modelPage = demo.listAsrLmModel();
if (modelPage != null) {
System.out.println("Models: " + JSONObject.toJSONString(modelPage));
}
else {
System.out.println("Failed to list models.");
}
/******************************* Train and deploy *******************************/
// Add a training dataset to the model.
boolean isAdded = demo.addDataToAsrLmModel(dataId, modelId);
if (isAdded) {
System.out.println("Dataset added to model.");
}
else {
System.out.println("Failed to add dataset to model.");
}
// Train the model (auto-deploys on success).
boolean isTrained = demo.trainAsrLmModel(modelId);
if (isTrained) {
System.out.println("Model trained and deployed.");
}
else {
System.out.println("Failed to train model.");
}
// Undeploy the model.
boolean isUnDeployed = demo.undeployAsrLmModel(modelId);
if (isUnDeployed) {
System.out.println("Model undeployed.");
}
else {
System.out.println("Failed to undeploy model.");
}
// Redeploy the model.
boolean isDeployed = demo.deployAsrLmModel(modelId);
if (isDeployed) {
System.out.println("Model deployed.");
}
else {
System.out.println("Failed to deploy model.");
}
/***************************** Clean up *****************************/
// 1. Undeploy the model.
isUnDeployed = demo.undeployAsrLmModel(modelId);
if (isUnDeployed) {
System.out.println("Model undeployed.");
}
else {
System.out.println("Failed to undeploy model.");
}
// 2. Remove the dataset from the model.
boolean isRemoved = demo.removeDataFromAsrLmModel(dataId, modelId);
if (isRemoved) {
System.out.println("Dataset removed from model.");
}
else {
System.out.println("Failed to remove dataset from model.");
}
// 3. Delete the dataset.
boolean isDeletedData = demo.deleteAsrLmData(dataId);
if (isDeletedData) {
System.out.println("Dataset deleted.");
}
else {
System.out.println("Failed to delete dataset.");
}
// 4. Delete the model.
boolean isDeletedModel = demo.deleteAsrLmModel(modelId);
if (isDeletedModel) {
System.out.println("Model deleted.");
}
else {
System.out.println("Failed to delete model.");
}
}
}
AsrLmData
import com.alibaba.fastjson.JSONObject;
import com.aliyuncs.CommonRequest;
import com.aliyuncs.CommonResponse;
import com.aliyuncs.IAcsClient;
import com.aliyuncs.exceptions.ClientException;
import com.aliyuncs.http.MethodType;
import com.aliyuncs.http.ProtocolType;
import java.util.ArrayList;
import java.util.List;
public class AsrLmData {
public static class LmData {
public String Name;
public String Status;
public String Description;
public String Url;
public String CreateTime;
public String UpdateTime;
public String Id;
public String ErrorMessage; // Returned only on failure.
public int Size;
}
public static class LmDataPage {
public int PageNumber;
public int PageSize;
public int TotalItems;
public int TotalPages;
public List<LmData> Content = new ArrayList<LmData>();
}
private static final String VERSION = "2018-11-20";
private static final String DOMAIN = "nls-slp.ap-southeast-1.aliyuncs.com";
private static ProtocolType PROTOCOL_TYPE = ProtocolType.HTTPS;
private static final String KEY_NAME = "Name";
private static final String KEY_URL = "Url";
private static final String KEY_DESCRIPTION = "Description";
private static final String KEY_DATA_ID = "DataId";
private static final String KEY_DATA = "Data";
private static final String KEY_PAGE = "Page";
private static final String KEY_PAGE_NUMBER = "PageNumber";
private static final String KEY_PAGE_SIZE = "PageSize";
private static final String KEY_MODEL_ID = "ModelId";
private IAcsClient client;
private CommonRequest newRequest(String action) {
CommonRequest request = new CommonRequest();
request.setDomain(DOMAIN);
request.setProtocol(PROTOCOL_TYPE);
request.setVersion(VERSION);
request.setMethod(MethodType.POST);
request.setAction(action);
return request;
}
public AsrLmData(IAcsClient client) {
this.client = client;
}
/**
* Create a training dataset.
* @param name: Required. Dataset name.
* @param fileUrl: Required. URL of the training corpus file.
* @param description: Optional. Dataset description.
* @return: Dataset ID (String).
*/
public String createAsrLmData(String name, String fileUrl, String description) {
CommonRequest request = newRequest("CreateAsrLmData");
request.putBodyParameter(KEY_NAME, name);
request.putBodyParameter(KEY_URL, fileUrl);
request.putBodyParameter(KEY_DESCRIPTION, description);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("CreateAsrLmData failed: no response received.");
return null;
}
System.out.println("CreateAsrLmData: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to create dataset. HTTP status: " + response.getHttpStatus());
return null;
}
JSONObject result = JSONObject.parseObject(response.getData());
String dataId = result.getString(KEY_DATA_ID);
return dataId;
}
/**
* Query a training dataset.
* @param dataId: Dataset ID.
* @return LmData: Dataset details.
*/
public LmData getAsrLmData(String dataId) {
CommonRequest request = newRequest("GetAsrLmData");
request.putBodyParameter(KEY_DATA_ID, dataId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("GetAsrLmData failed: no response received.");
return null;
}
System.out.println("GetAsrLmData: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to query dataset. HTTP status: " + response.getHttpStatus());
return null;
}
JSONObject result = JSONObject.parseObject(response.getData());
String dataJson = result.getString(KEY_DATA);
LmData data = JSONObject.parseObject(dataJson, LmData.class);
return data;
}
/**
* Delete a training dataset.
* @param dataId: Required. Dataset ID.
* @return: true if deleted successfully.
*/
public boolean deleteAsrLmData(String dataId) {
CommonRequest request = newRequest("DeleteAsrLmData");
request.putBodyParameter(KEY_DATA_ID, dataId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("DeleteAsrLmData failed: no response received.");
return false;
}
System.out.println("DeleteAsrLmData: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to delete dataset. HTTP status: " + response.getHttpStatus());
return false;
}
return true;
}
/**
* List training datasets with pagination.
* @param pageNumber: Optional. Page number (default: 1).
* @param pageSize: Optional. Page size, 10-100 (default: 10).
* @param modelId: Optional. Filter by model ID.
* @return: Paginated dataset list.
*/
public LmDataPage listAsrLmData(int pageNumber, int pageSize, String modelId) {
CommonRequest request = newRequest("ListAsrLmData");
request.putBodyParameter(KEY_PAGE_NUMBER, pageNumber);
request.putBodyParameter(KEY_PAGE_SIZE, pageSize);
request.putBodyParameter(KEY_MODEL_ID, modelId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("ListAsrLmData failed: no response received.");
return null;
}
System.out.println("ListAsrLmData: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to list datasets. HTTP status: " + response.getHttpStatus());
return null;
}
JSONObject result = JSONObject.parseObject(response.getData());
String pageJson = result.getString(KEY_PAGE);
LmDataPage page = JSONObject.parseObject(pageJson, LmDataPage.class);
return page;
}
public LmDataPage listAsrLmData() {
return listAsrLmData(1, 10, null);
}
}
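The workflow calls for polling the dataset status until it reaches `Ready`. The following is a minimal sketch of such a polling loop, not part of the SDK: `StatusPoller`, `waitForStatus`, and their parameters are hypothetical names, and the status source is passed in as a `Supplier` so that the helper does not depend on the classes above. The `"Pending"` value in `main` is a placeholder, not a documented status.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

public class StatusPoller {
    /**
     * Polls statusSupplier until it returns the target status or maxAttempts is reached.
     * Returns true only if the target status was observed.
     */
    public static boolean waitForStatus(Supplier<String> statusSupplier, String target,
                                        int maxAttempts, long intervalMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String status = statusSupplier.get();
            if (target.equals(status)) {
                return true;
            }
            // Sleep between attempts, but not after the last one.
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop polling.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for repeated status queries; "Pending" is a placeholder value.
        Iterator<String> fakeStatuses = Arrays.asList("Pending", "Ready").iterator();
        boolean ready = waitForStatus(fakeStatuses::next, "Ready", 3, 10L);
        System.out.println("Reached Ready: " + ready);
    }
}
```

With the `AsrLmData` class above, the supplier could wrap `getAsrLmData(dataId)` and return the `Status` field (or `null` when the query fails). The attempt count and interval are illustrative and should be tuned to the dataset size.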
AsrLmModel
import com.alibaba.fastjson.JSONObject;
import com.aliyuncs.CommonRequest;
import com.aliyuncs.CommonResponse;
import com.aliyuncs.IAcsClient;
import com.aliyuncs.exceptions.ClientException;
import com.aliyuncs.http.MethodType;
import com.aliyuncs.http.ProtocolType;
import java.util.ArrayList;
import java.util.List;
public class AsrLmModel {
public static class LmModel {
public String Name;
public String Status;
public String Description;
public String CreateTime;
public String UpdateTime;
public String Id;
public String BaseId;
public String ErrorMessage; // Returned only on failure.
public int Size;
}
public static class LmModelPage {
public int PageNumber;
public int PageSize;
public int TotalItems;
public int TotalPages;
public List<LmModel> Content = new ArrayList<LmModel>();
}
private static final String VERSION = "2018-11-20";
private static final String DOMAIN = "nls-slp.ap-southeast-1.aliyuncs.com";
private static final ProtocolType PROTOCOL_TYPE = ProtocolType.HTTPS;
private static final String KEY_NAME = "Name";
private static final String KEY_BASE_ID = "BaseId";
private static final String KEY_DESCRIPTION = "Description";
private static final String KEY_MODEL_ID = "ModelId";
private static final String KEY_MODEL = "Model";
private static final String KEY_PAGE = "Page";
private static final String KEY_PAGE_NUMBER = "PageNumber";
private static final String KEY_PAGE_SIZE = "PageSize";
private static final String KEY_DATA_ID = "DataId";
private IAcsClient client;
private CommonRequest newRequest(String action) {
CommonRequest request = new CommonRequest();
request.setDomain(DOMAIN);
request.setProtocol(PROTOCOL_TYPE);
request.setVersion(VERSION);
request.setMethod(MethodType.POST);
request.setAction(action);
return request;
}
public AsrLmModel(IAcsClient client) {
this.client = client;
}
/**
* Create a custom linguistic model.
* @param name: Required. Model name.
* @param baseId: Required. Base model ID (cannot be changed after creation).
* @param description: Optional. Model description.
* @return: Model ID (String).
*/
public String createAsrLmModel(String name, String baseId, String description) {
CommonRequest request = newRequest("CreateAsrLmModel");
request.putBodyParameter(KEY_NAME, name);
request.putBodyParameter(KEY_BASE_ID, baseId);
request.putBodyParameter(KEY_DESCRIPTION, description);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("CreateAsrLmModel failed: no response received.");
return null;
}
System.out.println("CreateAsrLmModel: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to create model. HTTP status: " + response.getHttpStatus());
return null;
}
JSONObject result = JSONObject.parseObject(response.getData());
String modelId = result.getString(KEY_MODEL_ID);
return modelId;
}
/**
* Query a custom linguistic model.
* @param modelId: Model ID.
* @return: Model details.
*/
public LmModel getAsrLmModel(String modelId) {
CommonRequest request = newRequest("GetAsrLmModel");
request.putBodyParameter(KEY_MODEL_ID, modelId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("GetAsrLmModel failed: no response received.");
return null;
}
System.out.println("GetAsrLmModel: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to query model. HTTP status: " + response.getHttpStatus());
return null;
}
JSONObject result = JSONObject.parseObject(response.getData());
String modelJson = result.getString(KEY_MODEL);
LmModel model = JSONObject.parseObject(modelJson, LmModel.class);
return model;
}
/**
* Delete a custom linguistic model.
* @param modelId: Model ID.
* @return: true if deleted successfully.
*/
public boolean deleteAsrLmModel(String modelId) {
CommonRequest request = newRequest("DeleteAsrLmModel");
request.putBodyParameter(KEY_MODEL_ID, modelId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("DeleteAsrLmModel failed: no response received.");
return false;
}
System.out.println("DeleteAsrLmModel: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to delete model. HTTP status: " + response.getHttpStatus());
return false;
}
return true;
}
/**
* List custom linguistic models with pagination.
* @param pageNumber: Optional. Page number (default: 1).
* @param pageSize: Optional. Page size, 10-100 (default: 10).
* @param dataId: Optional. Filter by dataset ID.
* @return: Paginated model list.
*/
public LmModelPage listAsrLmModel(int pageNumber, int pageSize, String dataId) {
CommonRequest request = newRequest("ListAsrLmModel");
request.putBodyParameter(KEY_PAGE_NUMBER, pageNumber);
request.putBodyParameter(KEY_PAGE_SIZE, pageSize);
request.putBodyParameter(KEY_DATA_ID, dataId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("ListAsrLmModel failed: no response received.");
return null;
}
System.out.println("ListAsrLmModel: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to list models. HTTP status: " + response.getHttpStatus());
return null;
}
JSONObject result = JSONObject.parseObject(response.getData());
String pageJson = result.getString(KEY_PAGE);
LmModelPage page = JSONObject.parseObject(pageJson, LmModelPage.class);
return page;
}
public LmModelPage listAsrLmModel() {
return listAsrLmModel(1, 10, null);
}
/**
* Add a training dataset to a custom linguistic model.
* @param dataId: Dataset ID.
* @param modelId: Model ID.
* @return: true if added successfully.
*/
public boolean addDataToAsrLmModel(String dataId, String modelId) {
CommonRequest request = newRequest("AddDataToAsrLmModel");
request.putBodyParameter(KEY_DATA_ID, dataId);
request.putBodyParameter(KEY_MODEL_ID, modelId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("AddDataToAsrLmModel failed: no response received.");
return false;
}
System.out.println("AddDataToAsrLmModel: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to add dataset to model. HTTP status: " + response.getHttpStatus());
return false;
}
return true;
}
/**
* Remove a training dataset from a custom linguistic model.
* @param dataId: Dataset ID.
* @param modelId: Model ID.
* @return: true if removed successfully.
*/
public boolean removeDataFromAsrLmModel(String dataId, String modelId) {
CommonRequest request = newRequest("RemoveDataFromAsrLmModel");
request.putBodyParameter(KEY_DATA_ID, dataId);
request.putBodyParameter(KEY_MODEL_ID, modelId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("RemoveDataFromAsrLmModel failed: no response received.");
return false;
}
System.out.println("RemoveDataFromAsrLmModel: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to remove dataset from model. HTTP status: " + response.getHttpStatus());
return false;
}
return true;
}
/**
* Train a custom linguistic model.
* @param modelId: Model ID.
* @return: true if training started successfully.
*/
public boolean trainAsrLmModel(String modelId) {
CommonRequest request = newRequest("TrainAsrLmModel");
request.putBodyParameter(KEY_MODEL_ID, modelId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("TrainAsrLmModel failed: no response received.");
return false;
}
System.out.println("TrainAsrLmModel: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to train model. HTTP status: " + response.getHttpStatus());
return false;
}
return true;
}
/**
* Deploy a custom linguistic model.
* @param modelId: Model ID.
* @return: true if deployment started successfully.
*/
public boolean deployAsrLmModel(String modelId) {
CommonRequest request = newRequest("DeployAsrLmModel");
request.putBodyParameter(KEY_MODEL_ID, modelId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("DeployAsrLmModel failed: no response received.");
return false;
}
System.out.println("DeployAsrLmModel: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to deploy model. HTTP status: " + response.getHttpStatus());
return false;
}
return true;
}
/**
* Undeploy a custom linguistic model.
* @param modelId: Model ID.
* @return: true if undeployment started successfully.
*/
public boolean undeployAsrLmModel(String modelId) {
CommonRequest request = newRequest("UndeployAsrLmModel");
request.putBodyParameter(KEY_MODEL_ID, modelId);
CommonResponse response = null;
try {
response = client.getCommonResponse(request);
} catch (ClientException e) {
e.printStackTrace();
}
// A failed request leaves response null, so check before dereferencing it.
if (response == null) {
System.out.println("UndeployAsrLmModel failed: no response received.");
return false;
}
System.out.println("UndeployAsrLmModel: " + response.getData());
if (response.getHttpStatus() != 200) {
System.out.println("Failed to undeploy model. HTTP status: " + response.getHttpStatus());
return false;
}
return true;
}
}
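The model-side calls follow the order given in the workflow: create the model, attach datasets, start training, then wait for the status to become `Deployed`. The sketch below illustrates that ordering under stated assumptions. `ModelWorkflowSketch`, `ModelOps`, and `createTrainAndAwait` are hypothetical names; the interface stands in for the `AsrLmModel` methods so that the flow can run without the SDK. The `"Training"` value in `main` is a placeholder, not a documented status.

```java
import java.util.Arrays;
import java.util.List;

public class ModelWorkflowSketch {
    // Limit taken from the "Training datasets per model" row of the limits table.
    private static final int MAX_DATASETS_PER_MODEL = 10;

    /** Hypothetical abstraction that mirrors the AsrLmModel methods used in the workflow. */
    public interface ModelOps {
        String createModel(String name, String baseId, String description);
        boolean addData(String dataId, String modelId);
        boolean train(String modelId);
        String getStatus(String modelId);
    }

    /** Returns the model ID once the status is "Deployed", or null if any step fails. */
    public static String createTrainAndAwait(ModelOps ops, String name, String baseId,
                                             List<String> dataIds, int maxAttempts,
                                             long intervalMillis) {
        if (dataIds.size() > MAX_DATASETS_PER_MODEL) {
            return null;
        }
        String modelId = ops.createModel(name, baseId, "");
        if (modelId == null) {
            return null;
        }
        for (String dataId : dataIds) {
            if (!ops.addData(dataId, modelId)) {
                return null;
            }
        }
        if (!ops.train(modelId)) {
            return null;
        }
        // Training deploys the model automatically, so poll for "Deployed".
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if ("Deployed".equals(ops.getStatus(modelId))) {
                return modelId;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the real API; all return values are illustrative.
        ModelOps fake = new ModelOps() {
            private int polls = 0;
            public String createModel(String name, String baseId, String description) { return "model-1"; }
            public boolean addData(String dataId, String modelId) { return true; }
            public boolean train(String modelId) { return true; }
            public String getStatus(String modelId) { return ++polls < 2 ? "Training" : "Deployed"; }
        };
        String id = createTrainAndAwait(fake, "demo", "base-id", Arrays.asList("data-1"), 3, 10L);
        System.out.println("Deployed model: " + id);
    }
}
```

In real use, each `ModelOps` method would delegate to the corresponding `AsrLmModel` method, with `getStatus` returning the `Status` field from `getAsrLmModel`. Returning `null` on any failed step keeps the sketch short; production code would likely report which step failed.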