Alibaba Cloud Speech Synthesis API reference - Intelligent Speech Interaction (ISI) - Alibaba Cloud Help Center

Speech Synthesis converts input text into binary audio data. This topic provides general information that applies to the SDK documents in this directory.


Billing and concurrency limits

Speech Synthesis offers two billing modes: Trial Edition and Commercial Edition. For more information, see Trial Edition and Commercial Edition. To upgrade from the Trial Edition to the Commercial Edition, see Upgrade from the Trial Edition to the Commercial Edition.
For more information about billing methods, see Billing methods.
For more information about concurrency limits, see Concurrency and QPS.

Features

Supports PCM, WAV, and MP3 audio encoding formats.
Supports adjustments for speech rate, pitch, and volume.
Supports various voices for different scenarios and styles. For more information, see Voice list.
Synthesizes up to 300 characters at a time. A Chinese character, an English letter, a punctuation mark, or a space between words is counted as one character. Text that exceeds 300 characters is truncated.
Supports only UTF-8 encoded text.
Supports multi-emotion voices. For more information, see the <emotion> tag in SSML markup language. Tags are not counted as characters.
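
The character-counting rule above can be checked on the client before a request is sent. The following is a minimal sketch, assuming that each Unicode code point counts as one character and that anything enclosed in angle brackets is an SSML tag. The helper names `count_synthesis_chars` and `fits_single_request` are hypothetical and are not part of any SDK.

```python
import re

MAX_CHARS = 300  # per-request limit stated in this topic

# Assumption: anything between "<" and ">" is an SSML tag and is not counted.
_TAG_RE = re.compile(r"<[^>]+>")


def count_synthesis_chars(text: str) -> int:
    """Count characters as described in this topic, excluding tags."""
    return len(_TAG_RE.sub("", text))


def fits_single_request(text: str) -> bool:
    """Return True if the text would not be truncated at 300 characters."""
    return count_synthesis_chars(text) <= MAX_CHARS
```

A caller could split longer text at sentence boundaries until `fits_single_request` returns True for every piece, instead of letting the service truncate it.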

Note

Word-level phoneme boundaries: When Speech Synthesis outputs audio, it can also return the timestamp for each Chinese character or English word in the audio. You can use this timestamp information to drive the lip movements of a virtual human or create captions for video dubbing. For more information, see Speech Synthesis timestamp feature.
For information about voices for literary scenarios, see API reference.
To use the Android or iOS SDK, see API reference for mobile devices.

Intelligent access to the nearest region

Speech Synthesis supports intelligent access to the nearest region using the domain name nls-gateway.aliyuncs.com.

We recommend that end users connect to the nearest region. The system automatically resolves the domain name to the server in the nearest region based on the client's location. For example, a request from Beijing is routed to a server in the China (Beijing) region, which is equivalent to using the domain name nls-gateway-cn-beijing.aliyuncs.com.

Endpoint

Access type	Description	URL
Public access (defaults to the China (Shanghai) region)	All servers can use the public access URL. The public access URL is set by default in the SDK.	China (Shanghai): `wss://nls-gateway-cn-shanghai.aliyuncs.com/ws/v1` China (Beijing): `wss://nls-gateway-cn-beijing.aliyuncs.com/ws/v1` China (Shenzhen): `wss://nls-gateway-cn-shenzhen.aliyuncs.com/ws/v1`
Internal access from an ECS instance	If you use an Alibaba Cloud ECS instance in the China (Shanghai), China (Beijing), or China (Shenzhen) region, you can use the internal access URL. ECS instances in the classic network cannot access AnyTunnel, so they cannot reach the speech service over the internal network. To use AnyTunnel, create a VPC and access the service from within the VPC. Note: Internal access does not incur public data transfer costs for the ECS instance. For more information about network types, see Network types.	China (Shanghai): `ws://nls-gateway-cn-shanghai-internal.aliyuncs.com:80/ws/v1` China (Beijing): `ws://nls-gateway-cn-beijing-internal.aliyuncs.com:80/ws/v1` China (Shenzhen): `ws://nls-gateway-cn-shenzhen-internal.aliyuncs.com:80/ws/v1`
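
The URLs in the table follow a regular pattern, so they can be built from a region ID. The following is a minimal sketch. The function name `gateway_url` and the region ID strings are illustrative choices taken from the host names above, not SDK identifiers.

```python
# Region IDs as they appear in the host names listed in the table.
REGIONS = ("cn-shanghai", "cn-beijing", "cn-shenzhen")


def gateway_url(region: str = "cn-shanghai", internal: bool = False) -> str:
    """Return the WebSocket URL for a region, public or ECS-internal."""
    if region not in REGIONS:
        raise ValueError(f"unsupported region: {region!r}")
    if internal:
        # Internal access uses plain ws:// on port 80 from within a VPC.
        return f"ws://nls-gateway-{region}-internal.aliyuncs.com:80/ws/v1"
    return f"wss://nls-gateway-{region}.aliyuncs.com/ws/v1"
```

The default mirrors the table: public access in the China (Shanghai) region.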

Interaction flow

Note

The preceding figure shows the interaction flow for WebSocket. For the interaction flow of the RESTful API, see RESTful API.
In addition to the audio stream, the server response header includes the `task_id` parameter, which is the unique identifier for the request.
To play the audio stream returned by the server in real time, use an audio player that supports stream playback, such as FFmpeg, PyAudio (Python), AudioFormat (Java), or MediaSource (JavaScript).

Authentication
When the client establishes a WebSocket connection with the server, it uses a token for authentication. For more information about how to obtain a token, see Obtain a token.

Start synthesis

The client sends a speech synthesis request. The following table describes the request parameters.

Parameter	Type	Required	Description
appkey	String	Yes	The AppKey of the project that you created in the console.
text	String	Yes	The text to synthesize. The text must be UTF-8 encoded and cannot exceed 300 characters. Add a space between English words. Note: To use the multi-emotion feature of a voice, add the <emotion> SSML tag to the text. For more information, see <emotion>. If you use the <emotion> tag with a voice that does not support multiple emotions, the `Illegal ssml text` error is reported.
voice	String	No	The voice that is used for synthesis. Default value: `xiaoyun`.
format	String	No	The audio encoding format. Valid values: `pcm`, `wav`, and `mp3`. Default value: `pcm`.
sample_rate	Integer	No	The audio sample rate. Default value: 16000 Hz.
volume	Integer	No	The volume. Valid values: 0 to 100. Default value: 50.
speech_rate	Integer	No	The speech rate. Valid values: -500 to 500. Default value: 0. The values -500, 0, and 500 correspond to speed multipliers of 0.5, 1.0, and 2.0. -500 indicates 0.5 times the default speed. 0 indicates the default speed (1.0×), which is the synthesis speed of the model output; it varies slightly by voice and is approximately four characters per second. 500 indicates 2.0 times the default speed. To calculate the value for a target multiplier: for 0.8× speed, (1 - 1/0.8) / 0.002 = -125; for 1.2× speed, (1 - 1/1.2) / 0.001 ≈ 166. Note: For speeds less than 1.0×, a coefficient of 0.002 is used. For speeds greater than 1.0×, a coefficient of 0.001 is used. The result is an approximate value.
pitch_rate	Integer	No	The pitch. Valid values: -500 to 500. Default value: 0.
enable_subtitle	Boolean	No	Enables word-level timestamps. For more information, see Speech Synthesis timestamp feature.
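
The `speech_rate` formula and the value ranges in the table can be expressed as a small helper. The following is a sketch under stated assumptions: the function names are hypothetical, and truncation toward zero is assumed because it reproduces the 1.2× example (166). The table does not say how the service rounds, nor how these fields are wrapped in the WebSocket message, so the sketch only collects and validates them.

```python
from fractions import Fraction
from typing import Any, Dict


def speech_rate_from_multiplier(multiplier: float) -> int:
    """Convert a speed multiplier in [0.5, 2.0] to a speech_rate value."""
    m = Fraction(str(multiplier))  # exact arithmetic avoids float drift
    if not Fraction(1, 2) <= m <= 2:
        raise ValueError("multiplier must be between 0.5 and 2.0")
    # Dividing by 0.002 (below 1.0x) or 0.001 (above) equals scaling by 500 or 1000.
    scale = 500 if m < 1 else 1000
    return int((1 - 1 / m) * scale)  # assumption: truncate toward zero


def build_synthesis_params(appkey: str, text: str, voice: str = "xiaoyun",
                           audio_format: str = "pcm", sample_rate: int = 16000,
                           volume: int = 50, speech_rate: int = 0,
                           pitch_rate: int = 0) -> Dict[str, Any]:
    """Validate the documented ranges and return the parameters as a dict."""
    if audio_format not in ("pcm", "wav", "mp3"):
        raise ValueError("format must be pcm, wav, or mp3")
    if not 0 <= volume <= 100:
        raise ValueError("volume must be between 0 and 100")
    for name, value in (("speech_rate", speech_rate), ("pitch_rate", pitch_rate)):
        if not -500 <= value <= 500:
            raise ValueError(f"{name} must be between -500 and 500")
    return {
        "appkey": appkey, "text": text, "voice": voice, "format": audio_format,
        "sample_rate": sample_rate, "volume": volume,
        "speech_rate": speech_rate, "pitch_rate": pitch_rate,
    }
```

For example, `build_synthesis_params("my-appkey", "hello", speech_rate=speech_rate_from_multiplier(0.8))` yields a dict whose `speech_rate` is -125, where `"my-appkey"` is a placeholder.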

Receive synthesized data
The server returns the synthesized audio as binary data. The SDK receives and processes the binary data.
End synthesis
After the synthesis is complete, the server sends a SynthesisCompleted event notification. The following code provides an example.
```
{
    "header": {
        "message_id": "05450bf69c53413f8d88aed1ee60****",
        "task_id": "640bc797bb684bd6960185651307****",
        "namespace": "SpeechSynthesizer",
        "name": "SynthesisCompleted",
        "status": 20000000,
        "status_message": "GATEWAY|SUCCESS|Success."
    }
}
```
Note
The examples in the documentation save the synthesized audio to a file. To play the audio in real time with low latency, we recommend that you use stream playback. This lets you play the audio data as you receive it, which reduces delay.
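
When `format` is `pcm`, a saved file contains raw samples with no header, which many players cannot open directly. The following sketch wraps PCM bytes in a WAV container by using the Python standard library. It assumes 16-bit mono samples, which this topic does not state; adjust `channels` and `sample_width` if your audio differs. The function name `pcm_to_wav_bytes` is hypothetical.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)      # assumption: mono
        wav_file.setsampwidth(sample_width)  # assumption: 2 bytes = 16-bit
        wav_file.setframerate(sample_rate)   # must match the sample_rate parameter
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

The `sample_rate` argument should match the value sent in the synthesis request; otherwise playback speed and pitch will be wrong.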

Handle synthesis failures

If the synthesis task fails, you will receive a TaskFailed notification. The following code provides an example. After you receive a TaskFailed notification, the underlying connection is closed.

```
{
   "header":{
      "namespace":"Default",
      "name":"TaskFailed",
      "status":41020001,
      "message_id":"62c126f7d9b340deb82b5b7eaca0****",
      "task_id":"4552df26d1f547aab9a2c4a94678****",
      "status_text":"TTS:TtsClientError:[tts]Engine return error code: 418"
   }
}
```
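
A client that talks to the WebSocket endpoint without an SDK has to tell audio data apart from event notifications. The following is a minimal sketch, assuming that audio arrives as binary frames and that events such as SynthesisCompleted and TaskFailed arrive as JSON text frames with the `header` layout shown in the examples. The names `handle_frame` and `SynthesisFailed` are hypothetical.

```python
import json
from typing import Union


class SynthesisFailed(Exception):
    """Raised when the server reports a TaskFailed event."""

    def __init__(self, status, status_text, task_id):
        super().__init__(f"{status}: {status_text} (task_id={task_id})")
        self.status = status
        self.status_text = status_text
        self.task_id = task_id


def handle_frame(frame: Union[bytes, str], audio: bytearray) -> bool:
    """Process one frame. Return True once synthesis is complete."""
    if isinstance(frame, (bytes, bytearray)):
        audio.extend(frame)  # binary frame: synthesized audio data
        return False
    header = json.loads(frame).get("header", {})
    name = header.get("name")
    if name == "SynthesisCompleted":
        return True
    if name == "TaskFailed":
        # The examples use status_text here and status_message on success.
        text = header.get("status_text") or header.get("status_message")
        raise SynthesisFailed(header.get("status"), text, header.get("task_id"))
    return False  # other events are ignored in this sketch
```

Logging `task_id` from the exception makes it easier to trace a failed request, because the server closes the connection after TaskFailed.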

Voice list

Name	voice parameter value	Type	Scenario	Supported languages	Supported sample rates (Hz)	Supports word/sentence-level timestamps	Supports retroflex finals	Voice quality
Abin	abin	Mandarin with Cantonese accent	Conversational digital human	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	No	No	Standard Edition
Zhixiaobai	zhixiaobai	Mandarin female voice	Conversational digital human	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	No	Yes	Standard Edition
Zhixiaoxia	zhixiaoxia	Mandarin female voice	Conversational digital human	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	No	Yes	Standard Edition
Zhixiaomei	zhixiaomei	Mandarin female voice	Live streaming digital human	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K	Yes	Yes	Standard Edition
Zhigui	zhigui	Mandarin female voice	Live streaming digital human	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Zhishuo	zhishuo	Mandarin male voice	Customer service digital human	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Aixia	aixia	Mandarin female voice	Customer service digital human	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Cally	cally	American English female voice	Spoken English conversational digital human	English-only scenarios	8K/16K	Yes	Yes	Standard Edition
Zhifeng_emo	zhifeng_emo	Multi-emotion male voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K/24K	Yes	Yes	Standard Edition
Zhibing_emo	zhibing_emo	Multi-emotion male voice	General scenarios	Chinese-only scenarios	8K/16K/24K	Yes	Yes	Standard Edition
Zhimiao_emo	zhimiao_emo	Multi-emotion female voice	Chinese-English scenarios	Chinese and English scenarios	8K/16K	Yes	Yes	Standard Edition
Zhimi_emo	zhimi_emo	Multi-emotion female voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Zhiyan_emo	zhiyan_emo	Multi-emotion female voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Zhibei_emo	zhibei_emo	Multi-emotion child voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Zhitian_emo	zhitian_emo	Multi-emotion female voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Xiaoyun	xiaoyun	Standard female voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	No	No	Lite Edition
Xiaogang	xiaogang	Standard male voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	No	No	Lite Edition
Ruoxi	ruoxi	Gentle female voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K/24K	No	No	Standard Edition
Siqi	siqi	Gentle female voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
Sijia	sijia	Standard female voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K/24K	No	No	Standard Edition
Sicheng	sicheng	Standard male voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
Aiqi	aiqi	Gentle female voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Aijia	aijia	Standard female voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Aicheng	aicheng	Standard male voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Aida	aida	Standard male voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Ninger	ninger	Standard female voice	General scenarios	Chinese-only scenarios	8K/16K/24K	No	No	Standard Edition
Ruilin	ruilin	Standard female voice	General scenarios	Chinese-only scenarios	8K/16K/24K	No	No	Standard Edition
Siyue	siyue	Gentle female voice	Customer service scenarios	Chinese and Chinese-English mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
Aiya	aiya	Stern female voice	Customer service scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Aimei	aimei	Sweet female voice	Customer service scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Aiyu	aiyu	Natural female voice	Customer service scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Aiyue	aiyue	Gentle female voice	Customer service scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Aijing	aijing	Stern female voice	Customer service scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Xiaomei	xiaomei	Sweet female voice	Customer service scenarios	Chinese and Chinese-English mixed scenarios	8K/16K/24K	No	No	Standard Edition
Aina	aina	Female voice with Zhejiang accent	Customer service scenarios	Chinese-only scenarios	8K/16K	Yes	No	Standard Edition
Yina	yina	Female voice with Zhejiang accent	Customer service scenarios	Chinese-only scenarios	8K/16K/24K	No	No	Standard Edition
Sijing	sijing	Stern female voice	Customer service scenarios	Chinese-only scenarios	8K/16K/24K	Yes	No	Standard Edition
Sitong	sitong	Child voice	Child voice scenarios	Chinese-only scenarios	8K/16K/24K	No	No	Standard Edition
Xiaobei	xiaobei	Lolita female voice	Child voice scenarios	Chinese-only scenarios	8K/16K/24K	Yes	No	Standard Edition
Aitong	aitong	Child voice	Child voice scenarios	Chinese-only scenarios	8K/16K	Yes	No	Standard Edition
Aiwei	aiwei	Lolita female voice	Child voice scenarios	Chinese-only scenarios	8K/16K	Yes	No	Standard Edition
Aibao	aibao	Lolita female voice	Child voice scenarios	Chinese-only scenarios	8K/16K	Yes	No	Standard Edition
Harry	harry	British English male voice	English scenarios	English scenarios	8K/16K	No	No	Standard Edition
Abby	abby	American English female voice	English scenarios	English scenarios	8K/16K	Yes	No	Standard Edition
Andy	andy	American English male voice	English scenarios	English scenarios	8K/16K	Yes	No	Standard Edition
Eric	eric	British English male voice	English scenarios	English scenarios	8K/16K	Yes	No	Standard Edition
Emily	emily	British English female voice	English scenarios	English scenarios	8K/16K	Yes	No	Standard Edition
Luna	luna	British English female voice	English scenarios	English scenarios	8K/16K	Yes	No	Standard Edition
Luca	luca	British English male voice	English scenarios	English scenarios	8K/16K	Yes	No	Standard Edition
Wendy	wendy	British English female voice	English scenarios	English scenarios	8K/16K/24K	No	No	Standard Edition
William	william	British English male voice	English scenarios	English scenarios	8K/16K/24K	No	No	Standard Edition
Olivia	olivia	British English female voice	English scenarios	English scenarios	8K/16K/24K	No	No	Standard Edition
Shanshan	shanshan	Cantonese female voice	Dialect scenarios	Standard Cantonese (Simplified) and Cantonese-English mixed scenarios	8K/16K/24K	No	No	Standard Edition
Aiyuan	aiyuan	Confidante	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aiying	aiying	Cute and soft child voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aixiang	aixiang	Magnetic male voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aimo	aimo	Emotional male voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aiye	aiye	Young male voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aiting	aiting	Radio-style female voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aifan	aifan	Emotional female voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Lydia	lydia	Bilingual English-Chinese female voice	English scenarios	English and English-Chinese mixed scenarios	8K/16K	Yes	No	Standard Edition
Xiaoyue	chuangirl	Sichuanese female voice	Dialect scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	No	No	Standard Edition
Aishuo	aishuo	Natural male voice	Customer service scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Qingqing	qingqing	Taiwan (China) Mandarin female voice	Dialect scenarios	Chinese-only scenarios	8K/16K	No	No	Standard Edition
Cuijie	cuijie	Northeastern Mandarin female voice	Dialect scenarios	Chinese-only scenarios	8K/16K	Yes	Yes	Standard Edition
Xiaoze	xiaoze	Male voice with a strong Hunan accent	Dialect scenarios	Chinese-only scenarios	8K/16K	No	No	Standard Edition
Ainan	ainan	Advertisement-style male voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aihao	aihao	News-style male voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aiming	aiming	Humorous male voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aixiao	aixiao	News-style female voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aichu	aichu	Food documentary-style male voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Aiqian	aiqian	News-style female voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Zhi Xiang	tomoka	Japanese female voice	Multilingual scenarios	Japanese-only scenarios	8K/16K	Yes	No	Standard Edition
Tomoya	tomoya	Japanese male voice	Multilingual scenarios	Japanese-only scenarios	8K/16K	Yes	No	Standard Edition
Annie	annie	American English female voice	English scenarios	English-only scenarios	8K/16K	Yes	No	Standard Edition
Aishu	aishu	News-style male voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Airu	airu	Newscast female voice	Literary scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Premium Edition
Jiajia	jiajia	Cantonese female voice	Dialect scenarios	Standard Cantonese (Simplified) and Cantonese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Indah	indah	Indonesian female voice	Multilingual scenarios	Indonesian-only scenarios	8K/16K	No	No	Standard Edition
Peach	taozi	Cantonese female voice	Dialect scenarios	Supports Standard Cantonese (Simplified) and Cantonese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Guijie	guijie	Friendly female voice	General scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Stella	stella	Intellectual female voice	General scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Stanley	stanley	Calm male voice	General scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Kenny	kenny	Calm male voice	General scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Rosa	rosa	Natural female voice	General scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Farah	farah	Malay female voice	Multilingual scenarios	Malay-only scenarios	8K/16K	No	No	Standard Edition
Mashu	mashu	Children's drama male voice	General scenarios	General scenarios	8K/16K	Yes	No	Standard Edition
Zhiqi	zhiqi	Gentle female voice	Ultra-high definition scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	Yes	No	Premium Edition
Zhichu	zhichu	Food documentary-style male voice	Ultra-high definition scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	Yes	Yes	Premium Edition
Xiaoxian	xiaoxian	Friendly female voice	Live streaming scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Yuer	yuer	Children's drama female voice	General scenarios	Chinese-only scenarios	8K/16K	Yes	No	Standard Edition
Maoxiaomei	maoxiaomei	Energetic female voice	Live streaming scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Zhixiang	zhixiang	Magnetic male voice	Ultra-high definition scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	Yes	No	Premium Edition
Zhijia	zhijia	Standard female voice	Ultra-high definition scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	Yes	No	Premium Edition
Zhinan	zhinan	Advertisement-style male voice	Ultra-high definition scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	Yes	No	Premium Edition
Zhiqian	zhiqian	News-style female voice	Ultra-high definition scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	Yes	No	Premium Edition
Zhiru	zhiru	Newscast female voice	Ultra-high definition scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	Yes	No	Premium Edition
Zhide	zhide	Newscast male voice	Ultra-high definition scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K/24K/48K	Yes	No	Premium Edition
Zhifei	zhifei	Passionate commentary voice	Ultra-high definition scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Premium Edition
Aifei	aifei	Passionate commentary voice	Live streaming scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Yaqun	yaqun	Store broadcast voice	Live streaming scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Qiaowei	qiaowei	Store broadcast voice	Live streaming scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Dahu	dahu	Northeastern Mandarin male voice	Dialect scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
ava	ava	American girl	English scenarios	English-only scenarios	8K/16K	Yes	No	Standard Edition
Zhilun	zhilun	Suspense commentary voice	Ultra-high definition scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Premium Edition
Ailun	ailun	Suspense commentary voice	Live streaming scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	Yes	Standard Edition
Jielidou	jielidou	Soothing child voice	Child voice scenarios	Chinese-only scenarios	8K/16K	Yes	Yes	Standard Edition
Zhiwei	zhiwei	Lolita female voice	Ultra-high definition scenarios	Chinese-only scenarios	8K/16K/24K/48K	Yes	No	Premium Edition
Laotie	laotie	Buddy from Northeast China	Live streaming scenarios	Chinese-only scenarios	8K/16K	Yes	Yes	Standard Edition
Laomei	laomei	Female hawker voice	Live streaming scenarios	Chinese-only scenarios	8K/16K	Yes	Yes	Standard Edition
Aikan	aikan	Tianjin dialect male voice	Dialect scenarios	Chinese-only scenarios	8K/16K	Yes	Yes	Standard Edition
Tala	tala	Filipino female voice	Multilingual scenarios	Filipino-only scenarios	8K/16K	No	No	Standard Edition
Zhitian	zhitian	Sweet female voice	General scenarios	Supports Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Premium Edition
Zhiqing	zhiqing	A girl speaking a dialect from Taiwan (China)	Dialect scenarios	Chinese-only scenarios	8K/16K	Yes	No	Premium Edition
Tien	tien	Vietnamese female voice	Multilingual scenarios	Vietnamese-only scenarios	8K/16K	No	No	Standard Edition
Becca	becca	American English customer service female voice	American English	English-only scenarios	8K/16K	No	No	Standard Edition
Kyong	Kyong	Korean female voice	Korean scenarios	Korean	8K/16K	No	No	Standard Edition
masha	masha	Russian female voice	Russian scenarios	Russian	8K/16K	No	No	Standard Edition
camila	camila	Spanish female voice	Spanish scenarios	Spanish	8K/16K	No	No	Standard Edition
perla	perla	Italian female voice	Italian scenarios	Italian	8K/16K	No	No	Standard Edition
Zhimao	zhimao	Mandarin female voice	Live streaming scenarios	Chinese	8K/16K	Yes	No	Standard Edition
Zhiyuan	zhiyuan	Mandarin female voice	General scenarios	Chinese	8K/16K	Yes	No	Standard Edition
Zhiya	zhiya	Mandarin female voice	Customer service scenarios	Chinese	8K/16K	Yes	No	Standard Edition
Zhiyue	zhiyue	Mandarin female voice	General scenarios	Chinese	8K/16K	Yes	No	Standard Edition
Zhida	zhida	Mandarin male voice	General scenarios	Chinese and Chinese-English mixed scenarios	8K/16K	Yes	No	Standard Edition
Zhistella	zhistella	Mandarin female voice	General scenarios	Chinese	8K/16K	Yes	No	Standard Edition
Kelly	kelly	Hong Kong Cantonese female voice	Dialect scenarios	Hong Kong Cantonese	8K/16K	Yes	No	Standard Edition
clara	clara	French female voice	General scenarios	French	8K/16K	No	No	Standard Edition
hanna	hanna	German female voice	General scenarios	German	8K/16K	No	No	Standard Edition
waan	waan	Thai female voice	General scenarios	Thai	8K/16K	No	No	Standard Edition
betty	betty	American English female voice	General scenarios	American English	8K/16K	Yes	No	Standard Edition
beth	beth	American English female voice	General scenarios	American English	8K/16K	Yes	No	Standard Edition
cindy	cindy	American English female voice	General scenarios	American English	8K/16K	Yes	No	Standard Edition
donna	donna	American English female voice	General scenarios	American English	8K/16K	Yes	No	Standard Edition
eva	eva	American English female voice	General scenarios	American English	8K/16K	Yes	No	Standard Edition
brian	brian	American English male voice	General scenarios	American English	8K/16K	Yes	No	Standard Edition
david	david	American English male voice	General scenarios	American English	8K/16K/24K	Yes	No	Standard Edition
abby_ecmix	abby_ecmix	American English female voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
annie_ecmix	annie_ecmix	American English female voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
andy_ecmix	andy_ecmix	American English male voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
ava_ecmix	ava_ecmix	American English female voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
betty_ecmix	betty_ecmix	American English female voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
beth_ecmix	beth_ecmix	American English female voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
brian_ecmix	brian_ecmix	American English male voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
cindy_ecmix	cindy_ecmix	American English female voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
cally_ecmix	cally_ecmix	American English female voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
donna_ecmix	donna_ecmix	American English female voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
david_ecmix	david_ecmix	American English male voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
eva_ecmix	eva_ecmix	American English female voice	General scenarios	English and English-Chinese mixed scenarios	8K/16K/24K	Yes	No	Standard Edition
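
Because supported sample rates differ by voice, a client can check the `sample_rate` parameter against the voice list before sending a request. The following sketch parses the "Supported sample rates (Hz)" column and checks a rate for three voices copied from the table. The helper names and the `VOICE_SAMPLE_RATES` mapping are hypothetical; extend the mapping with the voices you use.

```python
from typing import Dict, List

# A few rows copied from the voice list: voice parameter value -> table cell.
VOICE_SAMPLE_RATES: Dict[str, str] = {
    "xiaoyun": "8K/16K",
    "siqi": "8K/16K/24K",
    "zhiqi": "8K/16K/24K/48K",
}


def parse_sample_rates(cell: str) -> List[int]:
    """Turn a table cell such as "8K/16K/24K" into a list of rates in Hz."""
    return [int(part.strip().upper().rstrip("K")) * 1000
            for part in cell.split("/")]


def supports_sample_rate(voice: str, sample_rate: int) -> bool:
    """Return True if the voice is known here and lists the sample rate."""
    cell = VOICE_SAMPLE_RATES.get(voice)
    return cell is not None and sample_rate in parse_sample_rates(cell)
```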

Support for multi-emotion voices

Only multi-emotion voices support emotion selection, and the supported emotion categories vary by voice. The main categories are neutral, happy, angry, sad, fear, hate, surprise, arousal, serious, disgust, jealousy, embarrassed, frustrated, affectionate, gentle, newscast, customer-service, story, and living. The following table lists the emotions that each voice supports.

Voice name	voice parameter value	Emotion category
Zhifeng_emo	zhifeng_emo	angry, fear, happy, neutral, sad, surprise
Zhibing_emo	zhibing_emo	angry, fear, happy, neutral, sad, surprise
Zhimiao_emo	zhimiao_emo	serious, sad, disgust, jealousy, embarrassed, happy, fear, surprise, neutral, frustrated, affectionate, gentle, angry, newscast, customer-service, story, living
Zhimi_emo	zhimi_emo	angry, fear, happy, hate, neutral, sad, surprise
Zhiyan_emo	zhiyan_emo	neutral, happy, angry, sad, fear, hate, surprise, arousal
Zhibei_emo	zhibei_emo	neutral, happy, angry, sad, fear, hate, surprise
Zhitian_emo	zhitian_emo	neutral, happy, angry, sad, fear, hate, surprise
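
Using an emotion with a voice that does not support it leads to an error, so it can help to validate the combination on the client. The following sketch encodes the table above as a lookup. The names `EMOTIONS_BY_VOICE` and `supports_emotion` are hypothetical, and the sketch does not build the `<emotion>` tag itself because its syntax is described in the SSML topic rather than here.

```python
from typing import Dict, FrozenSet

_BASIC = frozenset({"angry", "fear", "happy", "neutral", "sad", "surprise"})
_WITH_HATE = _BASIC | {"hate"}

# Emotion categories per voice, copied from the table in this topic.
EMOTIONS_BY_VOICE: Dict[str, FrozenSet[str]] = {
    "zhifeng_emo": _BASIC,
    "zhibing_emo": _BASIC,
    "zhimiao_emo": _BASIC | {
        "serious", "disgust", "jealousy", "embarrassed", "frustrated",
        "affectionate", "gentle", "newscast", "customer-service",
        "story", "living",
    },
    "zhimi_emo": _WITH_HATE,
    "zhiyan_emo": _WITH_HATE | {"arousal"},
    "zhibei_emo": _WITH_HATE,
    "zhitian_emo": _WITH_HATE,
}


def supports_emotion(voice: str, category: str) -> bool:
    """Return True if the voice lists the emotion category in the table."""
    return category in EMOTIONS_BY_VOICE.get(voice, frozenset())
```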

Service status codes

Each service response contains a `status` field, which is the service status code. The following tables describe the status codes.

General-purpose error codes

Status code	Status message	Cause	Solution
40000000	The default client error code. This code corresponds to multiple error messages.	Invalid parameters or call logic was used.	Compare your code with the sample code in the official documentation to test and verify it.
40000001	The token 'xxx' has expired. The token 'xxx' is invalid	Invalid parameters or call logic was used. This is a general-purpose client error code that usually indicates an incorrect token, such as an expired or invalid token.	Compare your code with the sample code in the official documentation to test and verify it.
40000002	Gateway:MESSAGE_INVALID:Can't process message in state'FAILED'!	The message is invalid or incorrect.	Compare your code with the sample code in the official documentation to test and verify it.
40000003	PARAMETER_INVALID Failed to decode url params	The parameters passed by the user are incorrect. This error is common for RESTful API calls.	Compare your code with the sample code in the official documentation to test and verify it.
40000005	Gateway:TOO_MANY_REQUESTS:Too many requests!	Too many concurrent requests.	If you are using the Trial Edition, upgrade to the Commercial Edition to increase the concurrency. If you are already using the Commercial Edition, purchase a concurrency resource plan to increase your concurrency quota.
40000009	Invalid wav header!	The message header is invalid.	If you send a WAV audio file and set the `format` parameter to `wav`, check whether the WAV header of the audio file is correct. If the header is incorrect, the server may reject the request.
40000009	Too large wav header!	The WAV header of the transmitted audio is invalid.	You can send the audio stream in a format such as PCM or OPUS. If you use the WAV format, make sure that the WAV header of the audio file contains the correct data length.
40000010	Gateway:FREE_TRIAL_EXPIRED:The free trial has expired!	The trial period has ended, and the commercial version is not activated or your account has an overdue payment.	You can log on to the console to check the service activation status and your account balance.
40010001	Gateway:NAMESPACE_NOT_FOUND:RESTful url path illegal	The operation or parameter is not supported.	Check whether the parameters passed in the call are consistent with the requirements in the official documentation. You can compare them with the error message to identify and set the correct parameters. For example, if you are using a curl command to make a RESTful API request, check whether the URL you constructed is valid.
40010003	Gateway:DIRECTIVE_INVALID:[xxx]	A general-purpose client-side error code.	This error indicates that the client passed an incorrect parameter or instruction. Detailed error messages are available for different operations. You can refer to the corresponding documentation to set the parameters correctly.
40010004	Gateway:CLIENT_DISCONNECT:Client disconnected before task finished!	The client actively terminated the connection before the request was processed.	No action is required. To avoid this error, close the connection only after the server responds.
40010005	Gateway:TASK_STATE_ERROR:Got stop directive while task is stopping!	The client sent a message instruction that is not currently supported.	Compare your code with the sample code in the official documentation to test and verify it.
40020105	Meta:APPKEY_NOT_EXIST:Appkey not exist!	A non-existent Appkey was used.	Confirm whether a non-existent Appkey was used. You can log on to the console and view the project configuration to find the Appkey.
40020106	Meta:APPKEY_UID_MISMATCH:Appkey and user mismatch!	The Appkey and token passed in the call were not created by the same Alibaba Cloud account UID. This causes a mismatch.	Check whether you are using resources from two different accounts. Do not use an Appkey from Account A with a token generated from Account B.
403	Forbidden	The token is invalid. For example, the token does not exist or has expired.	Set a valid token. Tokens have an expiration period. You must obtain a new token before the current one expires.
41000003	MetaInfo doesn't have end point info	Failed to retrieve the routing information for this Appkey.	Check whether you are using resources from two different accounts. Do not use an Appkey from Account A with a token generated from Account B.
41010101	UNSUPPORTED_SAMPLE_RATE	The sample rate is not supported.	Real-time speech recognition currently supports only audio with a sample rate of 8000 Hz or 16000 Hz.
41040201	Realtime:GET_CLIENT_DATA_TIMEOUT:Client data does not send continuously!	Failed to retrieve data from the client due to a timeout.	When you call real-time speech recognition, the client must send data at a real-time rate and close the connection promptly after the data is sent.
50000000	GRPC_ERROR:Grpc error!	An exception caused by factors such as machine load or network issues. This error usually occurs randomly.	You can retry the call to resolve the issue.
50000001	GRPC_ERROR:Grpc error!	An exception caused by factors such as machine load or network issues. This error usually occurs randomly.	You can retry the call to resolve the issue.
52010001	GRPC_ERROR:Grpc error!	An exception caused by factors such as machine load or network issues. This error usually occurs randomly.	You can retry the call to resolve the issue.
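
The tables suggest a simple handling policy: codes that start with 4 point to a client-side problem that must be fixed, and codes that start with 5 are usually transient and can be retried. The following sketch encodes that policy. The leading-digit rule is an inference from the codes listed here, not a documented guarantee, and the function names are hypothetical. The success value 20000000 is taken from the SynthesisCompleted example.

```python
SUCCESS = 20000000  # status value in the SynthesisCompleted example


def classify_status(status: int) -> str:
    """Classify a status code as success, client_error, server_error, or unknown."""
    if status == SUCCESS:
        return "success"
    if status == 403:  # the table also lists HTTP-style 403 for invalid tokens
        return "client_error"
    leading = status // 10_000_000  # first digit of an 8-digit code
    if leading == 4:
        return "client_error"
    if leading == 5:
        return "server_error"
    return "unknown"


def should_retry(status: int) -> bool:
    """Retry only codes whose documented solution is to retry the call."""
    return classify_status(status) == "server_error"
```

A caller would typically combine `should_retry` with a bounded number of attempts and a short delay between them.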

Speech synthesis/Long-text speech synthesis error codes

Status code	Status message	Cause	Solution
40000001	Gateway:ACCESS_DENIED:No privilege to this voice!	An incorrect voice name was set.	Refer to the voice list in the official documentation and set a valid voice.
40000004	Gateway:IDLE_TIMEOUT:Websocket session is idle for too long time,the last directive is 'StartSynthesis'!	After a connection is established, the server returns this error message if no data is sent for more than 10 seconds.	Close the connection promptly after the request is processed. This error may also occur if the server is under high instantaneous pressure and cannot return data in time. In this case, you can retry the request to resolve the issue.
40010003	Gateway:DIRECTIVE_INVALID:No text specified!	No valid text for synthesis was set.	You can refer to the sample code in the official documentation to set the text for synthesis.
41020001	Speech synthesis client error	Multiple error messages may be returned. Adjust your code based on the specific error message.	If the message `Engine return error code: 424.` is returned, the background music or concatenated recording does not conform to the required format. Set the correct background music as described in the documentation. If the message `Engine return error code:418` is returned, an unsupported voice name was passed. If the message `Engine return error code: 413` is returned, the SSML format is incorrect. If the message `Request json illegal,failed to parse request.` is returned, the request JSON is invalid. If the message `SSML text length should be less than 300.` is returned, the text is too long; use the long-text speech synthesis operation instead.
51020001	TTS:TtsServerError	An exception caused by factors such as machine load or network issues. This error usually occurs randomly.	You can retry the call to resolve the issue.
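
For status code 41020001, the useful detail is the engine error code embedded in the message text. The following sketch extracts that number and maps it to the causes listed in the table. The names `engine_error_code`, `explain_engine_error`, and `ENGINE_HINTS` are hypothetical, and the regular expression assumes the message wording shown in this topic.

```python
import re
from typing import Optional

# Causes copied from the 41020001 row of the table above.
ENGINE_HINTS = {
    424: "background music or concatenated recording is not in the required format",
    418: "an unsupported voice name was passed",
    413: "the SSML format is incorrect",
}

# Matches both "error code: 418" and "error code:418".
_ENGINE_RE = re.compile(r"Engine return error code:\s*(\d+)")


def engine_error_code(status_text: str) -> Optional[int]:
    """Extract the engine error code from a TaskFailed message, if present."""
    match = _ENGINE_RE.search(status_text)
    return int(match.group(1)) if match else None


def explain_engine_error(status_text: str) -> str:
    """Return a short hint for known engine codes, else the original text."""
    code = engine_error_code(status_text)
    return ENGINE_HINTS.get(code, status_text)
```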