Speech Synthesis converts input text into binary audio data. This topic provides general information that applies to the SDK documents in this directory.
Billing and concurrency limits
Speech Synthesis offers two billing modes: Trial Edition and Commercial Edition. For more information, see Trial Edition and Commercial Edition. To upgrade from the Trial Edition to the Commercial Edition, see Upgrade from the Trial Edition to the Commercial Edition.
For more information about billing methods, see Billing methods.
For more information about concurrency limits, see Concurrency and QPS.
Features
Supports PCM, WAV, and MP3 audio encoding formats.
Supports adjustments for speech rate, pitch, and volume.
Supports various voices for different scenarios and styles. For more information, see Voice list.
Synthesizes up to 300 characters at a time. A Chinese character, an English letter, a punctuation mark, or a space between words is counted as one character. Text that exceeds 300 characters is truncated.
Supports only UTF-8 encoded text.
Supports multi-emotion voices. For more information, see the <emotion> tag in SSML markup language. Tags are not counted as characters.
Word-level phoneme boundaries: When Speech Synthesis outputs audio, it can also return the timestamp for each Chinese character or English word in the audio. You can use this timestamp information to drive the lip movements of a virtual human or create captions for video dubbing. For more information, see Speech Synthesis timestamp feature.
For information about voices for literary scenarios, see API reference.
To use the Android or iOS SDK, see API reference for mobile devices.
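The 300-character limit described above can be checked on the client before a request is sent, so that long text is split rather than silently truncated. The following is a minimal sketch, not part of the SDK. It assumes that each Unicode code point counts as one character and that anything of the form `<...>` is an SSML tag that is not counted. The function names `count_characters` and `split_text` are hypothetical, and cutting at sentence punctuation is a choice made for this sketch only.

```python
import re
from typing import List

MAX_CHARS = 300  # per-request limit stated in the feature list

# Assumption: anything of the form <...> is an SSML tag and is not counted.
_TAG_RE = re.compile(r"<[^>]+>")


def count_characters(text: str) -> int:
    """Count one character per code point, excluding tags."""
    return len(_TAG_RE.sub("", text))


def split_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split plain (tag-free) text into chunks of at most `limit` characters.

    Prefers to cut right after sentence-ending punctuation and falls back
    to a hard cut when the window contains none.
    """
    chunks = []
    rest = text
    while len(rest) > limit:
        window = rest[:limit]
        # Index of the last sentence boundary inside the window, or -1.
        cut = max(window.rfind(p) for p in "。！？.!?")
        cut = cut + 1 if cut >= 0 else limit
        chunks.append(rest[:cut])
        rest = rest[cut:]
    if rest:
        chunks.append(rest)
    return chunks
```

Each chunk can then be sent as a separate synthesis request. `split_text` does not handle text that contains tags, because a cut could land inside a tag.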
Intelligent access to the nearest region
Speech Synthesis supports intelligent access to the nearest region using the domain name nls-gateway.aliyuncs.com.
We recommend that end users connect to the nearest region. The system automatically resolves the domain name to the server in the nearest region based on the client's location. For example, a request from Beijing is routed to a server in the China (Beijing) region, which is equivalent to using the domain name nls-gateway-cn-beijing.aliyuncs.com.
Endpoint
Access type | Description | URL |
Public access (defaults to the China (Shanghai) region) | All servers can use the public access URL. The public access URL is set by default in the SDK. | |
Internal access from an ECS instance | If you use an Alibaba Cloud ECS instance in the China (Shanghai), China (Beijing), or China (Shenzhen) region, you can use the internal access URL. Note: ECS instances in the classic network cannot access AnyTunnel. This means you cannot access Voice Service over the internal network from such instances. To use AnyTunnel, create a VPC and access the service from within the VPC. | |
Interaction flow
The preceding figure shows the interaction flow for WebSocket. For the interaction flow of the RESTful API, see RESTful API.
In addition to the audio stream, the server response header includes the `task_id` parameter, which is the unique identifier for the request.
To play the audio stream returned by the server in real time, use an audio player that supports stream playback, such as FFmpeg, PyAudio (Python), AudioFormat (Java), or MediaSource (JavaScript).
Authentication
When the client establishes a WebSocket connection with the server, it uses a token for authentication. For more information about how to obtain a token, see Obtain a token.
Start synthesis
The client sends a speech synthesis request. The following table describes the request parameters.
Parameter | Type | Required | Description |
appkey | String | Yes | The AppKey of the project that you created in the console. |
text | String | Yes | The text to synthesize. The text must be UTF-8 encoded and cannot exceed 300 characters. Add a space between English words. Note: To use the multi-emotion feature of a voice, add the SSML <emotion> tag to the text. For more information, see <emotion>. If you use the <emotion> tag for a voice that does not support multiple emotions, the `Illegal ssml text` error is reported. |
voice | String | No | The voice that is used for synthesis. Default value: xiaoyun. |
format | String | No | The audio encoding format. Valid values: pcm, wav, and mp3. Default value: pcm. |
sample_rate | Integer | No | The audio sample rate in Hz. Default value: 16000. |
volume | Integer | No | The volume. Valid values: 0 to 100. Default value: 50. |
speech_rate | Integer | No | The speech rate. Valid values: -500 to 500. Default value: 0. The values -500, 0, and 500 correspond to speed multipliers of 0.5×, 1.0×, and 2.0×: -500 indicates 0.5 times the default speed, 0 indicates the default speed (1.0×), and 500 indicates 2.0 times the default speed. The default speed is the synthesis speed of the model output, which varies slightly for each voice and is approximately four characters per second. The value is calculated as (1 - 1/speed) / coefficient. For example, 0.8× speed: (1 - 1/0.8) / 0.002 = -125. 1.2× speed: (1 - 1/1.2) / 0.001 ≈ 166. Note: For speeds less than 1.0×, a coefficient of 0.002 is used. For speeds greater than 1.0×, a coefficient of 0.001 is used. The actual algorithm result is an approximate value. |
pitch_rate | Integer | No | The pitch. Valid values: -500 to 500. Default value: 0. |
enable_subtitle | Boolean | No | Specifies whether to enable word-level timestamps. For more information, see Speech Synthesis timestamp feature. |
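The speech_rate calculation in the table above can be sketched as a small helper. This is an illustration under stated assumptions, not SDK code. The function name `speed_to_speech_rate` is hypothetical, and truncating toward zero is an assumption inferred from the 1.2× example, where 166.67 is given as 166.

```python
import math

SLOW_COEFF = 0.002  # coefficient for speeds below 1.0x
FAST_COEFF = 0.001  # coefficient for speeds above 1.0x


def speed_to_speech_rate(speed: float) -> int:
    """Convert a speed multiplier in [0.5, 2.0] to a speech_rate in [-500, 500]."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    coeff = SLOW_COEFF if speed < 1.0 else FAST_COEFF
    raw = (1 - 1 / speed) / coeff
    # Remove floating-point noise first, then truncate toward zero
    # (assumption: this matches the 1.2x -> 166 example).
    rate = math.trunc(round(raw, 6))
    # Clamp to the documented valid range.
    return max(-500, min(500, rate))
```

For example, `speed_to_speech_rate(0.8)` returns -125 and `speed_to_speech_rate(1.2)` returns 166. Because the service treats the result as approximate, small differences in rounding are unlikely to be audible.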
Receive synthesized data
The server returns the synthesized audio as binary data. The SDK receives and processes the binary data.
End synthesis
After the synthesis is complete, the server sends a SynthesisCompleted event notification. The following code provides an example.
{
  "header": {
    "message_id": "05450bf69c53413f8d88aed1ee60****",
    "task_id": "640bc797bb684bd6960185651307****",
    "namespace": "SpeechSynthesizer",
    "name": "SynthesisCompleted",
    "status": 20000000,
    "status_message": "GATEWAY|SUCCESS|Success."
  }
}
Note: The examples in the documentation save the synthesized audio to a file. To play the audio in real time with low latency, we recommend that you use stream playback. This lets you play the audio data as you receive it, which reduces delay.
Handle synthesis failures
If the synthesis task fails, you will receive a TaskFailed notification. The following code provides an example. After you receive a TaskFailed notification, the underlying connection is closed.
{
  "header": {
    "namespace": "Default",
    "name": "TaskFailed",
    "status": 41020001,
    "message_id": "62c126f7d9b340deb82b5b7eaca0****",
    "task_id": "4552df26d1f547aab9a2c4a94678****",
    "status_text": "TTS:TtsClientError:[tts]Engine return error code: 418"
  }
}
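A client that handles the WebSocket messages itself needs to tell the two terminal events apart. The following is a minimal sketch based only on the two sample messages above. The `SynthesisEvent` class and the function names are hypothetical. Note that the samples carry the message text under different keys (`status_message` and `status_text`), so the sketch accepts either.

```python
import json
from dataclasses import dataclass


@dataclass
class SynthesisEvent:
    name: str
    task_id: str
    status: int
    message: str


def parse_event(raw: str) -> SynthesisEvent:
    """Parse a JSON text message from the server into a SynthesisEvent."""
    header = json.loads(raw)["header"]
    # The sample messages use different keys for the human-readable text.
    message = header.get("status_message") or header.get("status_text", "")
    return SynthesisEvent(
        name=header["name"],
        task_id=header["task_id"],
        status=int(header["status"]),
        message=message,
    )


def is_terminal(event: SynthesisEvent) -> bool:
    """Both events end the task; after TaskFailed the connection is closed."""
    return event.name in ("SynthesisCompleted", "TaskFailed")


def is_success(event: SynthesisEvent) -> bool:
    """20000000 is the status shown in the SynthesisCompleted sample."""
    return event.name == "SynthesisCompleted" and event.status == 20000000
```

Logging `task_id` together with `status` and `message` on failure makes it easier to trace a specific request later.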
Voice list
Name | voice parameter value | Type | Scenario | Supported languages | Supported sample rates (Hz) | Supports word/sentence-level timestamps | Supports retroflex finals | Voice quality |
Abin | abin | Mandarin with Cantonese accent | Conversational digital human | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | No | No | Standard Edition |
Zhixiaobai | zhixiaobai | Mandarin female voice | Conversational digital human | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | No | Yes | Standard Edition |
Zhixiaoxia | zhixiaoxia | Mandarin female voice | Conversational digital human | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | No | Yes | Standard Edition |
Zhixiaomei | zhixiaomei | Mandarin female voice | Live streaming digital human | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K | Yes | Yes | Standard Edition |
Zhigui | zhigui | Mandarin female voice | Live streaming digital human | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Zhishuo | zhishuo | Mandarin male voice | Customer service digital human | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Aixia | aixia | Mandarin female voice | Customer service digital human | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Cally | cally | American English female voice | Spoken English conversational digital human | English-only scenarios | 8K/16K | Yes | Yes | Standard Edition |
Zhifeng_emo | zhifeng_emo | Multi-emotion male voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K/24K | Yes | Yes | Standard Edition |
Zhibing_emo | zhibing_emo | Multi-emotion male voice | General scenarios | Chinese-only scenarios | 8K/16K/24K | Yes | Yes | Standard Edition |
Zhimiao_emo | zhimiao_emo | Multi-emotion female voice | Chinese-English scenarios | Chinese and English scenarios | 8K/16K | Yes | Yes | Standard Edition |
Zhimi_emo | zhimi_emo | Multi-emotion female voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Zhiyan_emo | zhiyan_emo | Multi-emotion female voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Zhibei_emo | zhibei_emo | Multi-emotion child voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Zhitian_emo | zhitian_emo | Multi-emotion female voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Xiaoyun | xiaoyun | Standard female voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | No | No | Lite Edition |
Xiaogang | xiaogang | Standard male voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | No | No | Lite Edition |
Ruoxi | ruoxi | Gentle female voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K/24K | No | No | Standard Edition |
Siqi | siqi | Gentle female voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
Sijia | sijia | Standard female voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K/24K | No | No | Standard Edition |
Sicheng | sicheng | Standard male voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
Aiqi | aiqi | Gentle female voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Aijia | aijia | Standard female voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Aicheng | aicheng | Standard male voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Aida | aida | Standard male voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Ninger | ninger | Standard female voice | General scenarios | Chinese-only scenarios | 8K/16K/24K | No | No | Standard Edition |
Ruilin | ruilin | Standard female voice | General scenarios | Chinese-only scenarios | 8K/16K/24K | No | No | Standard Edition |
Siyue | siyue | Gentle female voice | Customer service scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
Aiya | aiya | Stern female voice | Customer service scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Aimei | aimei | Sweet female voice | Customer service scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Aiyu | aiyu | Natural female voice | Customer service scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Aiyue | aiyue | Gentle female voice | Customer service scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Aijing | aijing | Stern female voice | Customer service scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Xiaomei | xiaomei | Sweet female voice | Customer service scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K/24K | No | No | Standard Edition |
Aina | aina | Female voice with Zhejiang accent | Customer service scenarios | Chinese-only scenarios | 8K/16K | Yes | No | Standard Edition |
Yina | yina | Female voice with Zhejiang accent | Customer service scenarios | Chinese-only scenarios | 8K/16K/24K | No | No | Standard Edition |
Sijing | sijing | Stern female voice | Customer service scenarios | Chinese-only scenarios | 8K/16K/24K | Yes | No | Standard Edition |
Sitong | sitong | Child voice | Child voice scenarios | Chinese-only scenarios | 8K/16K/24K | No | No | Standard Edition |
Xiaobei | xiaobei | Lolita female voice | Child voice scenarios | Chinese-only scenarios | 8K/16K/24K | Yes | No | Standard Edition |
Aitong | aitong | Child voice | Child voice scenarios | Chinese-only scenarios | 8K/16K | Yes | No | Standard Edition |
Aiwei | aiwei | Lolita female voice | Child voice scenarios | Chinese-only scenarios | 8K/16K | Yes | No | Standard Edition |
Aibao | aibao | Lolita female voice | Child voice scenarios | Chinese-only scenarios | 8K/16K | Yes | No | Standard Edition |
Harry | harry | British English male voice | English scenarios | English scenarios | 8K/16K | No | No | Standard Edition |
Abby | abby | American English female voice | English scenarios | English scenarios | 8K/16K | Yes | No | Standard Edition |
Andy | andy | American English male voice | English scenarios | English scenarios | 8K/16K | Yes | No | Standard Edition |
Eric | eric | British English male voice | English scenarios | English scenarios | 8K/16K | Yes | No | Standard Edition |
Emily | emily | British English female voice | English scenarios | English scenarios | 8K/16K | Yes | No | Standard Edition |
Luna | luna | British English female voice | English scenarios | English scenarios | 8K/16K | Yes | No | Standard Edition |
Luca | luca | British English male voice | English scenarios | English scenarios | 8K/16K | Yes | No | Standard Edition |
Wendy | wendy | British English female voice | English scenarios | English scenarios | 8K/16K/24K | No | No | Standard Edition |
William | william | British English male voice | English scenarios | English scenarios | 8K/16K/24K | No | No | Standard Edition |
Olivia | olivia | British English female voice | English scenarios | English scenarios | 8K/16K/24K | No | No | Standard Edition |
Shanshan | shanshan | Cantonese female voice | Dialect scenarios | Standard Cantonese (Simplified) and Cantonese-English mixed scenarios | 8K/16K/24K | No | No | Standard Edition |
Aiyuan | aiyuan | Confidante | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aiying | aiying | Cute and soft child voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aixiang | aixiang | Magnetic male voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aimo | aimo | Emotional male voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aiye | aiye | Young male voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aiting | aiting | Radio-style female voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aifan | aifan | Emotional female voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Lydia | lydia | Bilingual English-Chinese female voice | English scenarios | English and English-Chinese mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Xiaoyue | chuangirl | Sichuanese female voice | Dialect scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | No | No | Standard Edition |
Aishuo | aishuo | Natural male voice | Customer service scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Qingqing | qingqing | Taiwan (China) Mandarin female voice | Dialect scenarios | Chinese-only scenarios | 8K/16K | No | No | Standard Edition |
Cuijie | cuijie | Northeastern Mandarin female voice | Dialect scenarios | Chinese-only scenarios | 8K/16K | Yes | Yes | Standard Edition |
Xiaoze | xiaoze | Male voice with a strong Hunan accent | Dialect scenarios | Chinese-only scenarios | 8K/16K | No | No | Standard Edition |
Ainan | ainan | Advertisement-style male voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aihao | aihao | News-style male voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aiming | aiming | Humorous male voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aixiao | aixiao | News-style female voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aichu | aichu | Food documentary-style male voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Aiqian | aiqian | News-style female voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Zhi Xiang | tomoka | Japanese female voice | Multilingual scenarios | Japanese-only scenarios | 8K/16K | Yes | No | Standard Edition |
Tomoya | tomoya | Japanese male voice | Multilingual scenarios | Japanese-only scenarios | 8K/16K | Yes | No | Standard Edition |
Annie | annie | American English female voice | English scenarios | English-only scenarios | 8K/16K | Yes | No | Standard Edition |
Aishu | aishu | News-style male voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Airu | airu | Newscast female voice | Literary scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Premium Edition |
Jiajia | jiajia | Cantonese female voice | Dialect scenarios | Standard Cantonese (Simplified) and Cantonese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Indah | indah | Indonesian female voice | Multilingual scenarios | Indonesian-only scenarios | 8K/16K | No | No | Standard Edition |
Peach | taozi | Cantonese female voice | Dialect scenarios | Supports Standard Cantonese (Simplified) and Cantonese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Guijie | guijie | Friendly female voice | General scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Stella | stella | Intellectual female voice | General scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Stanley | stanley | Calm male voice | General scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Kenny | kenny | Calm male voice | General scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Rosa | rosa | Natural female voice | General scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Farah | farah | Malay female voice | Multilingual scenarios | Malay-only scenarios | 8K/16K | No | No | Standard Edition |
Mashu | mashu | Children's drama male voice | General scenarios | General scenarios | 8K/16K | Yes | No | Standard Edition |
Zhiqi | zhiqi | Gentle female voice | Ultra-high definition scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | Yes | No | Premium Edition |
Zhichu | zhichu | Food documentary-style male voice | Ultra-high definition scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | Yes | Yes | Premium Edition |
Xiaoxian | xiaoxian | Friendly female voice | Live streaming scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Yuer | yuer | Children's drama female voice | General scenarios | Chinese-only scenarios | 8K/16K | Yes | No | Standard Edition |
Maoxiaomei | maoxiaomei | Energetic female voice | Live streaming scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Zhixiang | zhixiang | Magnetic male voice | Ultra-high definition scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | Yes | No | Premium Edition |
Zhijia | zhijia | Standard female voice | Ultra-high definition scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | Yes | No | Premium Edition |
Zhinan | zhinan | Advertisement-style male voice | Ultra-high definition scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | Yes | No | Premium Edition |
Zhiqian | zhiqian | News-style female voice | Ultra-high definition scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | Yes | No | Premium Edition |
Zhiru | zhiru | Newscast female voice | Ultra-high definition scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | Yes | No | Premium Edition |
Zhide | zhide | Newscast male voice | Ultra-high definition scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K/24K/48K | Yes | No | Premium Edition |
Zhifei | zhifei | Passionate commentary voice | Ultra-high definition scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Premium Edition |
Aifei | aifei | Passionate commentary voice | Live streaming scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Yaqun | yaqun | Store broadcast voice | Live streaming scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Qiaowei | qiaowei | Store broadcast voice | Live streaming scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Dahu | dahu | Northeastern Mandarin male voice | Dialect scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
ava | ava | American girl | English scenarios | English-only scenarios | 8K/16K | Yes | No | Standard Edition |
Zhilun | zhilun | Suspense commentary voice | Ultra-high definition scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Premium Edition |
Ailun | ailun | Suspense commentary voice | Live streaming scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | Yes | Standard Edition |
Jielidou | jielidou | Soothing child voice | Child voice scenarios | Chinese-only scenarios | 8K/16K | Yes | Yes | Standard Edition |
Zhiwei | zhiwei | Lolita female voice | Ultra-high definition scenarios | Chinese-only scenarios | 8K/16K/24K/48K | Yes | No | Premium Edition |
Laotie | laotie | Buddy from Northeast China | Live streaming scenarios | Chinese-only scenarios | 8K/16K | Yes | Yes | Standard Edition |
Laomei | laomei | Female hawker voice | Live streaming scenarios | Chinese-only scenarios | 8K/16K | Yes | Yes | Standard Edition |
Aikan | aikan | Tianjin dialect male voice | Dialect scenarios | Chinese-only scenarios | 8K/16K | Yes | Yes | Standard Edition |
Tala | tala | Filipino female voice | Multilingual scenarios | Filipino-only scenarios | 8K/16K | No | No | Standard Edition |
Zhitian | zhitian | Sweet female voice | General scenarios | Supports Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Premium Edition |
Zhiqing | zhiqing | A girl speaking a dialect from Taiwan (China) | Dialect scenarios | Chinese-only scenarios | 8K/16K | Yes | No | Premium Edition |
Tien | tien | Vietnamese female voice | Multilingual scenarios | Vietnamese-only scenarios | 8K/16K | No | No | Standard Edition |
Becca | becca | American English customer service female voice | American English | English-only scenarios | 8K/16K | No | No | Standard Edition |
Kyong | Kyong | Korean female voice | Korean scenarios | Korean | 8K/16K | No | No | Standard Edition |
masha | masha | Russian female voice | Russian scenarios | Russian | 8K/16K | No | No | Standard Edition |
camila | camila | Spanish female voice | Spanish scenarios | Spanish | 8K/16K | No | No | Standard Edition |
perla | perla | Italian female voice | Italian scenarios | Italian | 8K/16K | No | No | Standard Edition |
Zhimao | zhimao | Mandarin female voice | Live streaming | Chinese | 8K/16K | Yes | No | Standard Edition |
Zhiyuan | zhiyuan | Mandarin female voice | General scenarios | Chinese | 8K/16K | Yes | No | Standard Edition |
Zhiya | zhiya | Mandarin female voice | Customer service | Chinese | 8K/16K | Yes | No | Standard Edition |
Zhiyue | zhiyue | Mandarin female voice | General scenarios | Chinese | 8K/16K | Yes | No | Standard Edition |
Zhida | zhida | Mandarin male voice | General scenarios | Chinese and Chinese-English mixed scenarios | 8K/16K | Yes | No | Standard Edition |
Zhistella | zhistella | Mandarin female voice | General scenarios | Chinese | 8K/16K | Yes | No | Standard Edition |
Kelly | kelly | Hong Kong Cantonese female voice | Dialect scenarios | Hong Kong Cantonese | 8K/16K | Yes | No | Standard Edition |
clara | clara | French female voice | General scenarios | French | 8K/16K | No | No | Standard Edition |
hanna | hanna | German female voice | General scenarios | German | 8K/16K | No | No | Standard Edition |
waan | waan | Thai female voice | General scenarios | Thai | 8K/16K | No | No | Standard Edition |
betty | betty | American English female voice | General scenarios | American English | 8K/16K | Yes | No | Standard Edition |
beth | beth | American English female voice | General scenarios | American English | 8K/16K | Yes | No | Standard Edition |
cindy | cindy | American English female voice | General scenarios | American English | 8K/16K | Yes | No | Standard Edition |
donna | donna | American English female voice | General scenarios | American English | 8K/16K | Yes | No | Standard Edition |
eva | eva | American English female voice | General scenarios | American English | 8K/16K | Yes | No | Standard Edition |
brian | brian | American English male voice | General scenarios | American English | 8K/16K | Yes | No | Standard Edition |
david | david | American English male voice | General scenarios | American English | 8K/16K/24K | Yes | No | Standard Edition |
abby_ecmix | abby_ecmix | American English female voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
annie_ecmix | annie_ecmix | American English female voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
andy_ecmix | andy_ecmix | American English male voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
ava_ecmix | ava_ecmix | American English female voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
betty_ecmix | betty_ecmix | American English female voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
beth_ecmix | beth_ecmix | American English female voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
brian_ecmix | brian_ecmix | American English male voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
cindy_ecmix | cindy_ecmix | American English female voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
cally_ecmix | cally_ecmix | American English female voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
donna_ecmix | donna_ecmix | American English female voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
david_ecmix | david_ecmix | American English male voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
eva_ecmix | eva_ecmix | American English female voice | General scenarios | English and English-Chinese mixed scenarios | 8K/16K/24K | Yes | No | Standard Edition |
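Because supported sample rates and timestamp support differ by voice, it can help to check a request against the voice list before sending it. The sketch below copies three rows from the table above as sample data. The `VOICES` dictionary and the `check_voice` function are hypothetical, and the sketch assumes that 8K, 16K, 24K, and 48K in the table mean 8000, 16000, 24000, and 48000 Hz.

```python
from typing import List

# voice parameter value -> (supported sample rates in Hz, supports timestamps)
# Three rows copied from the voice list; extend as needed.
VOICES = {
    "xiaoyun": ({8000, 16000}, False),
    "siqi": ({8000, 16000, 24000}, True),
    "zhiqi": ({8000, 16000, 24000, 48000}, True),
}


def check_voice(voice: str, sample_rate: int = 16000,
                enable_subtitle: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the combination looks valid."""
    if voice not in VOICES:
        return [f"unknown voice: {voice}"]
    rates, has_timestamps = VOICES[voice]
    problems = []
    if sample_rate not in rates:
        problems.append(f"{voice} does not support {sample_rate} Hz")
    if enable_subtitle and not has_timestamps:
        problems.append(f"{voice} does not support timestamps")
    return problems
```

For example, requesting `xiaoyun` at 24000 Hz with `enable_subtitle` set reports both problems, while `zhiqi` at 48000 Hz reports none.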
Support for multi-emotion voices
Only multi-emotion voice models support emotion selection. The following table lists the supported emotions. The supported emotion categories vary by voice. The main categories include the following: neutral, happy, angry, sad, fear, hate, surprise, arousal, serious, disgust, jealousy, embarrassed, frustrated, affectionate, gentle, newscast, customer-service, story, and living.
Voice name | voice parameter value | Emotion category |
Zhifeng_emo | zhifeng_emo | angry, fear, happy, neutral, sad, surprise |
Zhibing_emo | zhibing_emo | angry, fear, happy, neutral, sad, surprise |
Zhimiao_emo | zhimiao_emo | serious, sad, disgust, jealousy, embarrassed, happy, fear, surprise, neutral, frustrated, affectionate, gentle, angry, newscast, customer-service, story, living |
Zhimi_emo | zhimi_emo | angry, fear, happy, hate, neutral, sad, surprise |
Zhiyan_emo | zhiyan_emo | neutral, happy, angry, sad, fear, hate, surprise, arousal |
Zhibei_emo | zhibei_emo | neutral, happy, angry, sad, fear, hate, surprise |
Zhitian_emo | zhitian_emo | neutral, happy, angry, sad, fear, hate, surprise |
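Using an emotion that a voice does not support is easy to catch on the client. The following sketch copies three rows of the table above into a lookup. The `EMOTIONS` dictionary and the `supports_emotion` function are hypothetical names, and the sketch does not build the `<emotion>` tag itself because its syntax is described in the SSML documentation rather than here.

```python
# voice parameter value -> supported emotion categories (rows copied from the table)
EMOTIONS = {
    "zhifeng_emo": {"angry", "fear", "happy", "neutral", "sad", "surprise"},
    "zhimi_emo": {"angry", "fear", "happy", "hate", "neutral", "sad", "surprise"},
    "zhiyan_emo": {"neutral", "happy", "angry", "sad", "fear", "hate",
                   "surprise", "arousal"},
}


def supports_emotion(voice: str, category: str) -> bool:
    """Return True only if the voice is a known multi-emotion voice that lists the category."""
    return category in EMOTIONS.get(voice, set())
```

A voice that is missing from the lookup, such as `xiaoyun`, returns False for every category, which mirrors the rule that only multi-emotion voices accept the tag.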
Service status codes
Each service response contains a `status` field, which is the service status code. The following tables describe the status codes.
General-purpose error codes
Status code | Status message | Cause | Solution |
40000000 | The default client error code. This code corresponds to multiple error messages. | Invalid parameters or call logic was used. | Compare your code with the sample code in the official documentation to test and verify it. |
40000001 | The token 'xxx' has expired. The token 'xxx' is invalid | Invalid parameters or call logic was used. This is a general-purpose client error code that usually indicates an incorrect token, such as an expired or invalid token. | Compare your code with the sample code in the official documentation to test and verify it. |
40000002 | Gateway:MESSAGE_INVALID:Can't process message in state'FAILED'! | The message is invalid or incorrect. | Compare your code with the sample code in the official documentation to test and verify it. |
40000003 | PARAMETER_INVALID Failed to decode url params | The parameters passed by the user are incorrect. This error is common for RESTful API calls. | Compare your code with the sample code in the official documentation to test and verify it. |
40000005 | Gateway:TOO_MANY_REQUESTS:Too many requests! | Too many concurrent requests. | If you are using the Trial Edition, upgrade to the Commercial Edition to increase the concurrency. If you are already using the Commercial Edition, purchase a concurrency resource plan to increase your concurrency quota. |
40000009 | Invalid wav header! | The message header is invalid. | If you send a WAV audio file and set the |
40000009 | Too large wav header! | The WAV header of the transmitted audio is invalid. | You can send the audio stream in a format such as PCM or OPUS. If you use the WAV format, make sure that the WAV header of the audio file contains the correct data length. |
40000010 | Gateway:FREE_TRIAL_EXPIRED:The free trial has expired! | The trial period has ended, and the commercial version is not activated or your account has an overdue payment. | You can log on to the console to check the service activation status and your account balance. |
40010001 | Gateway:NAMESPACE_NOT_FOUND:RESTful url path illegal | The operation or parameter is not supported. | Check whether the parameters passed in the call are consistent with the requirements in the official documentation. You can compare them with the error message to identify and set the correct parameters. For example, if you are using a curl command to make a RESTful API request, check whether the URL you constructed is valid. |
40010003 | Gateway:DIRECTIVE_INVALID:[xxx] | A general-purpose client-side error code. | This error indicates that the client passed an incorrect parameter or instruction. Detailed error messages are available for different operations. You can refer to the corresponding documentation to set the parameters correctly. |
40010004 | Gateway:CLIENT_DISCONNECT:Client disconnected before task finished! | The client actively terminated the connection before the request was processed. | No action is required. To avoid this error, close the connection only after the server responds. |
40010005 | Gateway:TASK_STATE_ERROR:Got stop directive while task is stopping! | The client sent a message instruction that is not currently supported. | Compare your code with the sample code in the official documentation to test and verify it. |
40020105 | Meta:APPKEY_NOT_EXIST:Appkey not exist! | A non-existent Appkey was used. | Confirm whether a non-existent Appkey was used. You can log on to the console and view the project configuration to find the Appkey. |
40020106 | Meta:APPKEY_UID_MISMATCH:Appkey and user mismatch! | The Appkey and token passed in the call were not created by the same Alibaba Cloud account UID. This causes a mismatch. | Check whether you are using resources from two different accounts. Do not use an Appkey from Account A with a token generated from Account B. |
403 | Forbidden | The token is invalid. For example, the token does not exist or has expired. | Set a valid token. Tokens have an expiration period. You must obtain a new token before the current one expires. |
41000003 | MetaInfo doesn't have end point info | Failed to retrieve the routing information for this Appkey. | Check whether you are using resources from two different accounts. Do not use an Appkey from Account A with a token generated from Account B. |
41010101 | UNSUPPORTED_SAMPLE_RATE | The sample rate is not supported. | Real-time speech recognition currently supports only audio with a sample rate of 8000 Hz or 16000 Hz. |
41040201 | Realtime:GET_CLIENT_DATA_TIMEOUT:Client data does not send continuously! | Failed to retrieve data from the client due to a timeout. | When you call real-time speech recognition, the client must send data at a real-time rate and close the connection promptly after the data is sent. |
50000000 | GRPC_ERROR:Grpc error! | An exception caused by factors such as machine load or network issues. This error usually occurs randomly. | You can retry the call to resolve the issue. |
50000001 | GRPC_ERROR:Grpc error! | An exception caused by factors such as machine load or network issues. This error usually occurs randomly. | You can retry the call to resolve the issue. |
52010001 | GRPC_ERROR:Grpc error! | An exception caused by factors such as machine load or network issues. This error usually occurs randomly. | You can retry the call to resolve the issue. |
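The table above can be turned into a small dispatch helper on the client side. The following is a minimal sketch, not part of any SDK: the function name `suggest_action` and the action labels are hypothetical, and the mapping covers only the codes whose solutions in the table are clear-cut.

```python
# Server-side codes that the table describes as random and resolvable by retrying.
RETRY_CODES = {50000000, 50000001, 52010001}

# Codes that the table attributes to an Appkey problem or an Appkey/token mismatch.
APPKEY_CODES = {40020105, 40020106, 41000003}


def suggest_action(status_code, message=""):
    """Map a status code (and its message) to a coarse action label.

    The labels are hypothetical names chosen for this sketch.
    """
    text = message.lower()
    # 403 and the token variants of 40000001 both point at the token.
    if status_code == 403 or (status_code == 40000001 and "token" in text):
        return "refresh_token"
    if status_code == 40000005:
        return "reduce_concurrency"
    if status_code in RETRY_CODES:
        return "retry"
    if status_code in APPKEY_CODES:
        return "check_appkey"
    # Everything else requires comparing the request with the documentation.
    return "fix_request"
```

The message is checked for 40000001 because the same code is also returned with other messages, for which refreshing the token would not help.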
Speech synthesis/Long-text speech synthesis error codes
Status code | Status message | Cause | Solution |
40000001 | Gateway:ACCESS_DENIED:No privilege to this voice! | An incorrect voice name was set. | Refer to the official documentation to set a correct voice. |
40000004 | Gateway:IDLE_TIMEOUT:Websocket session is idle for too long time,the last directive is 'StartSynthesis'! | After a connection is established, the server returns this error if no data is sent for more than 10 seconds. | Close the connection promptly after the request is processed. This error may also occur if the server is under a sudden spike in load and cannot return data in time. In this case, retry the request. |
40010003 | Gateway:DIRECTIVE_INVALID:No text specified! | No valid text for synthesis was set. | You can refer to the sample code in the official documentation to set the text for synthesis. |
41020001 | Varies | Speech synthesis client error. Multiple error messages may be returned. | Adjust your code based on the specific error message. |
51020001 | TTS:TtsServerError | An exception caused by factors such as machine load or network issues. This error usually occurs randomly. | You can retry the call to resolve the issue. |
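For the two synthesis codes that the table above says can be resolved by retrying (40000004 under server load, and 51020001), a retry wrapper with exponential backoff is one possible approach. This is a minimal sketch under stated assumptions: `SynthesisError` and `call_with_retry` are hypothetical names, the wrapped `synthesize` callable stands in for whatever SDK call you use, and the attempt count and delays are illustrative values, not documented recommendations.

```python
import time

# Synthesis codes that the table marks as resolvable by retrying.
RETRYABLE_CODES = {40000004, 51020001}


class SynthesisError(Exception):
    """Hypothetical exception that carries the service status code and message."""

    def __init__(self, status_code, message=""):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def call_with_retry(synthesize, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call synthesize() and retry on retryable codes with exponential backoff.

    Non-retryable errors, and the error from the last attempt, are re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return synthesize()
        except SynthesisError as err:
            if err.status_code not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Client-side codes such as 40010003 are re-raised immediately, because retrying an invalid request returns the same error.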