FAQ


What are the recording length limits for different question types?

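| Language | Question type | Length limit (seconds) |
| --- | --- | --- |
| English | Phonetic symbol | 20 |
| English | Word | 20 |
| English | Sentence | 40 |
| English | Paragraph | 300 |
| English | Pronunciation correction | 20 |
| English | Phonics | 20 |
| English | Multiple choice | 60 |
| English | Extended choice | 60 |
| English | Q&A pair | 60 |
| English | Spoken composition | 300 |
| English | English recitation | 300 |
| English | Short sentence recognition | 60 |
| English | Long-form recognition | 300 |
| Chinese | Character | 20 |
| Chinese | Sentence | 40 |
| Chinese | Paragraph | 300 |
| Chinese | Poem | 300 |
| Chinese | Pinyin | 300 |
| Chinese | Chinese recitation | 300 |
| Chinese | Short sentence recognition | 60 |
| Chinese | Long-form recognition | 300 |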
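
A client can check a recording's duration against these limits before uploading it. The following is a minimal sketch, not part of the service's SDK: the names `LENGTH_LIMITS_SECONDS` and `check_recording_length` are hypothetical, and treating the limit as inclusive is an assumption.

```python
# Limits in seconds, copied from the table above and keyed by
# (language, question type). The dictionary name is hypothetical.
LENGTH_LIMITS_SECONDS = {
    ("English", "Phonetic symbol"): 20,
    ("English", "Word"): 20,
    ("English", "Sentence"): 40,
    ("English", "Paragraph"): 300,
    ("English", "Pronunciation correction"): 20,
    ("English", "Phonics"): 20,
    ("English", "Multiple choice"): 60,
    ("English", "Extended choice"): 60,
    ("English", "Q&A pair"): 60,
    ("English", "Spoken composition"): 300,
    ("English", "English recitation"): 300,
    ("English", "Short sentence recognition"): 60,
    ("English", "Long-form recognition"): 300,
    ("Chinese", "Character"): 20,
    ("Chinese", "Sentence"): 40,
    ("Chinese", "Paragraph"): 300,
    ("Chinese", "Poem"): 300,
    ("Chinese", "Pinyin"): 300,
    ("Chinese", "Chinese recitation"): 300,
    ("Chinese", "Short sentence recognition"): 60,
    ("Chinese", "Long-form recognition"): 300,
}


def check_recording_length(language, question_type, duration_seconds):
    """Return True if the recording fits within the documented limit.

    Assumption: a recording exactly at the limit is accepted.
    Raises KeyError for a (language, question type) pair not in the table.
    """
    limit = LENGTH_LIMITS_SECONDS[(language, question_type)]
    return duration_seconds <= limit
```

For example, `check_recording_length("English", "Sentence", 35.0)` passes, while a 25-second recording for a Chinese character question does not.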

What audio formats are supported?

Supported audio formats include WAV, MP3, OGG (OGG container with Speex compression), OGG Opus (OGG container with Opus compression), WeChat Speex (WeChat high-definition audio format), and AMR.
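
To catch an unsupported file before it is sent, a client can inspect the first bytes of the audio data. The sketch below relies only on the well-known file signatures of these formats; the function name `sniff_audio_format` is hypothetical. It does not attempt to detect WeChat Speex, whose layout is not described here, and it recognizes only the narrowband AMR header, because this FAQ does not say whether AMR-WB is accepted.

```python
from typing import Optional


def sniff_audio_format(header: bytes) -> Optional[str]:
    """Guess the audio format from the first bytes of a file.

    Pass at least the first 64 bytes. Returns None when the data
    does not match any signature checked here.
    """
    # WAV: RIFF container whose form type is "WAVE".
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    # AMR narrowband files start with this magic line.
    if header.startswith(b"#!AMR\n"):
        return "AMR"
    # OGG pages start with "OggS"; the codec is named in the first packet.
    if header[:4] == b"OggS":
        if b"OpusHead" in header[:64]:
            return "OGG Opus"
        if b"Speex   " in header[:64]:
            return "OGG Speex"
        return None  # OGG with another codec, such as Vorbis
    # MP3: either an ID3v2 tag or an MPEG frame sync (11 set bits).
    if header[:3] == b"ID3":
        return "MP3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "MP3"
    return None
```

A `None` result is a hint to convert the file or check it by other means, not proof that the service would reject it.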

How is speech scored for words?

Words are scored on pronunciation accuracy, evaluated down to the phoneme level. Alibaba Cloud's speech analysis technology can also detect syllable stress in English words.

How is speech scored for sentences?

Sentences are scored on four dimensions: integrity, accuracy, fluency, and prosody.

1. Integrity: Measures the percentage of words from the reference text that were read correctly.

2. Accuracy: Evaluates the pronunciation accuracy of words, phrases, and sentences.

3. Fluency: Evaluates speech rate, variations in speech rate, and pauses between thought groups.

4. Prosody: Evaluates prosody based on fluctuations and changes in the fundamental frequency of the voice. (This dimension can be evaluated only after annotation.)

Alibaba Cloud's speech analysis technology can also detect stressed words and sentence-final intonation changes in English sentences.

If your product shows only a total score, use the `pron` field instead of the `overall` field. The `overall` score may appear too high to users. For example, a user might read the full text but with poor pronunciation. Because integrity is a scoring dimension, the `overall` score will be higher than the `pron` score, which focuses more on pronunciation.
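
The choice between the two fields can be sketched as follows. Only the field names `pron` and `overall` come from this FAQ. The function name `display_score`, the flat dictionary shape of the result, and the sample values are assumptions made for illustration.

```python
def display_score(result: dict, single_score_only: bool = True) -> float:
    """Pick the score to show to the end user.

    When the product shows a single number, use `pron`, which focuses
    on pronunciation. Otherwise `overall` can be shown next to the
    individual dimension scores.
    """
    key = "pron" if single_score_only else "overall"
    return result[key]


# Made-up values: the full text was read (high integrity),
# but the pronunciation was poor.
sample_result = {"overall": 85, "pron": 62}
shown = display_score(sample_result)
```

With these sample values, the user sees the `pron` score rather than the higher `overall` score.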

How is speech scored for paragraphs?

Paragraph reading and repetition are evaluated based on pronunciation accuracy, integrity, fluency, and prosody.

1. Integrity: Measures the percentage of words from the reference text that were read correctly.

2. Accuracy: Evaluates the pronunciation accuracy of words, phrases, and sentences.

3. Fluency: Evaluates speech rate, variations in speech rate, and pauses between thought groups.

How is speech scored for open-ended questions?

The evaluation combines speech, semantics, and syntax models.

Relationship between the overall score and individual dimension scores

The total score is calculated by a model. The model considers factors such as the pronunciation of each word, including missed or mispronounced words. It also considers the integrity, accuracy, fluency, and prosody of the entire sentence. Note: The model is based on an analysis of sentence scores from many experts. Pronunciation is heavily weighted, but the relationship is not a simple linear one.

1. Integrity: Measures the coverage of the user's reading compared to the reference answer.

2. Accuracy: Evaluates the pronunciation accuracy of words, phrases, and sentences.

3. Fluency: Evaluates speech rate, variations in speech rate, and pauses between thought groups.

How are the weights for each scoring criterion in the overall pronunciation score determined?

Alibaba Cloud provides well-defined Application Programming Interfaces (APIs). The speech analysis service generates extensive data across multiple layers, such as phonemes, syllables, sentences, and paragraphs. It also provides multiple metrics, including pronunciation, fluency, syllable stress detection, sentence-final intonation detection, and confusable word detection. This data is available to you. How you present this data to your end users depends on your application's requirements and design.

How do I set the request address whitelist for a WeChat mini program?

Location: Go to the mini program management page and navigate to Developer & Services > Development Management > Development Settings.

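| WeChat mini program setting | Whitelisted domain names |
| --- | --- |
| Legal domain names for request | https://api.cloud.ssapi.cn |
| Legal domain names for request | https://files.cloud.ssapi.cn |
| Legal domain names for request | https://gate-01.api.cloud.ssapi.cn |
| Legal domain names for request | https://gate-02.api.cloud.ssapi.cn |
| Legal domain names for request | https://gate-03.api.cloud.ssapi.cn |
| Legal domain names for request | https://static-gate-01.api.cloud.ssapi.cn |
| Legal domain names for socket | wss://api.cloud.ssapi.cn |
| Legal domain names for socket | wss://gate-01.api.cloud.ssapi.cn |
| Legal domain names for socket | wss://gate-02.api.cloud.ssapi.cn |
| Legal domain names for socket | wss://gate-03.api.cloud.ssapi.cn |
| Legal domain names for socket | wss://static-gate-01.api.cloud.ssapi.cn |
| Legal domain names for uploadFile | https://api.cloud.ssapi.cn |
| Legal domain names for uploadFile | https://files.cloud.ssapi.cn |
| Legal domain names for uploadFile | https://gate-01.api.cloud.ssapi.cn |
| Legal domain names for uploadFile | https://gate-02.api.cloud.ssapi.cn |
| Legal domain names for uploadFile | https://gate-03.api.cloud.ssapi.cn |
| Legal domain names for uploadFile | https://idc6-ginger.api.cloud.ssapi.cn |
| Legal domain names for uploadFile | https://idc7-ginger.api.cloud.ssapi.cn |
| Legal domain names for uploadFile | https://v3.aes.ssapi.cn |
| Legal domain names for downloadFile | https://files.cloud.ssapi.cn |
| Legal domain names for downloadFile | https://idc6-ginger.api.cloud.ssapi.cn |
| Legal domain names for downloadFile | https://idc7-ginger.api.cloud.ssapi.cn |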
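
During development it can help to check that every URL your code calls is covered by the whitelist for the matching category. The sketch below is a simplified offline check, not a reproduction of the validation that WeChat performs. The names `WHITELIST` and `is_whitelisted` are hypothetical, and the domain lists are copied from the settings above.

```python
from urllib.parse import urlsplit

# Domains copied from the whitelist settings above, grouped by category.
WHITELIST = {
    "request": {
        "https://api.cloud.ssapi.cn",
        "https://files.cloud.ssapi.cn",
        "https://gate-01.api.cloud.ssapi.cn",
        "https://gate-02.api.cloud.ssapi.cn",
        "https://gate-03.api.cloud.ssapi.cn",
        "https://static-gate-01.api.cloud.ssapi.cn",
    },
    "socket": {
        "wss://api.cloud.ssapi.cn",
        "wss://gate-01.api.cloud.ssapi.cn",
        "wss://gate-02.api.cloud.ssapi.cn",
        "wss://gate-03.api.cloud.ssapi.cn",
        "wss://static-gate-01.api.cloud.ssapi.cn",
    },
    "uploadFile": {
        "https://api.cloud.ssapi.cn",
        "https://files.cloud.ssapi.cn",
        "https://gate-01.api.cloud.ssapi.cn",
        "https://gate-02.api.cloud.ssapi.cn",
        "https://gate-03.api.cloud.ssapi.cn",
        "https://idc6-ginger.api.cloud.ssapi.cn",
        "https://idc7-ginger.api.cloud.ssapi.cn",
        "https://v3.aes.ssapi.cn",
    },
    "downloadFile": {
        "https://files.cloud.ssapi.cn",
        "https://idc6-ginger.api.cloud.ssapi.cn",
        "https://idc7-ginger.api.cloud.ssapi.cn",
    },
}


def is_whitelisted(url: str, category: str) -> bool:
    """Return True if the URL's scheme and host are listed for the category.

    The path and query are ignored. A URL with an explicit port does not
    match, because the listed entries carry no port.
    """
    parts = urlsplit(url)
    origin = f"{parts.scheme.lower()}://{parts.netloc.lower()}"
    return origin in WHITELIST[category]
```

Note that `https://v3.aes.ssapi.cn` is listed only for uploadFile, so the same URL passes for `"uploadFile"` and fails for `"request"`.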