AIAgentOutboundCallConfig


Parameter

Type

Description

Example

object

Parameters for the AI agent template.

Greeting

string

The welcome message. This change takes effect in the next call session. If this parameter is not set, no welcome message is played.

你好

EnableIntelligentSegment

boolean

Specifies whether to enable intelligent segmentation. If you enable this feature, short, consecutive speech segments from the user are merged into a complete sentence. Default value: true.

true

AsrConfig

object

The automatic speech recognition (ASR) configurations.

AsrMaxSilence

integer

The sentence segmentation threshold. If the duration of a silence exceeds this threshold, the system determines that the sentence is complete. Valid values: 200 to 1200. Unit: ms. Default value: 400.

400

AsrLanguageId

string

The language ID for ASR. Valid values:

  • zh_mandarin: Chinese

  • en: English

  • zh_en: Chinese and English

  • es: Spanish

  • jp: Japanese

zh_mandarin

AsrHotWords

array

The list of hotwords for ASR. You can specify a maximum of 128 hotwords in the list.

string

The hotword string. The string must be 1 to 10 characters in length.

检查

VadLevel

integer

The interruption threshold for voice activity detection (VAD). Valid values: 0 to 11. Default value: 11.

  • A value of 0 disables the VAD feature.

  • Values from 1 to 10: a higher value makes interruption less sensitive.

  • A value of 11 behaves differently from values 1 to 10. It reduces audio distortion during conversations and improves resistance to interference.

11

CustomParams

string

The passthrough parameters for proprietary ASR.

mode=fast&sample=16000&format=wav

VadDuration

integer

The minimum duration threshold for VAD. This parameter controls the interruption sensitivity. Valid values: 200 to 2000. Unit: ms. A value of 0 disables this feature. A value from 200 to 500 corresponds to approximately 1 to 4 words. Default value: empty, which indicates that this parameter is not in effect.

300
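
The documented ranges for the ASR parameters can be checked on the client before a request is sent. The following is a minimal sketch, not an official SDK helper. The function name validate_asr_config is hypothetical, and the sketch assumes that VadLevel, CustomParams, and VadDuration are nested inside AsrConfig, as the parameter order above suggests.

```python
def validate_asr_config(cfg: dict) -> list:
    """Return a list of problems; an empty list means all checks passed.

    Hypothetical helper: the limits are copied from the parameter
    descriptions, but the service may apply additional rules.
    """
    errors = []

    silence = cfg.get("AsrMaxSilence")
    if silence is not None and not 200 <= silence <= 1200:
        errors.append("AsrMaxSilence must be 200 to 1200 (ms)")

    hot_words = cfg.get("AsrHotWords", [])
    if len(hot_words) > 128:
        errors.append("AsrHotWords accepts at most 128 hotwords")
    for word in hot_words:
        if not 1 <= len(word) <= 10:
            errors.append(f"hotword {word!r} must be 1 to 10 characters")

    vad_level = cfg.get("VadLevel")
    if vad_level is not None and not 0 <= vad_level <= 11:
        errors.append("VadLevel must be 0 to 11")

    # 0 disables the feature, empty means "not in effect".
    vad_duration = cfg.get("VadDuration")
    if vad_duration not in (None, 0) and not 200 <= vad_duration <= 2000:
        errors.append("VadDuration must be 0 or 200 to 2000 (ms)")

    return errors


example = {
    "AsrMaxSilence": 400,
    "AsrLanguageId": "zh_mandarin",
    "AsrHotWords": ["检查"],
    "VadLevel": 11,
    "VadDuration": 300,
}
print(validate_asr_config(example))
```

Returning a list of messages instead of raising on the first problem lets a caller report every invalid field at once.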

LlmConfig

object

The configurations of the large language model (LLM).

LlmHistoryLimit

integer

The maximum number of conversational turns to retain in the history of the LLM or multimodal large language model (MLLM). Default value: 10.

10

LlmHistory

array

The conversation history of the LLM or MLLM.

object

A single conversational turn.

Role

string

The role of the participant in the conversation. Valid values:

  • user: the user

  • assistant: the AI assistant

  • system: the system

  • function: a function

  • plugin: a plug-in

  • tool: a tool

user

Content

string

The text content of the turn, which records what the role said or responded in the conversation.

你好

LlmSystemPrompt

string

The system prompt for the LLM after the call is initiated.

你是一位友好且乐于助人的助手,专注于为用户提供准确的信息和建议。

BailianAppParams

string

The parameters for Alibaba Cloud Model Studio. For more information about the parameter format, see Alibaba Cloud Model Studio parameters.

"{\"biz_params\":{\"user_defined_params\":{\"your_plugin_id\":{\"article_index\":2}}},\"memory_id\":\"your_memory_id\",\"image_list\":[\"https://your_image_url\"],\"rag_options\":{\"pipeline_ids\":[\"your_id\"],\"file_ids\":[\"文档ID1\",\"文档ID2\"],\"metadata_filter\":{\"name\":\"张三\"},\"structured_filter\":{\"key1\":\"value1\",\"key2\":\"value2\"},\"tags\":[\"标签1\",\"标签2\"]}}"

OpenAIExtraQuery

string

The additional query parameters for an LLM that is compatible with the OpenAI protocol. The parameters must be in the key=value format. If you specify multiple parameters, separate them with ampersands (&). All values must be of the string type.

api-version=2024-02-01&api-key=sk-xxx
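
BailianAppParams and OpenAIExtraQuery are both structured data that is passed as a single string. The following sketch shows one way to build each string. The helper names are hypothetical. The sketch assumes that BailianAppParams accepts a JSON-serialized object, as the example value suggests, and it does not percent-encode the query values, because the description does not state whether encoding is expected.

```python
import json


def build_bailian_app_params(params: dict) -> str:
    """Serialize the Model Studio parameters into a JSON string.

    ensure_ascii=False keeps non-ASCII text (for example, Chinese
    document IDs or tags) readable instead of escaping it.
    """
    return json.dumps(params, ensure_ascii=False)


def build_openai_extra_query(params: dict) -> str:
    """Join key=value pairs with ampersands; every value must be a string."""
    for key, value in params.items():
        if not isinstance(value, str):
            raise TypeError(f"value for {key!r} must be a string")
    return "&".join(f"{key}={value}" for key, value in params.items())


bailian = build_bailian_app_params(
    {"memory_id": "your_memory_id", "rag_options": {"pipeline_ids": ["your_id"]}}
)
extra_query = build_openai_extra_query(
    {"api-version": "2024-02-01", "api-key": "sk-xxx"}
)
print(bailian)
print(extra_query)
```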

LlmCompleteReply

boolean

Specifies whether to send the complete LLM-generated result to the client after the generation is complete.

true

FunctionMap

array

The list of function mappings, which is used to map AI agent capabilities to LLM functions. This feature is supported only when function calls are used in custom LLMs that are compatible with the OpenAI protocol.

object

A single mapping rule.

Function

string

The name of the built-in function provided by the AI agent in Alibaba Cloud. The value hangup is supported.

hangup

MatchFunction

string

The name of the LLM function that corresponds to this function. This parameter is customized and used to call the corresponding function in the LLM. For more information about the protocol for custom LLMs, see Standard LLM API.

hangup
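
The direction of the mapping can be illustrated with a small lookup. This is a sketch of the idea only, because the AI agent performs the mapping on the server side. The function resolve_builtin and the LLM function name end_call are hypothetical.

```python
from typing import Optional

# One rule: when the custom LLM calls its own function "end_call",
# the AI agent runs its built-in "hangup" function.
function_map = [
    {"Function": "hangup", "MatchFunction": "end_call"},
]


def resolve_builtin(llm_function_name: str, rules: list) -> Optional[str]:
    """Return the built-in function mapped to an LLM function name, if any."""
    for rule in rules:
        if rule["MatchFunction"] == llm_function_name:
            return rule["Function"]
    return None


print(resolve_builtin("end_call", function_map))
print(resolve_builtin("get_weather", function_map))
```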

OutputMinLength

integer

The minimum length of the text output. Unit: characters. Text shorter than this length is cached until enough subsequent text arrives to be concatenated with it. Valid values: 0 to 100. If this parameter is 0 or empty, it is not in effect. Default value: empty.

5

OutputMaxDelay

string

The maximum delay for text output. If cached text has waited longer than this threshold, it is forcibly output regardless of its length. Valid values: 1000 to 10000. Unit: ms. If this parameter is 0 or empty, it is not in effect. Default value: empty.

2000
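
OutputMinLength and OutputMaxDelay work together: short text is held back until it is long enough or has waited too long. The following simulation is one plausible reading of that behavior, written to make the interaction concrete. The class OutputBuffer is hypothetical and is not how the service is known to be implemented.

```python
from typing import Optional


class OutputBuffer:
    """Client-side simulation of the min-length / max-delay buffering."""

    def __init__(self, min_length: int, max_delay_ms: int):
        self.min_length = min_length
        self.max_delay_ms = max_delay_ms
        self._cached = ""
        self._cached_since_ms = 0

    def push(self, text: str, now_ms: int) -> Optional[str]:
        """Add text; return the flushed output, or None if it stays cached."""
        if not self._cached:
            # Start the delay clock when the cache goes from empty to non-empty.
            self._cached_since_ms = now_ms
        self._cached += text

        long_enough = len(self._cached) >= self.min_length
        # A max delay of 0 means the delay rule is not in effect.
        waited_too_long = (
            self.max_delay_ms > 0
            and now_ms - self._cached_since_ms >= self.max_delay_ms
        )
        if long_enough or waited_too_long:
            flushed, self._cached = self._cached, ""
            return flushed
        return None


buf = OutputBuffer(min_length=5, max_delay_ms=2000)
first = buf.push("Hi", now_ms=0)         # too short, cached
second = buf.push(" there", now_ms=100)  # long enough, flushed together
third = buf.push("Ok", now_ms=200)       # too short, cached
fourth = buf.push("", now_ms=2300)       # waited over 2000 ms, forced out
```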

HistorySyncWithTTS

boolean

Specifies whether to keep the saved LLM message history consistent with the TTS playback content. Default value: false.

false

TtsConfig

object

The text-to-speech (TTS) configurations.

VoiceId

string

The voice ID. The change takes effect on the next sentence. If you do not specify this parameter, the voice ID configured in the AI agent template is used. This parameter is valid only for preset TTS voices. The value can be up to 64 characters in length. For more information about the valid values, see Intelligent speech effect samples.

longcheng_v2

VoiceIdList

array

The list of available voices.

string

The voice ID.

zhixiaoxia

PronunciationRules

array

The TTS pronunciation rules. You can specify a maximum of 20 rules in the array. The rules are executed in sequence.

object

The TTS pronunciation rule.

Word

string

The word to be replaced. The word must be a Chinese character string of up to 10 characters in length and cannot contain spaces.

大栅栏

Pronunciation

string

The target pronunciation. The pronunciation must be a Chinese character string of up to 10 characters in length and cannot contain spaces.

大石烂儿

Type

string

The type of the pronunciation rule. Valid value:

  • replacement: replaces the word with the specified pronunciation.

replacement
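
Because the rules are executed in sequence, an earlier rule can change the text that a later rule sees. The following sketch shows that ordering with plain substring replacement. The function apply_pronunciation_rules is hypothetical, and treating a rule as a simple substring replacement is an assumption about how the service applies it.

```python
def apply_pronunciation_rules(text: str, rules: list) -> str:
    """Apply replacement rules to text in the order they are listed."""
    if len(rules) > 20:
        raise ValueError("at most 20 pronunciation rules are allowed")
    for rule in rules:
        if rule["Type"] == "replacement":
            text = text.replace(rule["Word"], rule["Pronunciation"])
    return text


rules = [
    {"Word": "大栅栏", "Pronunciation": "大石烂儿", "Type": "replacement"},
]
spoken = apply_pronunciation_rules("我在大栅栏", rules)
print(spoken)
```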

ModelId

string

The TTS model ID. This parameter is supported only for MiniMax. Valid values: speech-01-turbo and speech-02-turbo.

speech-01-turbo

LanguageId

string

This parameter is supported only for MiniMax. Default value: empty. It improves speech performance for the specified minority language or dialect. If the minority language type is unknown, set this parameter to auto so that the model determines the language automatically. Valid values:

  • Chinese: Chinese

  • Chinese,Yue: Cantonese

  • English: English

  • Arabic: Arabic

  • Russian: Russian

  • Spanish: Spanish

  • French: French

  • Portuguese: Portuguese

  • German: German

  • Turkish: Turkish

  • Dutch: Dutch

  • Ukrainian: Ukrainian

  • Vietnamese: Vietnamese

  • Indonesian: Indonesian

  • Japanese: Japanese

  • Italian: Italian

  • Korean: Korean

  • Thai: Thai

  • Polish: Polish

  • Romanian: Romanian

  • Greek: Greek

  • Czech: Czech

  • Finnish: Finnish

  • Hindi: Hindi

  • auto: Automatic detection

Chinese

Emotion

string

The emotion of the speech. This parameter is supported only for MiniMax. Valid values:

  • happy

  • sad

  • angry

  • fearful

  • disgusted

  • surprised

  • calm

happy

SpeechRate

number

The speech rate. This parameter is supported on all platforms. For both CosyVoice and MiniMax, the default value is 1.0 and the valid values are 0.5 to 2.0.

1.0

InterruptConfig

object

The speech interruption policy configurations.

EnableVoiceInterrupt

boolean

Specifies whether to support speech interruption. Default value: true.

true

InterruptWords

array

The specific words or phrases that trigger a conversation interruption.

string

A specific word or phrase that triggers a conversation interruption.

打断一下

Eagerness

string

NoInterruptMode

string

The ASR processing policy in NoInterruptMode. Valid values:

  • cache: caches the ASR text. The cached ASR text is processed in the next conversational turn.

  • discard: discards the ASR text.

Default value: cache.

cache

KeepInterruptWordsForLLM

boolean

true

TurnDetectionConfig

object

The configurations for conversational turn detection.

TurnEndWords

array

The list of keywords that are used to determine the end of a user's conversational turn.

string

A keyword that is used to determine the end of a user's conversational turn.

我说完了

Mode

string

The mode for conversational turn detection. Valid values:

  • Normal: a basic mode that does not use AI for semantic analysis.

  • Semantic: an AI-powered mode that determines whether the user has finished speaking based on the conversational context.

Default value: Normal.

Semantic

SemanticWaitDuration

integer

The pause duration in Semantic mode that is used to determine whether a conversational turn has ended. Unit: ms. Default value: -1.

  • -1: The AI automatically determines an appropriate wait time.

  • 0 to 10000: A custom wait time. We recommend that you set this parameter to a value from 0 to 1500.

Note: This parameter has no effect in Normal mode.

-1

Eagerness

string

Low
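
The dependency between Mode and SemanticWaitDuration can be enforced when the configuration is built. The following is a minimal sketch. The function make_turn_detection_config is hypothetical, and omitting SemanticWaitDuration in Normal mode is a design choice of the sketch, based on the note that the parameter has no effect in that mode.

```python
def make_turn_detection_config(mode="Normal", semantic_wait_ms=-1, turn_end_words=None):
    """Build a TurnDetectionConfig dict using the documented value ranges."""
    if mode not in ("Normal", "Semantic"):
        raise ValueError("Mode must be Normal or Semantic")

    config = {"Mode": mode}
    if turn_end_words:
        config["TurnEndWords"] = list(turn_end_words)

    if mode == "Semantic":
        # -1 lets the AI choose the wait time; otherwise 0 to 10000 ms.
        if semantic_wait_ms != -1 and not 0 <= semantic_wait_ms <= 10000:
            raise ValueError("SemanticWaitDuration must be -1 or 0 to 10000 (ms)")
        config["SemanticWaitDuration"] = semantic_wait_ms

    return config


semantic = make_turn_detection_config("Semantic", 800, ["我说完了"])
normal = make_turn_detection_config()
print(semantic)
print(normal)
```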

GreetingDelay

integer

The delay before the welcome message is played. Unit: ms. Default value: 0. Valid values: 0 to 5000.

0

AmbientSoundConfig

object

The configurations for ambient sound.

ResourceId

string

The ID of the ambient sound. You can obtain the ID from the advanced configurations of the AI agent on the console.

f67901c595834************

Volume

integer

The volume of the ambient sound. Valid values: 0 to 100. A value of 0 disables the sound.

50

ExperimentalConfig

string

The parameters for experimental features. If you have any requirements, contact technical support.

""

AutoSpeechConfig

object

The configurations for the automatic speech module of the AI agent, which includes prompts during LLM delays and inquiries during prolonged user silence.

UserIdle

object

The configurations for inquiry broadcasts during prolonged user silence.

WaitTime

integer

The silence duration threshold. This parameter is required. An inquiry is triggered if this threshold is exceeded. Unit: ms. Valid values: 5000 to 600000.

5000

MaxRepeats

integer

The maximum number of inquiries. This parameter is required. Valid values: 0 to 10. After the maximum number of inquiries is reached, no more inquiries are triggered, and the call is disconnected.

5

Messages

array

The collection of inquiry prompts. You can specify a maximum of 10 prompts. Each prompt can be up to 100 characters in length. The sum of the probabilities of all prompts must be 1 (100%).

object

A single inquiry prompt.

Text

string

The text of the inquiry prompt. The text can be up to 100 characters in length.

您还在吗?

Probability

number

The selection probability of the prompt. Valid values: 0 to 1, which corresponds to 0% to 100%.

0.5
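
The constraints on Messages (at most 10 prompts, at most 100 characters each, probabilities that sum to 100%) can be checked before submission, and the weighted selection can be simulated locally. The following is a sketch under those documented constraints. The helper names and the second prompt text are hypothetical, and the actual selection algorithm used by the service is not documented here.

```python
import math
import random


def validate_messages(messages: list) -> None:
    """Raise ValueError if the prompt list breaks a documented limit."""
    if len(messages) > 10:
        raise ValueError("at most 10 prompts are allowed")
    for message in messages:
        if len(message["Text"]) > 100:
            raise ValueError("each prompt can be up to 100 characters")
        if not 0 <= message["Probability"] <= 1:
            raise ValueError("Probability must be 0 to 1")
    total = sum(message["Probability"] for message in messages)
    # Compare with a tolerance because 0.1 + 0.2 style sums are inexact.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")


def pick_message(messages: list, rng: random.Random) -> str:
    """Simulate a weighted random choice of one prompt text."""
    texts = [message["Text"] for message in messages]
    weights = [message["Probability"] for message in messages]
    return rng.choices(texts, weights=weights, k=1)[0]


messages = [
    {"Text": "您还在吗?", "Probability": 0.5},
    {"Text": "Are you still there?", "Probability": 0.5},
]
validate_messages(messages)
chosen = pick_message(messages, random.Random(0))
```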

HangupEndWord

string

LlmPending

object

The configurations for broadcasts during LLM response delays.

WaitTime

integer

The wait time threshold for LLM responses. This parameter is required. A broadcast prompt is triggered if this threshold is exceeded. Unit: ms. Valid values: 500 to 10000. You need to configure this parameter based on the actual usage of the LLM.

3000

Mode

string

Messages

array

The collection of prompts to broadcast while waiting for the LLM response. You can specify a maximum of 10 prompts. Each prompt can be up to 100 characters in length. The sum of the probabilities of all prompts must be 1 (100%).

object

A single wait prompt.

Text

string

The text of the prompt. The text can be up to 100 characters in length.

稍等一下

Probability

number

The selection probability of the prompt. Valid values: 0 to 1, which corresponds to 0% to 100%.

0.5

MaxIdleTime

integer

The maximum wait time for interaction with the AI agent. If the wait time is exceeded, the AI agent goes offline. Unit: seconds. Default value: 600.

600

BackChannelingConfig

object

Important: This parameter is deprecated. Use BackChannelingConfigs instead.

Enabled

boolean

TriggerStage

string

Probability

number

Words

object

Text

string

Probability

number

BackChannelingConfigs

array

The configurations for the back-channeling feature module. If you enable this feature, the system randomly plays short and affirmative phrases at specific trigger points.

object

A single back-channeling configuration.

Enabled

boolean

Specifies whether to enable the back-channeling feature. This parameter is required. Valid values: true and false.

true

TriggerStage

string

The trigger point for back-channeling. Valid value:

  • pause_detected: triggered when a short pause in speech is detected

pause_detected

Probability

number

The trigger probability. This parameter is required. Valid values: 0.0 to 1.0.

0.5

Words

array

The collection of back-channeling phrases. You can specify a maximum of 10 phrases. Each phrase can be up to 20 characters in length. The sum of the probabilities of all phrases must be 1.0.

object

The configuration of a single back-channeling phrase.

Text

string

The text of the phrase. This parameter is required. The text can be up to 20 characters in length and supports multiple languages.

嗯嗯

Probability

number

The selection probability of this phrase. This parameter is required. Valid values: 0.0 to 1.0.

0.3
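
A back-channeling configuration carries two kinds of probability: the trigger probability of the configuration and the selection probability of each phrase. The following simulation treats them as two independent stages, which is an assumption about the service rather than documented behavior. The function pick_back_channel and the phrase 好的 are hypothetical.

```python
import random
from typing import Optional


def pick_back_channel(config: dict, rng: random.Random) -> Optional[str]:
    """Return a phrase to play at a trigger point, or None to stay silent."""
    if not config.get("Enabled"):
        return None
    # Stage 1: decide whether to respond at all. rng.random() is in [0, 1),
    # so a Probability of 0.0 never triggers and 1.0 always triggers.
    if rng.random() >= config["Probability"]:
        return None
    # Stage 2: choose one phrase, weighted by its own Probability.
    words = config["Words"]
    texts = [word["Text"] for word in words]
    weights = [word["Probability"] for word in words]
    return rng.choices(texts, weights=weights, k=1)[0]


config = {
    "Enabled": True,
    "TriggerStage": "pause_detected",
    "Probability": 1.0,
    "Words": [
        {"Text": "嗯嗯", "Probability": 0.3},
        {"Text": "好的", "Probability": 0.7},
    ],
}
phrase = pick_back_channel(config, random.Random(42))
```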