Filter vocalized content

You can skip specific content in an agent's vocalized responses to deliver a more natural conversational experience.

Overview

An agent's response may contain auxiliary text not intended to be spoken aloud, such as action cues ('adjusts tie'), status markers ('thinking...'), or session terminators ('end of reply'). The text-to-speech (TTS) node provides content filtering to exclude such text from vocalization.

Configure TTS filtering

TTS filtering removes text enclosed in specific bracket types. The following five bracket types are supported:

Full-width parentheses: （）
Half-width parentheses: ()
Full-width square brackets: 【】
Half-width square brackets: []
Curly braces: {}

An agent's response may contain one or more of these bracket types. The TTS node filters only the bracket types that you select. Filtered text still appears in the displayed response but is not vocalized.

Note

If brackets are nested, whether of the same type or of different types, the filter matches greedily: it removes all text within the outermost selected pair of brackets, including any inner brackets.
If the brackets in the response text are mismatched or incomplete, filtering is not applied.
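
The rules above can be illustrated with a minimal Python sketch. This is not the service's implementation: the function name `filter_vocalized`, the `enabled` parameter, and the whitespace cleanup are hypothetical. It also assumes that a mismatched or incomplete bracket leaves the whole response unfiltered, which is one possible reading of the note.

```python
# Hypothetical sketch: these names are illustrative, not part of the IMS API.
BRACKET_PAIRS = {"（": "）", "(": ")", "【": "】", "[": "]", "{": "}"}


def filter_vocalized(text, enabled):
    """Return `text` without the spans enclosed in the enabled bracket types.

    `enabled` is a set of opening brackets, for example {"【", "{"}.
    """
    closers = {BRACKET_PAIRS[opener]: opener for opener in enabled}
    kept = []
    stack = []  # enabled brackets that are currently open
    for ch in text:
        if ch in enabled:
            stack.append(ch)
        elif ch in closers:
            # A closer without its matching opener: assume no filtering at all.
            if not stack or stack[-1] != closers[ch]:
                return text
            stack.pop()
        elif not stack:
            # Outside every enabled bracket: keep the character.
            kept.append(ch)
    if stack:
        # An opener was never closed: assume no filtering at all.
        return text
    # Assumption: collapse the whitespace left behind by removed spans.
    return " ".join("".join(kept).split())


selected = {"【", "{"}
print(filter_vocalized("{{emotion=neutral}} Yes, it's sunny.", selected))
# Yes, it's sunny.
print(filter_vocalized("【smiling】 The weather is good today （waves hand）.", selected))
# The weather is good today （waves hand）.
print(filter_vocalized("Sure [checks (quickly) notes] here it is.", {"[", "("}))
# Sure here it is.
print(filter_vocalized("Sure (checks notes here it is.", {"("}))
# Sure (checks notes here it is.
```

Tracking open brackets on a stack means everything inside the outermost selected pair is dropped, which mirrors the greedy behavior for nested brackets. Bracket types that are not selected are treated as ordinary text.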

Procedure

Go to the Intelligent Media Services (IMS) console and click the desired workflow.
On the workflow details page, click Edit in the upper-right corner.
Select the text-to-speech node and configure the Filter option.

In the Filter settings, select the bracket types to filter, and then click Save.
Click Save to apply the changes to the workflow.

Examples

The following examples show the result when only 【】 and {} are selected for filtering. Text in （） is still vocalized because that bracket type is not selected.

- User: The weather is nice today.
- Displayed text: {{emotion=neutral}} Yes, it's sunny.
- Vocalized content: Yes, it's sunny.

- User: How's the weather today?
- Displayed text: 【smiling】 The weather is good today （waves hand）.
- Vocalized content: The weather is good today （waves hand）.