Client events are JSON messages that the client sends over a WebSocket connection to control a Qwen-TTS Realtime API session: they configure voice settings, stream text for synthesis, and signal completion.
For the full API overview, see Real-time speech synthesis - Qwen.
Event summary
| Client event | Server response | Description |
|---|---|---|
| session.update | session.updated | Set voice, audio format, interaction mode, and other session parameters |
| input_text_buffer.append | -- | Append text to the synthesis buffer |
| input_text_buffer.commit | input_text_buffer.committed | Commit buffered text to trigger synthesis |
| input_text_buffer.clear | input_text_buffer.cleared | Discard all buffered text |
| session.finish | -- | End the session; the server flushes remaining audio and closes the connection |
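A client that waits for confirmations needs to know which events produce one. The following is a minimal Python sketch of the summary table as a lookup; the names `EXPECTED_RESPONSE` and `expected_response` are hypothetical and not part of the API, and `None` stands for the rows where the table shows "--".

```python
from typing import Dict, Optional

# Client event type -> confirming server event type, copied from the summary table.
# None marks the rows where the table lists no direct server response ("--").
EXPECTED_RESPONSE: Dict[str, Optional[str]] = {
    "session.update": "session.updated",
    "input_text_buffer.append": None,
    "input_text_buffer.commit": "input_text_buffer.committed",
    "input_text_buffer.clear": "input_text_buffer.cleared",
    "session.finish": None,
}


def expected_response(event_type: str) -> Optional[str]:
    """Return the confirming server event for a client event, or None if the table lists none."""
    if event_type not in EXPECTED_RESPONSE:
        raise ValueError(f"unknown client event type: {event_type!r}")
    return EXPECTED_RESPONSE[event_type]
```

A receive loop could use this to decide whether to block on a confirmation before sending the next event.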
session.update
Configures the session. Send it as the first message after the WebSocket connection is established. If omitted, all parameters use their defaults. The server confirms with a session.updated event.
Request body
{
  "event_id": "event_123",
  "type": "session.update",
  "session": {
    "voice": "Cherry",
    "mode": "server_commit",
    "language_type": "Chinese",
    "response_format": "pcm",
    "sample_rate": 24000,
    "instructions": "",
    "optimize_instructions": false
  }
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| event_id | string | Yes | Unique event identifier generated by the client (UUID recommended). Must be unique within the WebSocket session. |
| type | string | Yes | Set to session.update. |
| session | object | No | Session configuration. See the following subsections. |
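As an illustration of the parameters above, the sketch below serializes a session.update event in Python using only the standard library. The helper name `make_session_update` is hypothetical, and the `event_` prefix on the identifier simply mirrors the sample bodies; the table only asks for a unique string.

```python
import json
import uuid


def make_session_update(**session_params) -> str:
    """Serialize a session.update event with a fresh UUID-based event_id."""
    event = {
        # Assumption: the "event_" prefix is a convention taken from the samples, not a requirement.
        "event_id": f"event_{uuid.uuid4().hex}",
        "type": "session.update",
    }
    # "session" is marked optional in the table, so it is left out when no parameters are given.
    if session_params:
        event["session"] = session_params
    return json.dumps(event)


message = make_session_update(
    voice="Cherry",
    mode="server_commit",
    response_format="pcm",
    sample_rate=24000,
)
```

The resulting string would be sent as a text frame over the already-open WebSocket connection.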
input_text_buffer.append
Appends text to the synthesis buffer.
- In server_commit mode, text is appended to the server-side buffer.
- In commit mode, text is appended to the client-side buffer.
Request body
{
  "event_id": "event_B4o9RHSTWobB5OQdEHLTo",
  "type": "input_text_buffer.append",
  "text": "Hello, I am Qwen."
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| event_id | string | Yes | Unique event identifier generated by the client (UUID recommended). Must be unique within the WebSocket session. |
| type | string | Yes | Set to input_text_buffer.append. |
| text | string | Yes | The text to synthesize. |
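Text often arrives in pieces, for example from a streaming text source, and each piece can be forwarded as its own append event. The following Python sketch shows one way to do that; `append_events` is a hypothetical helper, and skipping empty chunks is an assumption, since this reference does not say how the server treats an empty text value.

```python
import json
import uuid
from typing import Iterable, Iterator


def append_events(chunks: Iterable[str]) -> Iterator[str]:
    """Yield one serialized input_text_buffer.append event per non-empty text chunk."""
    for chunk in chunks:
        if not chunk:
            # Assumption: empty chunks carry nothing to synthesize, so they are not sent.
            continue
        yield json.dumps(
            {
                "event_id": f"event_{uuid.uuid4().hex}",
                "type": "input_text_buffer.append",
                "text": chunk,
            },
            ensure_ascii=False,  # keep non-ASCII text readable in the payload
        )


messages = list(append_events(["Hello, ", "", "I am Qwen."]))
```

Each yielded string gets its own event_id, which keeps identifiers unique within the session as the parameter table requires.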
input_text_buffer.commit
Commits buffered text and creates a user message item. The server responds with an input_text_buffer.committed event.
Returns an error if the buffer is empty.
Behavior differs by mode:
- server_commit mode: All buffered text is synthesized immediately. The server stops caching and processes everything at once.
- commit mode: Creates a user message item from the buffered text.
Note: Committing the buffer triggers synthesis only; it does not generate a model response.
Request body
{
  "event_id": "event_B4o9RHSTWobB5OQdEHLTo",
  "type": "input_text_buffer.commit"
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| event_id | string | Yes | Unique event identifier generated by the client (UUID recommended). Must be unique within the WebSocket session. |
| type | string | Yes | Set to input_text_buffer.commit. |
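Because committing an empty buffer returns an error, a client may want to catch that case before sending. The sketch below is one possible client-side guard in Python; the class `PendingTextTracker` is hypothetical, and it only mirrors what the client itself has appended. In server_commit mode the server may already have consumed buffered text on its own, so the tracker is at best an approximation there.

```python
import json
import uuid


class PendingTextTracker:
    """Tracks, on the client side, whether text has been appended since the last commit or clear."""

    def __init__(self) -> None:
        self._pending_chars = 0

    def _event(self, event_type: str, **fields) -> str:
        payload = {"event_id": f"event_{uuid.uuid4().hex}", "type": event_type, **fields}
        return json.dumps(payload, ensure_ascii=False)

    def append(self, text: str) -> str:
        self._pending_chars += len(text)
        return self._event("input_text_buffer.append", text=text)

    def commit(self) -> str:
        # Refuse locally instead of letting the server reject an empty commit.
        if self._pending_chars == 0:
            raise RuntimeError("no text appended since the last commit or clear")
        self._pending_chars = 0
        return self._event("input_text_buffer.commit")

    def clear(self) -> str:
        self._pending_chars = 0
        return self._event("input_text_buffer.clear")
```

The guard does not replace handling the server's error event; it only avoids a round trip in the common case.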
input_text_buffer.clear
Clears all text from the buffer. The server responds with an input_text_buffer.cleared event.
Request body
{
  "event_id": "event_2728",
  "type": "input_text_buffer.clear"
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| event_id | string | Yes | Unique event identifier generated by the client (UUID recommended). Must be unique within the WebSocket session. |
| type | string | Yes | Set to input_text_buffer.clear. |
session.finish
Signals that no more text will be sent. The server returns the remaining audio and closes the connection.
Request body
{
  "event_id": "event_2239",
  "type": "session.finish"
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| event_id | string | Yes | Unique event identifier generated by the client (UUID recommended). Must be unique within the WebSocket session. |
| type | string | Yes | Set to session.finish. |
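Putting the events together, a typical exchange starts with session.update, streams text with input_text_buffer.append, and ends with session.finish. The Python sketch below builds that ordering as plain dictionaries. `build_event_sequence` is a hypothetical helper, and it assumes that an explicit input_text_buffer.commit is only needed in commit mode, with the server deciding when to synthesize in server_commit mode; that assumption should be checked against the mode descriptions in the full API overview.

```python
import uuid
from typing import Dict, List


def _event(event_type: str, **fields) -> Dict[str, object]:
    """Build one client event with a fresh, session-unique event_id."""
    return {"event_id": f"event_{uuid.uuid4().hex}", "type": event_type, **fields}


def build_event_sequence(chunks: List[str], mode: str = "server_commit") -> List[Dict[str, object]]:
    """Return the client events for one session, in the order they would be sent."""
    events = [_event("session.update", session={"mode": mode})]
    events += [_event("input_text_buffer.append", text=chunk) for chunk in chunks]
    if mode == "commit":
        # Assumption: only commit mode requires the client to trigger synthesis explicitly.
        events.append(_event("input_text_buffer.commit"))
    # session.finish goes last: the server returns remaining audio and closes the connection.
    events.append(_event("session.finish"))
    return events


sequence = build_event_sequence(["Hello, ", "I am Qwen."], mode="commit")
```

After session.finish the server closes the connection, so a client should keep reading until the socket closes rather than disconnecting right after sending it.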