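Server-side event parameter reference - Alibaba Cloud Model Studio (Bailian) - Alibaba Cloud

This document describes the server-side events of the qwen3-livetranslate-flash-realtime API.

Related documentation: Real-time audio and video translation (Qwen).

error

Error information returned by the server.

event_id string

The unique identifier of this event.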

{
  "event_id": "event_RoUu4T8yExPMI37GKwaOC",
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "code": "invalid_value",
    "message": "Invalid modalities: ['audio']. Supported combinations are: ['text'] and ['audio', 'text'].",
    "param": "session.modalities"
  }
}

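type string

The event type. Always error.

error object

Details of the error.

Properties

type string

The error type.

code string

The error code.

message string

The error message.

param string

The parameter related to the error, for example session.modalities.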
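
The fields above are enough to build a compact log line. The following is a minimal sketch, assuming the event arrives as a JSON string; the function name describe_error is hypothetical, and treating param as possibly absent is an assumption, not something the reference states.

```python
import json


def describe_error(raw: str) -> str:
    """Turn a raw error event (JSON text) into a one-line log message."""
    event = json.loads(raw)
    if event.get("type") != "error":
        raise ValueError("not an error event")
    err = event.get("error", {})
    # Assumption: param may be missing when the error is not tied to one parameter.
    param = err.get("param")
    where = f" (param: {param})" if param else ""
    return f"[{err.get('type')}/{err.get('code')}] {err.get('message')}{where}"
```

Logging event_id alongside this message may also help when correlating a failure with server-side records.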

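session.created

The first event the server returns after the client connects. It contains the default configuration for this connection.

event_id string

The unique identifier of this event.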

{
    "event_id": "event_QxBGpjBDmDDQQWDtrqBKB",
    "type": "session.created",
    "session": {
        "id": "sess_OozZ1vtbPt2muDflHODIH",
        "object": "realtime.session",
        "model": "qwen3-livetranslate-flash-realtime",
        "modalities": [
            "text",
            "audio"
        ],
        "voice": "Cherry",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm24",
        "translation": {
           "language": "en"
        }
    }
}

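type string

The event type. Always session.created.

session object

The session configuration.

Properties

id string

The unique identifier of the session.

object string

Always realtime.session.

model string

The model in use.

modalities array

The output modalities of the model.

voice string

The voice used for the audio that the model generates.

input_audio_format string

The input audio format. Always pcm16.

output_audio_format string

The output audio format. Always pcm24.

translation object (optional)

The translation configuration.

Properties

language string (optional)

The target language for translation.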

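session.updated

Returned after the server successfully processes a session.update request from the client. If processing fails, the server returns an error event instead.

event_id string

The unique identifier of this event.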

{
    "event_id": "event_QxBGpjBDmDDQQWDtrqBKB",
    "type": "session.updated",
    "session": {
        "id": "sess_OozZ1vtbPt2muDflHODIH",
        "object": "realtime.session",
        "model": "qwen3-livetranslate-flash-realtime",
        "modalities": [
            "text",
            "audio"
        ],
        "voice": "Ethan",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm24",
        "translation": {
           "language": "en"
        }
    }
}

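type string

The event type. Always session.updated.

session object

The session configuration.

Properties

id string

The unique identifier of the session.

object string

Always realtime.session.

model string

The model in use.

modalities array

The output modalities of the model.

voice string

The voice used for the audio that the model generates.

input_audio_format string

The input audio format. Always pcm16.

output_audio_format string

The output audio format. Always pcm24.

translation object (optional)

The translation configuration.

Properties

language string (optional)

The target language for translation.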

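session.finished

The session-end event. It indicates that all audio translation in the current session has completed.

The server sends this event only after the client sends session.finish. After receiving it, the client can close the connection.

event_id string

The unique identifier of this event.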

{
    "event_id": "event_xxx",
    "type": "session.finished"
}

type string

The event type. Always session.finished.
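
A client can use session.finished as its signal to stop reading. The following is a minimal sketch, assuming the incoming messages are available as an iterable of JSON strings (for example, frames read from a WebSocket); the function name drain_until_finished is hypothetical and the transport itself is left out.

```python
import json
from typing import Iterable


def drain_until_finished(messages: Iterable[str]) -> list[dict]:
    """Collect server events until session.finished arrives, then stop reading."""
    seen = []
    for raw in messages:
        event = json.loads(raw)
        seen.append(event)
        if event.get("type") == "session.finished":
            # All translation for this session is done; the caller can now close.
            break
    return seen
```

The caller closes the connection after the function returns, which keeps the read loop independent of any particular WebSocket library.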

response.created

Returned when the server starts generating a new model response.

event_id string

The unique identifier of this event.

{
    "event_id": "event_L8hHVI5jYis6BzAjnPWJh",
    "type": "response.created",
    "response": {
        "id": "resp_P79OOMs8LnrXVpiIHUCKR",
        "object": "realtime.response",
        "conversation_id": "conv_UFClXtYkRkFXrs48y8pmK",
        "status": "in_progress",
        "modalities": [
            "text",
            "audio"
        ],
        "voice": "Cherry",
        "output_audio_format": "pcm24",
        "output": []
    }
}

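type string

The event type. Always response.created.

response object

The response object.

Properties

id string

The unique identifier of the response.

conversation_id string

The unique identifier of the current conversation.

object string

The object type. Always realtime.response in this event.

status string

The response status. Valid values:

completed
failed
in_progress
incomplete

modalities array

The modalities of the response.

voice string

The voice used for the audio that the model generates.

output_audio_format string

The output audio format. Always pcm24.

output array

Currently empty in this event.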

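response.done

Returned when the response has finished generating. The response object in this event contains all output items except the raw audio data.

event_id string

The unique identifier of this event.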

{
  "event_id": "event_CNea8oXNipVanSg2VIzkO",
  "type": "response.done",
  "response": {
    "id": "resp_TfhYTqej692vsGA2jNEtH",
    "object": "realtime.response",
    "conversation_id": "conv_ZtyLfKVm8XqLwYRlsuDih",
    "status": "completed",
    "modalities": [
      "text",
      "audio"
    ],
    "voice": "Cherry",
    "output_audio_format": "pcm24",
    "output": [
      {
        "id": "item_MKtkMwN9RtcyE9eJShyWy",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "assistant",
        "content": [
          {
            "type": "audio",
            "transcript": "Hello? "
          }
        ]
      }
    ],
    "usage": {
      "total_tokens": 56,
      "input_tokens": 47,
      "output_tokens": 9,
      "input_tokens_details": {
        "text_tokens": 20,
        "audio_tokens": 27
      },
      "output_tokens_details": {
        "text_tokens": 2,
        "audio_tokens": 7
      }
    }
  }
}

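type string

The event type. Always response.done.

response object

The response object.

Properties

id string

The unique identifier of the response.

conversation_id string

The unique identifier of the current conversation.

object string

The object type. Always realtime.response in this event.

status string

The status of the response.

modalities array

The modalities of the response.

voice string

The voice used for the audio that the model generates.

output_audio_format string

The output audio format. Always pcm24.

output array

The output items of the response.

Properties

id string

The unique identifier of the output item.

type string

The type of the output item. Currently always message.

object string

The object type of the output item. Currently always realtime.item.

status string

The status of the output item.

role string

The role of the output item.

content array

The content of the output item.

Properties

type string

The content type: text when the output is text only, audio when the output includes audio.

text string

The output text.

transcript string

The text transcript of the audio.

usage object

Token usage for this response.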
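
Because response.done carries both the final output items and the token usage, it is a convenient place to record a per-response summary. The following is a minimal sketch, assuming the event has already been parsed into a dict shaped like the sample above; the function name summarize_response_done and the keys of the returned dict are hypothetical.

```python
def summarize_response_done(event: dict) -> dict:
    """Extract the final text and token counts from a response.done event."""
    response = event["response"]
    texts = []
    for item in response.get("output", []):
        for part in item.get("content", []):
            # Audio parts carry "transcript"; text parts carry "text".
            texts.append(part.get("transcript") or part.get("text") or "")
    usage = response.get("usage", {})
    return {
        "status": response.get("status"),
        "text": "".join(texts),
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }
```

Using .get with defaults keeps the sketch tolerant of a response that failed before any output or usage was produced.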

response.text.text

Returned when the output modalities include only text and the model incrementally generates new text.

event_id string

The unique identifier of this event.

{
    "event_id": "event_B1lIeyOXR7qJMEExbqtTG",
    "type": "response.text.text",
    "response_id": "resp_B1lIdtjF4Noqpn5NOjznj",
    "item_id": "item_B1lIdJsAJlJiFs8ztWpJt",
    "output_index": 0,
    "content_index": 0,
    "text": "How are"
}

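type string

The event type. Always response.text.text.

text string

The incremental text returned.

response_id string

The unique identifier of the response.

item_id string

The ID of the message item. Events that share this ID belong to the same message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.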

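response.text.done

Returned when the output modalities include only text and the model has finished generating text.

The server also returns this event when the response is interrupted, incomplete, or canceled.

event_id string

The unique identifier of this event.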

{
    "event_id": "event_B1lIeE2Nac33zn5V7h2mm",
    "type": "response.text.done",
    "response_id": "resp_B1lIdtjF4Noqpn5NOjznj",
    "item_id": "item_B1lIdJsAJlJiFs8ztWpJt",
    "output_index": 0,
    "content_index": 0,
    "text": "How can I assist you today?"
}

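type string

The event type. Always response.text.done.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

text string

The complete text output by the model.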

response.audio.delta

Returned when the output modalities include audio and the model incrementally generates new audio data.

event_id string

The unique identifier of this event.

{
    "event_id": "event_B1osWMZBtrEQbiIwW0qHQ",
    "type": "response.audio.delta",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
    "output_index": 0,
    "content_index": 0,
    "delta": "UklGRnoGAABXQVZFZm10IBAAAAAB..."
}

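type string

The event type. Always response.audio.delta.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

delta string

The incremental audio data output by the model, Base64-encoded.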
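
Since delta is Base64-encoded, a client has to decode each chunk before playing or storing it. The following is a minimal sketch, assuming the event has been parsed into a dict and that the caller keeps one bytearray per response; the function name append_audio_delta is hypothetical, and how the decoded bytes are interpreted (sample rate, sample width) is outside this sketch.

```python
import base64


def append_audio_delta(buffer: bytearray, event: dict) -> int:
    """Decode one response.audio.delta chunk into buffer; return bytes added."""
    if event.get("type") != "response.audio.delta":
        return 0  # ignore unrelated events
    chunk = base64.b64decode(event["delta"])
    buffer.extend(chunk)
    return len(chunk)
```

Because response.audio.done does not return the complete audio, accumulating the chunks this way is the only place the client sees the audio bytes.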

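response.audio.done

Returned when the output modalities include audio and the model has finished generating audio.

The server also returns this event when the response is interrupted, incomplete, or canceled.

This event does not return the complete audio data.

event_id string

The unique identifier of this event.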

{
    "event_id": "event_B1osWMWoDRYyITDyNYcBu",
    "type": "response.audio.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
    "output_index": 0,
    "content_index": 0
}

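type string

The event type. Always response.audio.done.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.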

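conversation.item.input_audio_transcription.text

When the input_audio_transcription.model parameter is configured, the server streams the speech recognition result for the input audio (the original text in the source language).

event_id string

The unique identifier of this event.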

{
    "event_id": "event_xxx",
    "type": "conversation.item.input_audio_transcription.text",
    "item_id": "item_xxx",
    "content_index": 0,
    "text": "",
    "stash": "今天天气真好",
    "language": "zh"
}

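type string

The event type. Always conversation.item.input_audio_transcription.text.

item_id string

The unique identifier of the message item.

content_index integer

Currently always 0.

text string

The confirmed recognition text.

stash string

The recognition text that is not yet confirmed (later events may revise it).

language string

The detected source language.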

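conversation.item.input_audio_transcription.completed

When the input_audio_transcription.model parameter is configured, the server returns this event after speech recognition completes. It contains the final, complete recognition result.

event_id string

The unique identifier of this event.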

{
    "event_id": "event_xxx",
    "type": "conversation.item.input_audio_transcription.completed",
    "item_id": "item_xxx",
    "content_index": 0,
    "transcript": "今天天气真好，我们一起去公园散步吧。",
    "language": "zh"
}

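type string

The event type. Always conversation.item.input_audio_transcription.completed.

item_id string

The unique identifier of the message item.

content_index integer

Currently always 0.

transcript string

The complete speech recognition result (the original text in the source language).

language string

The detected source language.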
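
The two transcription events can be combined into a small per-item tracker: streaming events update a provisional view, and the completed event replaces it with the final text. The following is a minimal sketch, assuming events are already parsed into dicts; the class name SourceTranscript is hypothetical, and showing text followed by stash as the provisional view is an assumption borrowed from how response.audio_transcript.text describes those two fields.

```python
class SourceTranscript:
    """Track provisional and final source-language text per item_id."""

    PREFIX = "conversation.item.input_audio_transcription."

    def __init__(self) -> None:
        self.provisional: dict[str, str] = {}
        self.final: dict[str, str] = {}

    def handle(self, event: dict) -> None:
        kind = event.get("type", "")
        item_id = event.get("item_id", "")
        if kind == self.PREFIX + "text":
            # Assumption: confirmed text followed by the still-revisable stash.
            self.provisional[item_id] = event.get("text", "") + event.get("stash", "")
        elif kind == self.PREFIX + "completed":
            self.final[item_id] = event.get("transcript", "")
            self.provisional.pop(item_id, None)

    def current(self, item_id: str) -> str:
        """Final text if available, otherwise the latest provisional text."""
        return self.final.get(item_id, self.provisional.get(item_id, ""))
```

A user interface could render the provisional text in a lighter style and switch to the final text once the completed event arrives.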

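response.audio_transcript.text

When the output modalities include audio, the server may return this event so that the client can display the translation in real time.

event_id string

The unique identifier of this event.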

{
  "event_id": "event_xxx",
  "type": "response.audio_transcript.text",
  "response_id": "resp_xxx",
  "item_id": "item_xxx",
  "output_index": 0,
  "content_index": 0,
  "text": "Hello,",
  "stash": " who are you?"
}

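type string

The event type. Always response.audio_transcript.text.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

text string

The translated text fragment that has been confirmed.

stash string

Provisional text from the preliminary translation. Appending it to the current text gives the provisional translation result. The server keeps updating text and stash through response.audio_transcript.text events until it sends the response.audio_transcript.done event, whose transcript field contains the complete final translation.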
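
The relationship between text and stash can be expressed in a few lines. The following is a minimal sketch, assuming the event has been parsed into a dict; the function name live_translation is hypothetical.

```python
def live_translation(event: dict) -> tuple[str, str]:
    """Return (confirmed, provisional) text for one response.audio_transcript.text event."""
    confirmed = event.get("text", "")
    pending = event.get("stash", "")
    # The provisional result is the confirmed text followed by the stash.
    return confirmed, confirmed + pending
```

For the sample event above this yields "Hello," as the confirmed text and "Hello, who are you?" as the provisional result; once response.audio_transcript.done arrives, its transcript field should replace both.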

response.audio_transcript.done

Returned when the output modalities include audio and the model has finished generating text.

event_id string

The unique identifier of this event.

{
    "event_id": "event_VN4Q4GJugLcc1S23viW8E",
    "type": "response.audio_transcript.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_JvJauNH2CTXb1D9WV6pD4",
    "output_index": 0,
    "content_index": 0,
    "transcript": "How can I assist you today?"
}

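type string

The event type. Always response.audio_transcript.done.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

transcript string

The complete text.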

response.output_item.added

Returned when a new output item is created while the response is being generated.

event_id string

The unique identifier of this event.

{
    "event_id": "event_B4O5yPt3Gjnjy5eYH3plG",
    "type": "response.output_item.added",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "output_index": 0,
    "item": {
        "id": "item_OFaPGtzfWCPyGzxnuEX9i",
        "object": "realtime.item",
        "type": "message",
        "status": "in_progress",
        "role": "assistant",
        "content": []
    }
}

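type string

The event type. Always response.output_item.added.

response_id string

The unique identifier of the response.

output_index integer

Currently always 0.

item object

Information about the output item.

Properties

id string

The unique identifier of the output item.

type string

Always message.

object string

Always realtime.item.

status string

The status of the output item.

role string

The role of the message.

content array

The content of the message.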

response.output_item.done

Returned when a new output item has finished.

event_id string

The unique identifier of this event.

{
    "event_id": "event_XkiwbYTBC9Wcdwy6uYJ2G",
    "type": "response.output_item.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "output_index": 0,
    "item": {
        "id": "item_JvJauNH2CTXb1D9WV6pD4",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "assistant",
        "content": [
            {
                "type": "audio",
                "text": "你好，我是阿里云研发的大规模语言模型，我叫通义千问。有什么我可以帮助你的吗？"
            }
        ]
    }
}

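type string

The event type. Always response.output_item.done.

response_id string

The unique identifier of the response.

output_index integer

Currently always 0.

item object

Information about the output item.

Properties

id string

The unique identifier of the output item.

object string

Always realtime.item.

type string

Always message.

status string

The status of the output item.

role string

The role that sent the message.

content array

The content of the message.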

response.content_part.added

Returned when a new content part starts to be output.

event_id string

The unique identifier of this event.

{
    "event_id": "event_J2UixwYKZsXg7c9YXZetL",
    "type": "response.content_part.added",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_OFaPGtzfWCPyGzxnuEX9i",
    "output_index": 0,
    "content_index": 0,
    "part": {
        "type": "audio",
        "text": ""
    }
}

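type string

The event type. Always response.content_part.added.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

part object

Information about the content part.

Properties

type string

The type of the content part.

text string

The text of the content part.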

response.content_part.done

Returned when a new content part has finished.

event_id string

The unique identifier of this event.

{
    "event_id": "event_VN4Q4GJugLcc1S23viW8E",
    "type": "response.content_part.done",
    "response_id": "resp_P79OOMs8LnrXVpiIHUCKR",
    "item_id": "item_JvJauNH2CTXb1D9WV6pD4",
    "output_index": 0,
    "content_index": 0,
    "part": {
        "type": "audio",
        "text": "你好，我是阿里云研发的大规模语言模型，我叫通义千问。有什么我可以帮助你的吗？"
    }
}

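type string

The event type. Always response.content_part.done.

response_id string

The unique identifier of the response.

item_id string

The unique identifier of the message item.

output_index integer

Currently always 0.

content_index integer

Currently always 0.

part object

Information about the content part.

Properties

type string

The type of the content part.

text string

The text of the content part.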