Server-Side Callback Notifications


A webhook is an HTTP or HTTPS callback mechanism that enables a service to proactively push data. The RTC callback notification server uses webhooks to send event notifications to your server so you can process business logic as needed.

How to Use

Prerequisites

Workflow

  1. Enable event callbacks for an AppID in the console.

    Log on to the RTC console. In the left navigation pane, choose Configuration Management > Event Notifications, select the target AppID, and configure the events you need.

  2. Trigger a callback event.

    After you configure event notifications for an AppID, use the server-side API to start related tasks, such as recording or stream ingest, to trigger the corresponding callback events.

  3. Receive callback events.

    After a callback event occurs (for example, when a recording file is generated), the notification is sent to your callback-receiving server, where you can view it if the callback succeeds.

Callback Mechanism

  1. Deploy an HTTP service to receive callback messages. Then configure the callback URL for your specific business in the console.

  2. When an event occurs, the RTC callback notification server sends an HTTP POST request to that URL. The event notification content is included in the HTTP request body.

  3. Your HTTP service must return an HTTP status code of 200 to confirm a successful callback. Any other status code or timeout counts as a failed callback.

    After failure, the system retries every 10 seconds, up to three times.

  4. After a successful callback, your configured callback URL receives the corresponding event notification content.
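The steps above can be sketched as a small receiver built on Python's standard library. This is only an illustration under stated assumptions: the names `parse_notification`, `CallbackHandler`, and `make_server`, and the port 8080, are hypothetical and not part of the RTC service, and signature verification is left out here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_notification(raw_body: bytes) -> dict:
    """Decode the UTF-8 JSON request body and require a JSON object."""
    event = json.loads(raw_body.decode("utf-8"))
    if not isinstance(event, dict):
        raise ValueError("expected a JSON object")
    return event


class CallbackHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that acknowledges each valid callback with HTTP 200."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw_body = self.rfile.read(length)
        try:
            event = parse_notification(raw_body)
        except ValueError:
            # Covers bad UTF-8 and bad JSON. Any non-200 status counts as a
            # failed callback, so the notification will be retried.
            self.send_response(400)
            self.end_headers()
            return
        # Hand the event to your own business logic here (kept trivial in this sketch).
        print("received event", event.get("eventType"), event.get("eventId"))
        self.send_response(200)
        self.end_headers()


def make_server(port: int = 8080) -> HTTPServer:
    """Build (but do not start) the server; call serve_forever() on the result to run it."""
    return HTTPServer(("0.0.0.0", port), CallbackHandler)
```

Calling `make_server().serve_forever()` would start listening. Because a timeout also counts as a failure, it is probably safer to return 200 quickly and run slow processing outside the request handler.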

Callback Format

Callback messages are sent to your server as HTTP POST requests. The request body format is JSON. Character encoding is UTF-8.

The request header includes these fields:

| Field | Example Value | Description |
| --- | --- | --- |
| Content-Type | application/json | Fixed value. |
| trace-id | 2401058********622012463d9 | Used for troubleshooting. |
| DingRTC-Signature | z5jbvxxx.1718877424.xx3e7691142ffe4342e13e25dc317695b17827e34ec248a5cc35d3a7e1e1cd44 | Signature generated by the RTC callback service's signature algorithm. For details, see Verify the Signature. |

The request body includes these fields:

| Name | Type | Required | Example Value | Description |
| --- | --- | --- | --- | --- |
| eventId | string | Yes | 12343aed********* | Event ID. |
| eventType | string | Yes | 101 | Event type. See the callback message list below for supported values. |
| notifyTime | long | Yes | 1701056041128 | Notification timestamp, in milliseconds. |
| eventData | JSONObject | Yes | {"appId": "z7***u8v"} | Specific callback message content. This varies by event type. See the callback message list below. |

Important
  • The order in which your server receives notifications may not match the order in which events occur.

  • To ensure reliable delivery, each event may generate more than one notification. Your server must handle duplicate messages.
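One way to handle duplicates is to remember which events have already been processed. The sketch below assumes that repeated notifications for the same event carry the same eventId, which this document does not state explicitly; the class name `EventDeduplicator` is hypothetical, and the in-memory set would need to be replaced by shared storage if you run more than one server process.

```python
class EventDeduplicator:
    """Remembers processed eventId values (in memory only, for illustration)."""

    def __init__(self):
        self._seen = set()

    def should_process(self, event: dict) -> bool:
        """Return True the first time an eventId is seen, False for repeats."""
        event_id = event["eventId"]
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True


dedup = EventDeduplicator()
first = {"eventId": "12343aed", "eventType": "101", "notifyTime": 1701056041128}
print(dedup.should_process(first))        # True: first delivery
print(dedup.should_process(dict(first)))  # False: duplicate delivery
```

Because delivery order is not guaranteed, compare the timestamp inside eventData rather than arrival order when the sequence of events matters.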

Verify the Signature

The RTC callback server signs every request it sends so that your server can verify that the request is legitimate.

The DingRTC-Signature request header consists of three parts joined by periods (.), in the format AppId.TimeStamp.Signature:

  • AppId: Application ID

  • TimeStamp: UTC timestamp (in seconds)

  • Signature: Computed from the raw HTTP request body, the timestamp, and the callback secret. The full algorithm is:

Signature=hexString(HmacSHA256(plain request body + TimeStamp, callbackSecret))

Retrieve the callback notification secret from the console.

Use this sample code to verify the signature:

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;

public class SignUtil {

    public static final String HMAC_SHA_256 = "HmacSHA256";

    public static String hmacSha256(String message, String secret) {
        try {
            SecretKeySpec signingKey = new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), HMAC_SHA_256);
            Mac mac = Mac.getInstance(HMAC_SHA_256);
            mac.init(signingKey);
            byte[] rawHmac = mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
            return bytesToHex(rawHmac);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static String bytesToHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            String hex = Integer.toHexString(b & 0xFF);
            if (hex.length() < 2) {
                sb.append(0);
            }
            sb.append(hex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Raw request body exactly as received; do not re-serialize the JSON before signing.
        String requestBody = "{\"eventData\":{\"channelId\":\"55\",\"timestamp\":1718877424674},\"eventId\":\"2133cc0c17188774246986428d0cb0\",\"eventType\":\"101\",\"notifyTime\":1718877424701}";
        String secret = "your callback secret";
        String signatureHeader = "z5jbvxxx.1718877424.b1a2d36af0f43023009d9ff1fb33cfcb075acb94132898bee6a53925fdd0d877";
        // Header format: AppId.TimeStamp.Signature
        String[] parts = signatureHeader.split("\\.");
        String appId = parts[0];
        String timestamp = parts[1];
        String signature = parts[2];
        // Sign body + timestamp with the callback secret and compare with the received signature.
        if (signature.equals(hmacSha256(requestBody + timestamp, secret))) {
            System.out.println("DingRTC-Signature is valid");
        } else {
            System.out.println("DingRTC-Signature is invalid");
        }
    }
}

# -*- coding: utf-8 -*-
import hashlib
import hmac

request_body='{"eventData":{"channelId":"55","timestamp":1718877424674},"eventId":"2133cc0c17188774246986428d0cb0","eventType":"101","notifyTime":1718877424701}'
secret = 'your callback secret'
signature_header = 'z5jbvxxx.1718877424.b1a2d36af0f43023009d9ff1fb33cfcb075acb94132898bee6a53925fdd0d877'
# Header format: AppId.TimeStamp.Signature
app_id, timestamp, signature = signature_header.split('.')
# Sign the raw body followed by the timestamp.
sign_body = request_body + timestamp
expected = hmac.new(secret.encode('utf-8'), sign_body.encode('utf-8'), hashlib.sha256).hexdigest()
# compare_digest performs a constant-time comparison.
if hmac.compare_digest(signature, expected):
    print("DingRTC-Signature is valid")
else:
    print("DingRTC-Signature is invalid")

using System;
using System.Security.Cryptography;
using System.Text;

namespace Program
{
    public class Program
    {
        public static string hmacSha256(string message,string secret)
        {
            using (HMACSHA256 mac = new HMACSHA256(Encoding.UTF8.GetBytes(secret)))
            {
                byte[] signing = mac.ComputeHash(Encoding.UTF8.GetBytes(message));
                return bytesToHex(signing);
            }
        }

        private static string bytesToHex(byte[] bytes)
        {
            StringBuilder sb = new StringBuilder();
            foreach (byte b in bytes)
            {
                sb.Append(b.ToString("x2"));
            }
            return sb.ToString();
        }

        public static void Main()
        {
            string requestBody = "{\"eventData\":{\"channelId\":\"55\",\"timestamp\":1718877424674},\"eventId\":\"2133cc0c17188774246986428d0cb0\",\"eventType\":\"101\",\"notifyTime\":1718877424701}";
            string secret = "your callback secret";
            string signatureHeader = "z5jbvxxx.1718877424.b1a2d36af0f43023009d9ff1fb33cfcb075acb94132898bee6a53925fdd0d877";
            // Header format: AppId.TimeStamp.Signature
            string[] parts = signatureHeader.Split('.');
            string appId = parts[0];
            string timestamp = parts[1];
            string signature = parts[2];

            if (signature.Equals(hmacSha256(requestBody + timestamp, secret)))
            {
                Console.WriteLine("DingRTC-Signature is valid");
            }
            else
            {
                Console.WriteLine("DingRTC-Signature is invalid");
            }
        }
    }
}
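In a live handler the body arrives as raw bytes and the header may be malformed, so a reusable check can be handy. The following is a sketch under those assumptions, not an official implementation; the function name `verify_signature` is hypothetical, and it assumes the AppId itself contains no period.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Check a DingRTC-Signature header of the form AppId.TimeStamp.Signature."""
    parts = signature_header.split(".")
    if len(parts) != 3:
        return False  # malformed header
    _app_id, timestamp, signature = parts
    # Sign the raw body bytes followed by the timestamp, keyed with the callback secret.
    expected = hmac.new(
        secret.encode("utf-8"),
        raw_body + timestamp.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # Comparing as bytes with compare_digest avoids leaking how much of the value matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Pass the request body exactly as it was received. Parsing the JSON and serializing it again can change whitespace or key order, which would change the computed signature.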



Callback Message List

This document omits the eventId and notifyTime fields from the JSON examples.

Important

New fields may be added and field order may change. Parse messages by field name with a standard JSON parser for your programming language, and ignore fields you do not recognize.
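A tolerant way to consume these messages is to route on eventType and read fields by name. The sketch below is illustrative only: the handler functions and the `HANDLERS` table are hypothetical, and it covers just two of the event types listed in this section.

```python
import json


def on_channel_started(data: dict) -> str:
    return f"channel {data.get('channelId')} started"


def on_user_joined(data: dict) -> str:
    user = data.get("user", {})
    return f"user {user.get('userId')} joined {data.get('channelId')}"


# Hypothetical routing table: eventType string -> handler function.
HANDLERS = {
    "101": on_channel_started,
    "103": on_user_joined,
}


def dispatch(raw_body: str):
    """Route a callback by eventType; unknown types and extra fields are ignored."""
    message = json.loads(raw_body)
    handler = HANDLERS.get(message.get("eventType"))
    if handler is None:
        return None
    return handler(message.get("eventData", {}))


print(dispatch(
    '{"eventType": "103", "eventData": {"channelId": "room1", '
    '"user": {"userId": "123444"}, "newField": 1}}'
))  # user 123444 joined room1
```

Reading fields with `get` means an added field such as `newField` is simply ignored, and an event type you have not subscribed to falls through without an error.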

Verification Events

001 Callback Verification

This event triggers only when you set or manually verify a callback URL in the console.

{
  "eventType": "001",
  "eventData":{
    "appId": "12adxxxx2"
  }
}

Channel Events

101 Channel Started

{
  "eventType": "101",
  "eventData":{
    "channelId": "room**",      // Channel ID
    "timestamp": 1709696165584   // Occurrence time (ms)
  }
}

102 Channel Ended

{
  "eventType": "102",
  "eventData":{
    "channelId": "room**",      // Channel ID
    "timestamp": 1709696165584   // Occurrence time (ms)
  }
}

103 User Joined

{
  "eventType": "103",
  "eventData":{
    "channelId": "room**",      // Channel ID
    "user":{
      "userId":"123444" 
    },
    "timestamp": 1709696165584   // Occurrence time (ms)
  }
}

104 User Left

{
  "eventType": "104",
  "eventData":{
    "channelId": "room**",      // Channel ID
    "reasonCode": 20003001,     // Reason user left. See status code table.
    "user":{
      "userId":"123444" 
    },
    "timestamp": 1709696165584   // Occurrence time (ms)
  }
}

Stream Ingest Events

1000 Stream Ingest Started

{
    "eventType": "1000",
    "eventData": {
        "channelId": "room**",    // Channel ID
        "liveState":{
          "code": 20000000         // Status code. See status code table.
        },  
        "taskId": "task-03061",   // Task ID
        "timestamp": 1709737037688 // Occurrence time (ms)
    }
}

1001 Stream Ingest Completed

{
    "eventType": "1001",
    "eventData": {
        "channelId": "room**",    // Channel ID
        "liveState":{
          "code": 20000000         // Status code. See status code table.
        },
        "taskId": "task-03061",   // Task ID
        "timestamp": 1709737037688 // Occurrence time (ms)
    }
}

1002 Stream Ingest Failed

{
  "eventType": "1002",
  "eventData": {
    "channelId": "room**",      // Channel ID
    "liveState":{
      "code": 50001001           // Status code. See status code table.
    },  
    "taskId": "task-03061",     // Task ID
    "timestamp": 1709737037688   // Occurrence time (ms)
  }
}

Recording Events

2000 Recording Started

{
    "eventType": "2000",
    "eventData": {
        "channelId": "room**",
        "recordState": {
            "bucket":"rtc*******",              // Object Storage Service bucket where recordings are stored
            "vendor":1,                         // Object storage provider. See Start Recording API.
            "region":1,                         // Object storage region. See Start Recording API.
            "startTs":1709737037688,            // Start timestamp, in milliseconds
            "code": 20000000                    // Status code. See status code table.
        },
        "taskId": "task-0422",
        "timestamp": 1709737037688
    }
}

2001 Recording Succeeded

{
    "eventType": "2001",
    "eventData": {
        "channelId": "room**",
        "recordState": {
            "bucket":"rtc*******",              // Object Storage Service bucket where recordings are stored
            "vendor":1,                         // Object storage provider. See Start Recording API.
            "region":1,                         // Object storage region. See Start Recording API.
            "startTs":1709737037688,            // Start timestamp, in milliseconds
            "code": 20000000,                   // Status code. See status code table.
            "fileFailCount": 0,
            "fileInfo": [
                {
                    "fileDuration": 7859,        // File duration, in milliseconds
                    "fileSize": 216777,           // File size, in bytes
                    "filePath": "record/v980**/65e82ef000210**/1709737028486_1709737030532/1709737028486-1709737030532.mp4", // File path
                    "status": 0,                  // 0 = success. Other values = failure.
                    "timestamp": 1709737037679     // File generation timestamp (ms)
                }
            ],
            "fileCount": 1                       // Total number of files
        },
        "taskId": "task-03061",
        "timestamp": 1709737037688
    }
}
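As a rough illustration of consuming this event, the sketch below collects the paths of files whose status is 0. The function name `successful_files` and the sample values are hypothetical; only the field names come from the example above.

```python
def successful_files(event_data: dict) -> list:
    """Return filePath values for files whose status is 0 (success)."""
    record_state = event_data.get("recordState", {})
    return [
        info["filePath"]
        for info in record_state.get("fileInfo", [])
        if info.get("status") == 0 and "filePath" in info
    ]


sample = {
    "channelId": "room1",
    "recordState": {
        "fileCount": 2,
        "fileInfo": [
            {"filePath": "record/a.mp4", "fileSize": 216777, "status": 0},
            {"reason": "write flv file fail", "status": 50002001},
        ],
    },
}
print(successful_files(sample))  # ['record/a.mp4']
```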

2002 Recording Failed

{
    "eventType": "2002",
    "eventData": {
        "channelId": "room**",
        "recordState": {
            "bucket":"rtc*******",              // Object Storage Service bucket where recordings are stored
            "vendor":1,                         // Object storage provider. See Start Recording API.
            "region":1,                         // Object storage region. See Start Recording API.
            "startTs":1709737037688,            // Start timestamp, in milliseconds
            "reason": "WritePlaylist failed",
            "code": 50002001,                   // Status code. See status code table.
            "fileFailCount": 2,
            "fileInfo": [
                {
                    "reason": "write flv file fail", // Failure reason
                    "status": 50002001,
                    "timestamp": 1709721091674
                },
                {
                    "reason": "WritePlaylist failed",
                    "fileDuration": 30437,
                    "fileSize": 123875456,
                    "filePath": "taskidtaskId-199-cid65e844**e000000001ac0000/playlist.m3u8",
                    "status": 50002001,
                    "timestamp": 1709721103666
                }
            ],
            "fileCount": 2
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

2003 Single-Stream Recording Succeeded

{
    "eventType": "2003",
    "eventData": {
        "channelId": "room**",
        "recordState": {
            "bucket":"rtc*******",              // Object Storage Service bucket where recordings are stored
            "vendor":1,                         // Object storage provider. See Start Recording API.
            "region":1,                         // Object storage region. See Start Recording API.
            "startTs":1709737037688,            // Start timestamp, in milliseconds
            "fileInfo": [
                {
                    "fileSize": 313074,
                    "filePath": "record/cu***p/112233/b07c****6/122221/mic_default_0_0_1757484943513.mp3",
                    "fileDuration": 19559,
                    "status": 0,
                    "timestamp": 1757484964292
                }
            ],
            "streamInfo": {
                "type": "mic",  
                "deviceId": "default",
                "userId": "122221"
            }
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

2010 Recording Service Status Changed

Note

This event does not trigger automatically. Subscribe via the console or OpenAPI.

{
    "eventType": "2010",
    "eventData": {
        "channelId": "room**",
        "recordState": {
            "bucket":"rtc*******",              // Object Storage Service bucket where recordings are stored
            "vendor":1,                         // Object storage provider. See Start Recording API.
            "region":1,                         // Object storage region. See Start Recording API.
            "startTs":1709737037688,            // Start timestamp, in milliseconds
            "code": 20002002                    // Status code. See status code table.
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

2011 Audio Stream Status Changed

Note

This event does not trigger automatically. Subscribe via the console or OpenAPI.

{
    "eventType": "2011",
    "eventData": {
        "channelId": "room**",
        "recordState": {
            "streamChangeInfo": {            // Stream change info
                "streamType": 3,             // Stream type: 1 = camera, 2 = screen share, 3 = audio mix, 4 = video mix
                "state": 1,                  // Recording state: 1 = receiving, 2 = not receiving
                "direction": 2,              // Stream direction: 1 = input, 2 = output
                "timestamp": 1721112755076   // Unix timestamp (ms) of state change
            }
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

2012 Video Stream Status Changed

Note

This event does not trigger automatically. Subscribe via the console or OpenAPI.

{
    "eventType": "2012",
    "eventData": {
        "channelId": "room**",
        "recordState": {
            "streamChangeInfo": {            // Stream change info
                "uid": "user1",              // UID. Empty if direction = output (mix stream).
                "streamType": 1,             // Stream type: 1 = camera, 2 = screen share, 3 = audio mix, 4 = video mix
                "state": 1,                  // Recording state: 1 = receiving, 2 = not receiving
                "direction": 1,              // Stream direction: 1 = input, 2 = output
                "timestamp": 1721112755076   // Unix timestamp (ms) of state change
            }
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

Meeting Notes Events

3000 Notes Started

{
    "eventType": "3000",
    "eventData": {
        "channelId": "room**",
        "asrState": {
            "code": 20000000                   // Status code. See status code table.
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

3001 Notes Succeeded

Important

Real-time subtitles use event 3003. They are delivered in real time and are not saved to files, so they behave differently from event 3001.

{
    "eventType": "3001",
    "eventData": {        
        "asrState": {
            "transcriptionFilePath": "cloudNote/6pz38941/1234_1234/transcription_1734069823271.json",           // Transcription result
            "serviceInspectionFilePath": "cloudNote/6pz38941/1234_1234/serviceInspection_1734069824007.json",   // Service inspection result
            "customPromptFilePath": "cloudNote/6pz38941/1234_1234/customPrompt_1734069824057.json",             // Custom prompt result 
            "meetingAssistanceFilePath": "cloudNote/6pz38941/1234_1234/meetingAssistance_1734069823787.json",   // Key points result 
            "summarizationFilePath": "cloudNote/6pz38941/1234_1234/summarization_1734069823845.json",           // Summary result
            "textPolishFilePath": "cloudNote/6pz38941/1234_1234/textPolish_1734069823903.json",                 // Speech-to-text polish result
            "autoChaptersFilePath": "cloudNote/6pz38941/1234_1234/autoChapters_1734069823728.json",             // Auto-chapter result
            "vendor": 1,                                                                                        // Object storage provider
            "region": 1,                                                                                        // Object storage region
            "bucket": "rtc-qa-test"                                                                             // Bucket name   
        },
        "channelId": "room**",
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

Note

Result file path format: cloudNote/{appId}/{channelId}_{taskId}/{biz}_{putTs}.json
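If you need the individual parts of a result path, a regular expression along these lines may help. This is a sketch: the function name `parse_result_path` is hypothetical, it assumes that {biz} contains no underscore and that {putTs} is all digits (as in the examples above), and it leaves the {channelId}_{taskId} segment unsplit because either ID could itself contain an underscore.

```python
import re

# Pattern for: cloudNote/{appId}/{channelId}_{taskId}/{biz}_{putTs}.json
_RESULT_PATH = re.compile(
    r"^cloudNote/(?P<app_id>[^/]+)/(?P<session>[^/]+)/"
    r"(?P<biz>[^/_]+)_(?P<put_ts>\d+)\.json$"
)


def parse_result_path(path: str) -> dict:
    """Split a result file path into its named parts."""
    match = _RESULT_PATH.match(path)
    if match is None:
        raise ValueError(f"unexpected result path: {path}")
    return {
        "app_id": match["app_id"],
        "session": match["session"],  # "{channelId}_{taskId}", left unsplit
        "biz": match["biz"],
        "put_ts": int(match["put_ts"]),
    }


parts = parse_result_path(
    "cloudNote/6pz38941/1234_1234/transcription_1734069823271.json"
)
print(parts["biz"], parts["put_ts"])  # transcription 1734069823271
```

The channelId and taskId are already available in the 3001 event itself, so there is usually no need to recover them from the path.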

Transcription Result Example and Field Definitions

{
    "TaskId":"10683ca4ad3f4f06bdf6e9dc*********",
    "Transcription":{
        "AudioInfo": {
            "Size": 670663,
            "Duration": 10394,
            "SampleRate": 48000,
            "Language": "cn"
        },
        "Paragraphs":[
            {
                "ParagraphId":"16987422100275*******",
                "SpeakerId":"1",
                "Words":[
                    {
                        "Id":10,
                        "SentenceId":1,
                        "Start":4970,
                        "End":5560,
                        "Text":"Hello,"
                    },
                    {
                        "Id":20,
                        "SentenceId":1,
                        "Start":5730,
                        "End":6176,
                        "Text":"I am"
                    }
                ]
            }
        ],
        "AudioSegments": [
            [12130, 16994],
            [17000, 19720],
            [19940, 28649]
        ]
    }
}

| Parameter Name | Type | Description |
| --- | --- | --- |
| TaskId | string | Internal meeting notes ID. Use it for troubleshooting. |
| Transcription | object | Transcription result object. |
| Transcription.Paragraphs | list[] | Collection of transcription paragraphs. |
| Transcription.Paragraphs[i].ParagraphId | string | Paragraph ID. |
| Transcription.Paragraphs[i].SpeakerId | string | Speaker ID. |
| Transcription.Paragraphs[i].Words | list[] | Word information in this paragraph. |
| Transcription.Paragraphs[i].Words[i].Id | int | Word sequence number. Usually not required. |
| Transcription.Paragraphs[i].Words[i].SentenceId | int | Sentence ID. Words with the same SentenceId form one sentence. |
| Transcription.Paragraphs[i].Words[i].Start | long | Start time relative to audio start, in milliseconds. |
| Transcription.Paragraphs[i].Words[i].End | long | End time relative to audio start, in milliseconds. |
| Transcription.Paragraphs[i].Words[i].Text | string | Word text. |
| Transcription.AudioInfo | object | Audio information object. |
| Transcription.AudioInfo.Size | long | Audio size, in bytes. |
| Transcription.AudioInfo.Duration | long | Audio duration, in milliseconds. (For real-time transcription, this is not the actual audio duration.) |
| Transcription.AudioInfo.SampleRate | int | Audio sampling rate. |
| Transcription.AudioInfo.Language | string | Audio language. |
| Transcription.AudioSegments | list[][] | Valid audio segment ranges. |
| Transcription.AudioSegments[i][0] | int | Start time of valid audio segment, in milliseconds. |
| Transcription.AudioSegments[i][1] | int | End time of valid audio segment, in milliseconds. |
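Since words that share a SentenceId form one sentence, the transcript can be regrouped into sentences as sketched below. The function name `build_sentences` and the `separator` parameter are hypothetical; joining with a space is an assumption that suits English, and an empty separator may fit languages written without spaces.

```python
def build_sentences(transcription: dict, separator: str = " ") -> list:
    """Group each paragraph's words by SentenceId and join their text."""
    sentences = []
    for paragraph in transcription.get("Paragraphs", []):
        grouped = {}  # SentenceId -> words, in first-seen order
        for word in paragraph.get("Words", []):
            grouped.setdefault(word["SentenceId"], []).append(word)
        for sentence_id, words in grouped.items():
            sentences.append({
                "speaker": paragraph.get("SpeakerId"),
                "sentence_id": sentence_id,
                "start": min(w["Start"] for w in words),
                "end": max(w["End"] for w in words),
                "text": separator.join(w["Text"] for w in words),
            })
    return sentences


sample = {
    "Paragraphs": [{
        "SpeakerId": "1",
        "Words": [
            {"Id": 10, "SentenceId": 1, "Start": 4970, "End": 5560, "Text": "Hello,"},
            {"Id": 20, "SentenceId": 1, "Start": 5730, "End": 6176, "Text": "I am"},
        ],
    }]
}
print(build_sentences(sample)[0]["text"])  # Hello, I am
```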

Service Inspection Result Example and Field Definitions

{
    "TaskId": "4ee872e72fd0490694f1cd615b6b6314",
    "ServiceInspection": [
        {
            "Title": "Greeting at Store - Welcome Phrase",
            "Matched": true,
            "Remarks": "The salesperson started the conversation by asking questions, showing intent to greet.",
            "MatchedSentenceIds": [

            ]
        },
        {
            "Title": "Farewell at Store Exit - Collect Contact Info",
            "Matched": true,
            "Remarks": "The salesperson suggested adding the customer on DingTalk for future contact.",
            "MatchedSentenceIds": [

            ]
        },
        {
            "Title": "Greeting at Store - Offer Beverage",
            "Matched": false,
            "Remarks": "No mention of offering a beverage in the conversation.",
            "MatchedSentenceIds": [

            ]
        }
    ]
}

| Parameter Name | Type | Description |
| --- | --- | --- |
| TaskId | string | Internal meeting notes ID. Use it for troubleshooting. |
| ServiceInspection | list[] | Collection of service inspection results. May contain zero, one, or multiple items. |
| ServiceInspection[i].Title | string | Name of the service inspection item. Matches ServiceInspection.InspectionContents[i].Title in the request. |
| ServiceInspection[i].Matched | boolean | Whether this service inspection item matched. |
| ServiceInspection[i].Remarks | string | Large Language Model analysis of this inspection item. |
| ServiceInspection[i].MatchedSentenceIds | list[] | Sentence IDs in the original transcript that matched this inspection item. |
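A common use of this result is to list the inspection items that were not matched. The sketch below shows one way to do that; the function name `unmatched_titles` is hypothetical, and the sample reuses two titles from the example above.

```python
def unmatched_titles(result: dict) -> list:
    """Return the Title of every inspection item whose Matched flag is false."""
    return [
        item["Title"]
        for item in result.get("ServiceInspection", [])
        if not item.get("Matched")
    ]


sample = {
    "ServiceInspection": [
        {"Title": "Greeting at Store - Welcome Phrase", "Matched": True},
        {"Title": "Greeting at Store - Offer Beverage", "Matched": False},
    ]
}
print(unmatched_titles(sample))  # ['Greeting at Store - Offer Beverage']
```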

Custom Prompt Result Example and Field Definitions

{
  "TaskId": "c8b8f8cac1134675a8722ae3********",
  "CustomPrompt": [
    {
      "Name": "split-summary-demo",
      "Result": "This conversation focuses on DingTalk's voice technology and related AI research achievements. Speaker 1 (Jing Chang) represents DingTalk and introduces their research outcomes and future vision. Speaker 2 is a science communication video creator from Xigua Video who explores and shares DingTalk's technology through questions and observations.\n\nSpeaker 1 begins by noting that predicting future technologies carries high risk but remains essential to convey perspectives and attitudes toward emerging tech. He then discusses DingTalk's global presence, highlighting its headquarters and research institute in Hangzhou. Jing Chang also mentions DingTalk's report on the top ten future technologies and expresses interest in cutting-edge AI research, especially voice technology.\n\nIn discussing voice technology, Jing Chang highlights challenges in noisy environments, multi-person meetings, and interference from various devices. He explains current technical solutions, such as using machine learning to classify problems before routing them to human agents. He also describes a long-term goal: enabling AI to participate in meetings to improve efficiency and decision quality.\n\nSpeaker 2 observes from an external perspective, experiencing and demonstrating DingTalk's voice technology in real-world applications like voice-controlled vending machines and smart TVs. He illustrates practical potential and challenges with intuitive examples.\n\nFinally, both speakers emphasize the goal of making voice interaction ubiquitous and discuss how this technology can better serve society and how enterprises should remain user-centric during continuous development.\n\nIn summary, this dialogue explores DingTalk's progress and future plans in voice technology and broader AI fields, while reflecting on real-world applications and potential impacts.",
      "Truncated": false
    },
    {
      "Name": "inspection-demo",
      "Result": "None",
      "Truncated": false
    }
  ]
}

| Parameter Name | Type | Description |
| --- | --- | --- |
| TaskId | string | Internal meeting notes ID. Use it for troubleshooting. |
| CustomPrompt | list[] | List of custom prompt results. |
| CustomPrompt.Name | string | Matches CustomPrompt.Contents[i].Name in the request. |
| CustomPrompt.Result | string | Large Language Model result. |
| CustomPrompt.Truncated | boolean | Whether truncation occurred. |

Key Points Result Example and Field Definitions

{
    "TaskId":"8b78c180e034fe9097e9135s7ebba1fa",
    "MeetingAssistance":{
        "Keywords":[
            "DingTalk",
            "Alibaba",   
            "Voice"
        ],
        "KeySentences":[
            {
                "Id":1,
                "SentenceId":1,
                "Start":31680,
                "End":36582,
                "Text":"First, let me introduce our work and job requirements."
            },
            {
                "Id":2,
                "SentenceId":45,
                "Start":1452950,
                "End":1462184,
                "Text":"Our main focus is voice technology. We come from the Voice Lab and specialize in speech-to-text and voice-related cloud services."
            }
        ],
        "Actions":[
            {
                "Id":1,
                "SentenceId":8,
                "Start":39654,
                "End":52117,
                "Text":"Confirm whether there are issues with the content in the PPT template."
            },
            {
                "Id":2,
                "SentenceId":18,
                "Start":84693,
                "End":86786,
                "Text":"Monitor DingTalk trial usage and upcoming releases."
            }
        ],
        "Classifications":{
            "Interview":0.6549709,
            "Lecture":0.18346232,
            "Meeting":0.16156682
        }
    }
}

| Parameter Name | Type | Description |
| --- | --- | --- |
| TaskId | string | Internal meeting notes ID. Use it for troubleshooting. |
| MeetingAssistance | object | Key points result object. May contain zero or more result types. |
| MeetingAssistance.Keywords | list[] | Extracted keywords. |
| MeetingAssistance.KeySentences | list[] | Extracted key sentences, also called key content. |
| MeetingAssistance.KeySentences[i].Id | long | Key sentence sequence number. |
| MeetingAssistance.KeySentences[i].SentenceId | long | Sentence ID in the original ASR transcript. |
| MeetingAssistance.KeySentences[i].Start | long | Start time relative to audio start, in milliseconds. |
| MeetingAssistance.KeySentences[i].End | long | End time relative to audio start, in milliseconds. |
| MeetingAssistance.KeySentences[i].Text | string | Key sentence text. |
| MeetingAssistance.Actions | list[] | Collection of action items and summaries. |
| MeetingAssistance.Actions[i].Id | long | Action item sequence number. |
| MeetingAssistance.Actions[i].SentenceId | long | Sentence ID in the original ASR transcript. |
| MeetingAssistance.Actions[i].Start | long | Start time relative to audio start, in milliseconds. |
| MeetingAssistance.Actions[i].End | long | End time relative to audio start, in milliseconds. |
| MeetingAssistance.Actions[i].Text | string | Action item text. |
| MeetingAssistance.Classifications | object | Scene classification. Currently supports three scene types. |
| MeetingAssistance.Classifications.Interview | float | Confidence score for interview scenes. |
| MeetingAssistance.Classifications.Lecture | float | Confidence score for lecture scenes. |
| MeetingAssistance.Classifications.Meeting | float | Confidence score for meeting scenes. |
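To pick the most likely scene from the classification scores, something like the following sketch could be used. The function name `top_scene` is hypothetical, and the sample scores are copied from the example above.

```python
def top_scene(meeting_assistance: dict):
    """Return (scene, score) for the highest confidence score, or None if absent."""
    classifications = meeting_assistance.get("Classifications") or {}
    if not classifications:
        return None
    return max(classifications.items(), key=lambda pair: pair[1])


sample = {
    "Classifications": {
        "Interview": 0.6549709,
        "Lecture": 0.18346232,
        "Meeting": 0.16156682,
    }
}
print(top_scene(sample))  # ('Interview', 0.6549709)
```

Because MeetingAssistance may contain zero or more result types, the function returns None rather than raising an error when Classifications is missing.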

Summary Result Example and Field Definitions

{
  "TaskId": "5a7343ad75e6493da121ce65*********",
  "Summarization": {
    "ParagraphSummary": "Introduces Alibaba DingTalk's audio-video team and job requirements. Mentions cloud services for meeting notes. Also answers other questions and introduces multimodal projects.",
    "ConversationalSummary": [
      {
        "SpeakerId": "1",
        "SpeakerName": "Speaker 1",
        "Summary": "Introduces Alibaba DingTalk's work and job requirements, focusing on speech-to-text and text-to-speech cloud services. Explains that DingTalk aims to provide a unified interface for cloud-based services. Describes features of the meeting notes product, including summarization, keyword extraction, and multimodal capabilities. The product will launch at month-end for user access."
      },
      {
        "SpeakerId": "2",
        "SpeakerName": "Speaker 2",
        "Summary": "He leads AI capability development and business integration for NLP. Introduces three company projects. Discusses challenges and solutions in grading, and the company's exploration of multimodal projects."
      }
    ],
    "QuestionsAnsweringSummary": [
      {
        "Question": "What kind of department is DingTalk Audio-Video?",
        "SentenceIdsOfQuestion": [
          207,
          208,
          209,
          210
        ],
        "Answer": "DingTalk Audio-Video is a business unit under Alibaba Group's DingTalk, responsible for audio-video services.",
        "SentenceIdsOfAnswer": [
          207,
          208,
          209,
          210
        ]
      }
    ],
    "MindMapSummary": [
      {
        "Title": "Summary of Alibaba DingTalk Voice Technology and Smart Device Site Visit",
        "Topic": [
          {
            "Title": "1. DingTalk Introduction",
            "Topic": [
              {
                "Title": "Headquarters in Hangzhou, offices worldwide",
                "Topic": []
              },
              {
                "Title": "Key Research Areas and Achievements",
                "Topic": [
                  {
                    "Title": "Top Ten Future Technology Trends Forecast",
                    "Topic": []
                  },
                  {
                    "Title": "Ongoing High-Risk R&D Projects",
                    "Topic": []
                  }
                ]
              }
            ]
          },
          {
            "Title": "2. Voice Technology Discussion",
            "Topic": [
              {
                "Title": "Lab Environment Challenges",
                "Topic": [
                  {
                    "Title": "Strong noise, multi-person conversations, interference from multiple devices",
                    "Topic": []
                  }
                ]
              },
              {
                "Title": "Technical Challenges",
                "Topic": [
                  {
                    "Title": "Target speaker identification",
                    "Topic": []
                  },
                  {
                    "Title": "Semantic understanding: emotional distinction of homophones",
                    "Topic": []
                  }
                ]
              },
              {
                "Title": "Current Applications",
                "Topic": [
                  {
                    "Title": "Call center automation",
                    "Topic": []
                  },
                  {
                    "Title": "Optimized speech recognition in meetings",
                    "Topic": []
                  }
                ]
              },
              {
                "Title": "Future Outlook",
                "Topic": [
                  {
                    "Title": "AI participation in meetings to improve decision quality",
                    "Topic": []
                  },
                  {
                    "Title": "Widespread application of voice technology across industries",
                    "Topic": []
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}

Parameter Name

Type

Description

TaskId

string

Internal meeting notes ID. Use it for troubleshooting.

Summarization

object

Summary result object. May contain zero or more summary types.

Summarization.ParagraphSummary

string

Full-text summary.

Summarization.ConversationalSummary

list[]

List of per-speaker summaries.

Summarization.ConversationalSummary[i].SpeakerId

string

Speaker ID.

Summarization.ConversationalSummary[i].SpeakerName

string

Speaker name.

Summarization.ConversationalSummary[i].Summary

string

Summary for this speaker.

Summarization.QuestionsAnsweringSummary

list[]

List of question-answer summaries.

Summarization.QuestionsAnsweringSummary[i].Question

string

Question.

Summarization.QuestionsAnsweringSummary[i].SentenceIdsOfQuestion

list[]

List of SentenceIds in the original ASR transcript that correspond to this question.

Summarization.QuestionsAnsweringSummary[i].Answer

string

Answer to the question.

Summarization.QuestionsAnsweringSummary[i].SentenceIdsOfAnswer

list[]

List of SentenceIds in the original ASR transcript that correspond to this answer.

Summarization.MindMapSummary

list[]

List of mind map results.

Summarization.MindMapSummary[i].Title

string

Text content of a single mind map node.

Summarization.MindMapSummary[i].Topic

list[]

List of child nodes for a single mind map node.
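Because each mind map node carries its children in Topic, the tree can be walked recursively. The sketch below is one possible way to flatten MindMapSummary into an indented outline. The function name `flatten_mind_map` and the trimmed-down sample payload are illustrative assumptions, not part of the callback API.

```python
from typing import Any, Dict, List


def flatten_mind_map(node: Dict[str, Any], depth: int = 0) -> List[str]:
    """Return one indented line per node, visiting children depth-first."""
    lines = ["  " * depth + node.get("Title", "")]
    # Topic holds the child nodes; leaf nodes carry an empty list.
    for child in node.get("Topic") or []:
        lines.extend(flatten_mind_map(child, depth + 1))
    return lines


# Hypothetical, shortened payload with the same shape as the example above.
result = {
    "TaskId": "example-task",
    "Summarization": {
        "MindMapSummary": [
            {
                "Title": "Root",
                "Topic": [
                    {"Title": "1. Introduction", "Topic": []},
                    {
                        "Title": "2. Discussion",
                        "Topic": [{"Title": "Challenges", "Topic": []}],
                    },
                ],
            }
        ]
    },
}

outline: List[str] = []
# Summarization may contain zero or more summary types, so default to empty.
for root in result.get("Summarization", {}).get("MindMapSummary", []):
    outline.extend(flatten_mind_map(root))

print("\n".join(outline))
```

Using `.get` with empty defaults keeps the walk safe when a summary type is absent from the result.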

Speech-to-Text Polish Result Example and Field Definitions

{
    "TaskId": "742efa3a71f7475fae81a060********",
    "TextPolish": [
        {
            "FormalParagraphText": "The Apsara Conference, a major industry event in China, serves as a common forum for digital-era discussions. Alibaba supports young scientists' pursuit of scientific advancement through the Qingcheng Award. Alibaba Cloud continuously develops cloud computing and digital ecosystems, aiming to become a world-leading computing infrastructure. Alibaba also pursues technological advancement, including self-controlled cloud operating systems and training algorithms and models. Alibaba aims to let ordinary people enjoy and create data-driven results through low-code environments. Additionally, Alibaba strives to break through core technologies in chip development.",
            "SentenceIds": [
                1,
                2
            ],
            "ParagraphId": "1708487265101500000",
            "Start": 30,
            "End": 9600
        },
        {
            "FormalParagraphText": "Pingtouge designed the YiTian 710 processor for cloud computing scenarios and helped host the Winter Olympics on the cloud. Cloud computing brings new production and management methods to all industries, driving China toward modernization. Alibaba pursues technological advancement and assumes greater social responsibility, striving to make cloud computing a sustainable green power source.",
            "SentenceIds": [
                3,
                4,
                5
            ],
            "ParagraphId": "1708487280411500000",
            "Start": 15340,
            "End": 17790
        }
    ]
}

Parameter Name

Type

Description

TaskId

string

Internal meeting notes ID. Use it for troubleshooting.

TextPolish

list[]

A collection of zero or more spoken-to-written conversion entries.

TextPolish[i].FormalParagraphText

string

Polished, written-style text converted from the spoken-language transcript.

TextPolish[i].SentenceIds

list[]

List of SentenceIds corresponding to this polished text.

TextPolish[i].ParagraphId

string

Paragraph ID of the text result. Matches the paragraph ID in the ASR transcript.

TextPolish[i].Start

long

Start time of the text result in the original audio, in milliseconds.

TextPolish[i].End

long

End time of the text result in the original audio, in milliseconds.
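Since each polished paragraph lists the SentenceIds it covers, a lookup from sentence ID to paragraph can be handy, for example to jump from a question-answer summary to the polished text. The following is a minimal sketch under that assumption. The helper name `index_by_sentence_id` and the shortened paragraph texts are hypothetical.

```python
from typing import Any, Dict, List


def index_by_sentence_id(
    text_polish: List[Dict[str, Any]]
) -> Dict[int, Dict[str, Any]]:
    """Map every SentenceId to the polished paragraph that contains it."""
    index: Dict[int, Dict[str, Any]] = {}
    for paragraph in text_polish:
        for sentence_id in paragraph.get("SentenceIds", []):
            index[sentence_id] = paragraph
    return index


# Shortened payload with the same fields as the example above.
payload = {
    "TaskId": "example-task",
    "TextPolish": [
        {
            "FormalParagraphText": "First polished paragraph.",
            "SentenceIds": [1, 2],
            "ParagraphId": "1708487265101500000",
            "Start": 30,
            "End": 9600,
        },
        {
            "FormalParagraphText": "Second polished paragraph.",
            "SentenceIds": [3, 4, 5],
            "ParagraphId": "1708487280411500000",
            "Start": 15340,
            "End": 17790,
        },
    ],
}

index = index_by_sentence_id(payload.get("TextPolish", []))
paragraph = index[4]
# Start and End are millisecond offsets into the original audio.
print(paragraph["ParagraphId"], paragraph["Start"], paragraph["End"])
```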

Auto-Chapter Result Example and Field Definitions

{
    "TaskId":"05c45066fc6df96dg09bf8z4*********",
    "AutoChapters":[
        {
            "Id":1,
            "Start":1930,
            "End":283874,
            "Headline":"Apsara Conference and Alibaba's Technological Responsibility",
            "Summary":"The Apsara Conference, a major industry event in China, serves as a common forum for digital-era discussions. Alibaba supports young scientists' pursuit of scientific advancement through the Qingcheng Award. Alibaba Cloud continuously develops cloud computing and digital ecosystems, aiming to become a world-leading computing infrastructure. Alibaba also pursues technological advancement, including self-controlled cloud operating systems and training algorithms and models. Alibaba aims to let ordinary people enjoy and create data-driven results through low-code environments. Additionally, Alibaba strives to break through core technologies in chip development."
        },
        {
            "Id":2,
            "Start":284050,
            "End":452084,
            "Headline":"Cloud Computing: Driving China Toward Modernization",
            "Summary":"Pingtouge designed the YiTian 710 processor for cloud computing scenarios and helped host the Winter Olympics on the cloud. Cloud computing brings new production and management methods to all industries, driving China toward modernization. Alibaba pursues technological advancement and assumes greater social responsibility, striving to make cloud computing a sustainable green power source."
        }
    ]
}

Parameter Name

Type

Description

TaskId

string

Internal meeting notes ID. Use it for troubleshooting.

AutoChapters

list[]

Collection of auto-chapter results. May contain zero, one, or multiple items.

AutoChapters[i].Id

int

Chapter sequence number.

AutoChapters[i].Start

long

Start time of the chapter relative to audio start, in milliseconds.

AutoChapters[i].End

long

End time of the chapter relative to audio start, in milliseconds.

AutoChapters[i].Headline

string

One-sentence headline for the chapter.

AutoChapters[i].Summary

string

Chapter summary.
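The millisecond offsets in Start and End can be turned into a readable table of contents. The sketch below shows one way to do that. The helper names `format_ms` and `build_toc` are illustrative, and the Summary fields are omitted from the sample for brevity.

```python
from typing import Any, Dict, List


def format_ms(ms: int) -> str:
    """Format a millisecond offset as HH:MM:SS, dropping the fraction."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def build_toc(auto_chapters: List[Dict[str, Any]]) -> List[str]:
    """Build one line per chapter, ordered by the chapter sequence number."""
    ordered = sorted(auto_chapters, key=lambda chapter: chapter["Id"])
    return [
        f"{c['Id']}. [{format_ms(c['Start'])} - {format_ms(c['End'])}] "
        f"{c['Headline']}"
        for c in ordered
    ]


# Values taken from the example above.
payload = {
    "TaskId": "example-task",
    "AutoChapters": [
        {
            "Id": 1,
            "Start": 1930,
            "End": 283874,
            "Headline": "Apsara Conference and Alibaba's Technological Responsibility",
        },
        {
            "Id": 2,
            "Start": 284050,
            "End": 452084,
            "Headline": "Cloud Computing: Driving China Toward Modernization",
        },
    ],
}

toc = build_toc(payload.get("AutoChapters", []))
print("\n".join(toc))
```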

3002 Notes Failed

{
    "eventType": "3002",
    "eventData": {
        "channelId": "room**",
        "asrState": {
            "code": 50004001                   // Status code. See status code table.
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

3003 Real-Time Subtitles

Note

This event is not triggered automatically. To receive it, subscribe through the console or OpenAPI, and set the AsrCallback field to true in the StartCloudNote API.

{
    "eventType": "3003",
    "eventData": {
        "channelId": "room**",
        "asrState": {
            "sentenceIndex": 14,               // Global sentence index
            "sentenceEnd": true,               // Whether the sentence has ended
            "beginTime": 40680,                // Start timestamp
            "endTime": 53280,                  // End timestamp
            "text": "I am a service expert.",  // Subtitle text
            "userId": "471812"                 // User ID
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}
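A receiver could track subtitles per sentenceIndex and commit a line once sentenceEnd is true. The sketch below illustrates this idea only. It assumes that events with sentenceEnd set to false carry partial text for the same sentenceIndex and that a later event replaces the earlier text, which the source does not state. The class name `SubtitleBuffer` and the helper `make_event` are hypothetical.

```python
from typing import Any, Dict, List, Tuple


class SubtitleBuffer:
    """Collects 3003 events into finished subtitle lines."""

    def __init__(self) -> None:
        self.pending: Dict[int, str] = {}           # sentenceIndex -> latest partial text
        self.final: List[Tuple[int, str, str]] = []  # (sentenceIndex, userId, text)

    def handle(self, event: Dict[str, Any]) -> None:
        if event.get("eventType") != "3003":
            return  # not a real-time subtitle event
        state = event["eventData"]["asrState"]
        index = state["sentenceIndex"]
        if state.get("sentenceEnd"):
            # The sentence is complete: drop any partial and store the final text.
            self.pending.pop(index, None)
            self.final.append((index, state["userId"], state["text"]))
        else:
            # Assumption: a newer partial for the same index replaces the old one.
            self.pending[index] = state["text"]


def make_event(index: int, ended: bool, text: str) -> Dict[str, Any]:
    """Build a sample event shaped like the example above."""
    return {
        "eventType": "3003",
        "eventData": {
            "channelId": "room**",
            "asrState": {
                "sentenceIndex": index,
                "sentenceEnd": ended,
                "beginTime": 40680,
                "endTime": 53280,
                "text": text,
                "userId": "471812",
            },
            "taskId": "taskId-199",
            "timestamp": 1709721103673,
        },
    }


buffer = SubtitleBuffer()
buffer.handle(make_event(14, False, "I am a"))
buffer.handle(make_event(14, True, "I am a service expert."))
print(buffer.final)
```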

Agent Events

4000 Agent Joined Channel Successfully

{
    "eventType": "4000",
    "eventData": {
        "channelId": "room**",
        "aiAgentState": {
            "code": 20000000                   // Status code. See status code table.
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

4001 Agent Failed to Join Channel

{
    "eventType": "4001",
    "eventData": {
        "channelId": "room**",
        "aiAgentState": {
            "code": 50005001, 
            "reason": "join rtc channel failed"                    
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

4002 Agent Exited

{
    "eventType": "4002",
    "eventData": {
        "channelId": "room**",
        "aiAgentState": {
            "code": 50005010, 
            "reason": "exit without user"                    
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

4003 Agent Internal Error

{
    "eventType": "4003",
    "eventData": {
        "channelId": "room**",
        "aiAgentState": {
           "code": 50005050, 
           "reason": "asr internal error"                  
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}

4004 Agent Status Notification

{
    "eventType": "4004",
    "eventData": {
        "channelId": "room**",
        "aiAgentState": {
            "code": 50005020,
            "reason": "agent long silence"                    
        },
        "taskId": "taskId-199",
        "timestamp": 1709721103673
    }
}
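The agent events above share one shape: an eventType, plus an aiAgentState object with a code and, except in the 4000 example, a reason. A handler might reduce them to a single log line, as in the following sketch. The function name `describe_agent_event` and the label strings are illustrative choices, not defined by the API.

```python
from typing import Any, Dict, Optional

# Labels paraphrase the event headings above.
AGENT_EVENTS = {
    "4000": "agent joined channel",
    "4001": "agent failed to join channel",
    "4002": "agent exited",
    "4003": "agent internal error",
    "4004": "agent status notification",
}


def describe_agent_event(event: Dict[str, Any]) -> Optional[str]:
    """Return a one-line description, or None for non-agent events."""
    label = AGENT_EVENTS.get(event.get("eventType", ""))
    if label is None:
        return None
    data = event.get("eventData", {})
    state = data.get("aiAgentState", {})
    reason = state.get("reason")  # absent in the 4000 example
    suffix = f" ({reason})" if reason else ""
    return f"[{data.get('channelId')}] {label}: code {state.get('code')}{suffix}"


# Same values as the 4002 example above.
event = {
    "eventType": "4002",
    "eventData": {
        "channelId": "room**",
        "aiAgentState": {"code": 50005010, "reason": "exit without user"},
        "taskId": "taskId-199",
        "timestamp": 1709721103673,
    },
}

print(describe_agent_event(event))
```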

Status Code Table

Type

Status Code

Description

Common

20000000

Success

50000000

Internal server error

Stream Ingest

50001001

Stream ingest failed

Recording

50002001

Writing to user storage failed. This may be caused by a network issue.

50002002

Failed to start user storage. The input parameters AK, SK, Bucket, Region, or Vendor may have been entered incorrectly.

50002003

Recording duration too short. No recording file generated.

50002004

Invalid user storage key

50002005

Bucket does not exist

50002006

Access to user storage denied

50002007

Unknown error accessing user storage

50002008

Recording processing failed

20002001

No cloud recording started

20002002

Cloud recording initialization complete

20002003

Recording component starting

20002004

Recording component started

20002005

Recording stopped

20002006

Upload component started

20002007

First file uploaded successfully

User

20003001

Client exited voluntarily

20003002

Client keepalive failed

20003003

User kicked out

20003004

Removed because the same UID joined again

20003005

Unknown exit reason

Meeting Notes

50004001

Meeting notes server error

50004002

Meeting notes task exceeded maximum time

30006001

Invalid user AK/SK/Bucket configuration

Agent

50005001

join rtc channel failed

50005002

join rtc task exceed limit

50005003

join rtm channel failed

50005010

exit without user

50005011

exit rtc bye

50005050

asr internal error

50005051

llm internal error

50005052

tts internal error

50005020

agent long silence
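When handling callbacks, it can help to map a status code to its description and to a rough severity. The sketch below copies a few entries from the table and groups codes by their leading digit. That grouping (2 for success or informational, 5 for failure) is only a pattern inferred from the table, not a documented rule, and the names `STATUS_DESCRIPTIONS`, `describe`, and `severity` are hypothetical.

```python
# A subset of the table above; extend it with the codes you need.
STATUS_DESCRIPTIONS = {
    20000000: "Success",
    50000000: "Internal server error",
    50001001: "Stream ingest failed",
    50004001: "Meeting notes server error",
    50005001: "join rtc channel failed",
    50005010: "exit without user",
    50005050: "asr internal error",
    50005020: "agent long silence",
}


def describe(code: int) -> str:
    """Return the table description, or a fallback for unlisted codes."""
    return STATUS_DESCRIPTIONS.get(code, f"Unknown status code {code}")


def severity(code: int) -> str:
    """Rough grouping by leading digit (an inferred pattern, not documented)."""
    leading = str(code)[:1]
    if leading == "2":
        return "info"
    if leading == "5":
        return "error"
    return "other"


for code in (20000000, 50005050, 30006001):
    print(code, severity(code), describe(code))
```

Treat the severity grouping as a convenience for logging only, and rely on the event type and the table description for actual business decisions.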