Implement chat features using the RTOS SDK (License mode)


This topic describes how to implement chat features using the RTOS software development kit (SDK) in License mode.

1. Prepare for development

Follow the instructions in the Create Application document to create an application, purchase a license, and obtain an App ID and AppSecret.

Refer to RTOS C SDK (License mode) and download the SDK package that corresponds to your chip model.

If you use the semi-managed mode, follow the instructions in RTOS C SDK (License mode) to complete the integration with server-side APIs.

2. Adapt the SDK

Note

In the examples and logs in this document, placeholders in angle brackets, such as <ID>, stand for actual data. The data shown in this document is anonymized.

Each chip platform requires a specific toolchain for compilation. You can download the SDK directly for chips that are already supported by Alibaba Cloud Model Studio.

If you are using a new chip platform, contact Alibaba Cloud sales for technical support to obtain the SDK. The SDK package is named as follows:

aliyun_sdk_<PLATFORM>_<SDK_VERSION>.tar.xz

2.1. SDK directory structure

The SDK directory structure is as follows:

├── ReleaseNote.md
├── include
│   ├── c_utils
│   │   ├── c_utils.h
│   │   └── ...
│   ├── c_mmi.h
│   ├── lib_c_license.h
│   ├── lib_c_sdk.h
│   ├── libqwen_sdk.h
│   ├── qwen_test.h
│   └── ...
├── libs
│   ├── libc_license.a
│   ├── libc_mmi_cmd.a
│   ├── libhal_dummy.a
│   ├── libqwen_sdk.a
│   ├── libqwen_test.a
│   └── ...
└── third_party
    ├── cJSON
    │   ├── cJSON.h
    │   └── libcjson.a
    └── tinycrypt
        ├── include
        │   └── ...
        └── libtinycrypt.a

Description:

  • The `include` directory contains the header files required to use the SDK. Add this directory to the header file path of your project.

  • `libqwen_sdk.a` contains the core code of the SDK. You must load this library.

  • `libc_license.a` contains code related to the license mode. You must load this library if you use the license mode. Do not load this library if you use the pay-as-you-go mode.

  • `libhal_dummy.a` contains dummy Hardware Abstraction Layer (HAL) code. Use it to check the compilation environment before you adapt the HAL. We recommend that you delete this library after the HAL porting is complete.

  • `libqwen_test.a` contains test code. Load this library for automated testing. Remove it for production.

  • `libtinycrypt.a` contains dependencies for encryption and decryption. You must load this library if your platform has not already integrated this third-party library.

  • `libcjson.a` contains SDK dependencies. You must load this library if your platform has not already integrated this third-party library.

2.2. HAL layer adaptation

You must implement the functions declared in the five header files from the SDK package on your development platform.

aliyun_sdk/include/c_utils/
    ├──hal_util_mem.h
    ├──hal_util_mutex.h
    ├──hal_util_random.h
    ├──hal_util_storage.h
    ├──hal_util_time.h
    └──...

You must implement all of the functions listed below. Otherwise, the SDK will not work correctly. For detailed descriptions of these functions, see the corresponding header files in the SDK package.

  • Memory module (aliyun_sdk/include/c_utils/hal_util_mem.h)

void * util_malloc(int32_t size);
void util_free(void *ptr);

  • Random number module (aliyun_sdk/include/c_utils/hal_util_random.h)

int32_t util_random_init(uint32_t seed);
uint32_t util_random(void);

  • Storage module (aliyun_sdk/include/c_utils/hal_util_storage.h)

int32_t util_storage_erase(void);
int32_t util_storage_storage(uint8_t *data, uint32_t size);
int32_t util_storage_load(uint8_t *data, uint32_t size);

  • Time module (aliyun_sdk/include/c_utils/hal_util_time.h)

void util_msleep(uint32_t ms);
int64_t util_get_timestamp_ms(void);
uint8_t util_timestamp_inited(void);

  • Mutex module (aliyun_sdk/include/c_utils/hal_util_mutex.h)

/* Mutex struct definition */
typedef struct _util_mutex_t {
    void *mutex_handle; /* Mutex handle. The specific implementation depends on the platform. */
} util_mutex_t;

util_mutex_t * util_mutex_create(void);
void util_mutex_delete(util_mutex_t *mutex);
int32_t util_mutex_lock(util_mutex_t *mutex, int32_t timeout);
int32_t util_mutex_unlock(util_mutex_t *mutex);
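
As a rough illustration of what a port can look like, the following sketch implements the memory, time, and mutex functions on a POSIX host with the C standard library and pthreads. It is not taken from the SDK. The struct is copied from the listing above so that the block compiles on its own, the 0 and -1 return codes are an assumption, and the `timeout` argument is ignored because its unit and semantics are defined only in the SDK header. Check the `hal_util_*.h` files for the actual contracts before reusing any of it.

```c
/* Hypothetical POSIX host port of part of the HAL. Not SDK code. */
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Copied from the listing above; a real port includes hal_util_mutex.h instead. */
typedef struct _util_mutex_t {
    void *mutex_handle;
} util_mutex_t;

void *util_malloc(int32_t size)
{
    return (size > 0) ? malloc((size_t)size) : NULL;
}

void util_free(void *ptr)
{
    free(ptr);
}

void util_msleep(uint32_t ms)
{
    struct timespec ts = { .tv_sec = (time_t)(ms / 1000),
                           .tv_nsec = (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

int64_t util_get_timestamp_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

util_mutex_t *util_mutex_create(void)
{
    util_mutex_t *m = util_malloc((int32_t)sizeof(*m));
    if (m == NULL) {
        return NULL;
    }
    pthread_mutex_t *h = util_malloc((int32_t)sizeof(*h));
    if (h == NULL || pthread_mutex_init(h, NULL) != 0) {
        util_free(h);
        util_free(m);
        return NULL;
    }
    m->mutex_handle = h;
    return m;
}

void util_mutex_delete(util_mutex_t *mutex)
{
    if (mutex == NULL) {
        return;
    }
    pthread_mutex_destroy((pthread_mutex_t *)mutex->mutex_handle);
    util_free(mutex->mutex_handle);
    util_free(mutex);
}

/* Assumption: returns 0 on success and -1 on failure; timeout is ignored (blocking lock). */
int32_t util_mutex_lock(util_mutex_t *mutex, int32_t timeout)
{
    (void)timeout;
    if (mutex == NULL) {
        return -1;
    }
    return pthread_mutex_lock((pthread_mutex_t *)mutex->mutex_handle) == 0 ? 0 : -1;
}

int32_t util_mutex_unlock(util_mutex_t *mutex)
{
    if (mutex == NULL) {
        return -1;
    }
    return pthread_mutex_unlock((pthread_mutex_t *)mutex->mutex_handle) == 0 ? 0 : -1;
}
```

The random number and storage modules are left out on purpose: they depend on the entropy source and the flash or file system of the target platform.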

2.3. HAL layer porting acceptance criteria

After you implement the preceding modules, link the test library (listed as libqwen_test.a in the SDK directory structure) and call the aliyun_sdk_test() function in your main program to test the modules. You can view the test results in the output logs.

Provide the output logs to Alibaba Cloud for confirmation.

The following log output shows a successful test:

[UT][I][aliyun_sdk_test]********************* Hal Test Start *********************
[UT][I][aliyun_sdk_test]********************* memory test done *******************
[UT][I][aliyun_sdk_test]time is 1753344478377
[UT][I][aliyun_sdk_test]********************* time test done *********************
[UT][I][aliyun_sdk_test]********************* storage test done ******************
[UT][I][aliyun_sdk_test]********************* random test done *******************
[UT][I][aliyun_sdk_test]********************* mutex test done ********************
[UT][I][aliyun_sdk_test]********************* Hal Test End ***********************

3. Device initialization

Before using the SDK, go to the Alibaba Cloud Model Studio multimodal console to create an application and obtain an App ID and API key.

  • For the license mode, you must also purchase a license to obtain the corresponding AppSecret.

The following code shows an example of initialization for the fully managed license mode:

#include "c_mmi.h"
#include "lib_c_license.h"

int dummy_aliyun_sdk_init(void)
{
    // Initialize the SDK.
    c_mmi_sdk_init();

    if (c_license_device_is_registered() == 0) {
        c_mmi_storage_reset();
        // Pre-configure the AppId.
        c_mmi_storage_set_app_id_str("Your AppId");
        // Pre-configure the AppSecret.
        c_license_set_app_secret_str("Your AppSecret");
        // Pre-configure the DeviceName.
        c_mmi_set_device_name("Your DeviceName");
        // Save the configuration.
        c_mmi_storage_save();
    }
    // Pre-configure the API key. You must configure the API key for the fully managed mode. This is not required for the semi-managed mode.
    // You must reconfigure the API key every time the device starts.
    c_mmi_storage_set_api_key("Your ApiKey");

    mmi_user_config_t mmi_config = C_MMI_CONFIG_DEFAULT();


    // You must configure evt_cb. Otherwise, the SDK may run abnormally.
    mmi_config.evt_cb = _mmi_event_callback;		// Register the event callback function. For more information, see the following sections.
    // Configure the working mode.
    mmi_config.work_mode = C_MMI_MODE_PUSH2TALK;
    mmi_config.text_mode = C_MMI_TEXT_MODE_BOTH;
    // Configure the upstream and downstream audio data formats.
    mmi_config.upstream_mode = C_MMI_STREAM_MODE_PCM;
    mmi_config.downstream_mode = C_MMI_STREAM_MODE_MP3;
    // Configure the buffer size.
    mmi_config.recorder_rb_size = 8 * 1024;
    mmi_config.player_rb_size = 8 * 1024;

    c_mmi_config(&mmi_config);
    
    //... Other code
}

Note: The DeviceName is a unique identifier for each device. You can specify a custom value for this parameter. The value must not exceed 32 characters. We recommend using unique device information such as a MAC address or International Mobile Equipment Identity (IMEI).
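
One way to derive a DeviceName that respects the 32-character limit is to format the MAC address as hexadecimal digits. The sketch below is only an illustration: the function name `device_name_from_mac` and the constant `DEVICE_NAME_MAX_LEN` are hypothetical and not part of the SDK, and how you read the MAC address depends on your platform.

```c
#include <stdint.h>
#include <stdio.h>

#define DEVICE_NAME_MAX_LEN 32  /* limit stated in the note above */

/* Writes the MAC address as 12 uppercase hex digits.
 * Returns 0 on success, or -1 if the output buffer is too small. */
int device_name_from_mac(const uint8_t mac[6], char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%02X%02X%02X%02X%02X%02X",
                     mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
    if (n < 0 || (size_t)n >= out_size || n > DEVICE_NAME_MAX_LEN) {
        return -1;
    }
    return 0;
}
```

The resulting string can then be passed to c_mmi_set_device_name in place of "Your DeviceName".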

Reference logs

[c_license_reset]ver[0x00010100][1.1.0] done
[UT][I][c_mmi_storage_save]stroage [176]
[UT][I][c_mmi_storage_save]ver[0x00010100][1.1.0] done
[UT][E][c_mmi_storage_init]reset done
[UT][W][c_mmi_sdk_init]license enable
[LICENSE][W][c_license_device_is_registered]load flag 0x00000000/0x0000001d
[LICENSE][I][c_license_reset]ver[0x00010100][1.1.0] done
[LICENSE][I][c_license_set_app_id_str]app_id    [Your AppId]
[LICENSE][I][c_license_set_app_secret_str]app_secret    [00000000000000000000000000000000]
[LICENSE][I][c_license_set_device_name]device_name      [Your DeviceName]
[UT][I][c_mmi_set_device_name]device_name [Your DeviceName]
[UT][D][_get_storage_path]path [/Users/lancelot/Desktop/code/esp32_v3/qwen_sdk/build_mac/device_data.bin]
[UT][I][c_mmi_storage_save]stroage [176]
[UT][I][c_mmi_storage_save]ver[0x00010100][1.1.0] done
[UT][I][c_mmi_storage_set_api_key]app_id    [Your ApiKey]
[UT][I][c_mmi_config]device_name [Your DeviceName]
[UT][I][c_mmi_config]load dialog_id []
[UT][I][c_mmi_config]done
[UT][I][c_mmi_config_print]>>>>>>>>>>> mmi config start <<<<<<<<<<<
[UT][I][c_mmi_config_print]device_name [Your DeviceName]
[UT][I][c_mmi_config_print]dialog_id []
[UT][I][c_mmi_config_print]event_callback [0x10264ee18]
[UT][I][c_mmi_config_print]work_mode [push2talk]
[UT][I][c_mmi_config_print]text_mode [transcript,dialog]
[UT][I][c_mmi_config_print]voice_id [longxiaochun_v2]
[UT][I][c_mmi_config_print]story_voice_id [longxiaochun_v2]
[UT][I][c_mmi_config_print]upstream_mode [pcm]
[UT][I][c_mmi_config_print]downstream_mode [mp3]
[UT][I][c_mmi_config_print]recorder_rb_size [8192]
[UT][I][c_mmi_config_print]player_rb_size [8192]
[UT][I][c_mmi_config_print]transmit_rate_limit [0]
[UT][I][c_mmi_config_print]enable_cbr [0]
[UT][I][c_mmi_config_print]frame_size [60]
[UT][I][c_mmi_config_print]bit_rate [32]
[UT][I][c_mmi_config_print]us_sample_rate [24000]
[UT][I][c_mmi_config_print]ds_sample_rate [24000]
[UT][I][c_mmi_config_print]vocabulary_id [NULL]
[UT][I][c_mmi_config_print]volume [50]
[UT][I][c_mmi_config_print]speech_rate [100]
[UT][I][c_mmi_config_print]pitch_rate [100]
[UT][I][c_mmi_config_print]>>>>>>>>>>> mmi config end <<<<<<<<<<<

4. Build device-side network communication

The SDK uses the HTTP and WebSocket protocols. You must implement the corresponding processes for sending and receiving data. The SDK only encapsulates data to be sent and parses received data.

4.1. HTTP communication

This document simplifies HTTP communication into two functions: one for sending and one for receiving. You must adapt the implementation to your platform. The following sample code only illustrates the interaction flow.

// HTTP send function
int dummy_http_request(char* method, char* host, char* api, char *port, char* header, \
                         char* body, int(*rsp_async_cb)(char* rsp_data, int rsp_len));

// HTTP receive function to implement device registration
int dummy_http_response_for_register(char* rsp_data, int rsp_len);

// HTTP receive function to implement token acquisition
int dummy_http_response_for_token(char* rsp_data, int rsp_len);

4.1.1. Device registration

The following code shows an example of device registration:

int dummy_http_response_for_register(char* rsp_data, int rsp_len)
{
    int err;

    // ... Other business logic, including extracting the data field of the
    // HTTP response body into rsp_body_data (extraction not shown here).

    // Parse the response and complete registration.
    // The parsing function in the SDK only requires the data field from the message, as shown in the following HTTP response message.
    err = c_license_analyze_register_rsp(rsp_body_data);
    c_mmi_storage_save();
    
    // ... Other business logic

    return err;
}

int dummy_aliyun_sdk_init(void)
{
    // Initialize the SDK.
    c_mmi_sdk_init();

    if (c_license_device_is_registered() == 0) {
        c_mmi_storage_reset();
        // Pre-configure the AppId.
        c_mmi_storage_set_app_id_str("Your AppId");
        // Pre-configure the AppSecret.
        c_license_set_app_secret_str("Your AppSecret");
        // Pre-configure the DeviceName.
        c_mmi_set_device_name("Your DeviceName");
        // Save the configuration.
        c_mmi_storage_save();
    }
    // Pre-configure the API key. You must configure the API key for the fully managed mode. This is not required for the semi-managed mode.
    // You must reconfigure the API key every time the device starts.
    c_mmi_storage_set_api_key("Your ApiKey");

    mmi_user_config_t mmi_config = C_MMI_CONFIG_DEFAULT();


    // You must configure evt_cb. Otherwise, the SDK may run abnormally.
    mmi_config.evt_cb = _mmi_event_callback;		// Register the event callback function. For more information, see the following sections.
    // Configure the working mode.
    mmi_config.work_mode = C_MMI_MODE_PUSH2TALK;
    mmi_config.text_mode = C_MMI_TEXT_MODE_BOTH;
    // Configure the upstream and downstream audio data formats.
    mmi_config.upstream_mode = C_MMI_STREAM_MODE_PCM;
    mmi_config.downstream_mode = C_MMI_STREAM_MODE_MP3;
    // Configure the buffer size.
    mmi_config.recorder_rb_size = 8 * 1024;
    mmi_config.player_rb_size = 8 * 1024;

    c_mmi_config(&mmi_config);
    
    // Register the device.
    if (c_license_device_is_registered() == 0) {
        char time_ms_str[C_UTIL_TIMESTAMP_MS_LEN + 1];
        snprintf(time_ms_str, sizeof(time_ms_str), "%" PRId64, util_get_timestamp_ms());
        // Generate the registration request string `req` based on time_ms_str (the current timestamp in milliseconds).
        c_device_gen_register_req(req, time_ms_str);
        // Obtain the device information returned by the server.
        dummy_http_request(METHOD, HOST, API, PORT, HEADER, req, dummy_http_response_for_register);
    }

    //... Other code. For a complete example, see the following sections.
}

In the preceding code:

  • If you use the semi-managed mode, configure the HTTP communication parameters based on the registration API of the server-side service that you developed.

  • If you use the fully managed mode, Alibaba Cloud provides the HTTP communication parameters. The access information is as follows:

    Host: bailian.multimodalagent.aliyuncs.com

    Register API: /api/device/v1/register
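
The registration example formats the millisecond timestamp with snprintf and PRId64, which requires <inttypes.h> and <stdio.h>. The following standalone sketch shows that step with a truncation check. The helper name `format_timestamp_ms` and the constant `TS_MS_STR_LEN` are hypothetical; the SDK's own C_UTIL_TIMESTAMP_MS_LEN may have a different value.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: a millisecond Unix timestamp currently has 13 decimal digits. */
#define TS_MS_STR_LEN 13

/* Formats ts_ms as a decimal string.
 * Returns the number of characters written, or -1 if the buffer is too small. */
int format_timestamp_ms(int64_t ts_ms, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%" PRId64, ts_ms);
    return (n > 0 && (size_t)n < out_size) ? n : -1;
}
```

Checking the return value avoids signing a silently truncated timestamp if the buffer is ever sized too small.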


Sample data

The following code shows a sample request packet generated by the device-side SDK. This packet is the body field of the HTTP request message.

{
    "appId": "Your AppId",
    "deviceName": "Your DeviceName",
    "nonce": "Nonce",
    "requestTime": "Time",
    "sdkVersion": "1.1.0",
    "signature": "Signature"
}

The following code shows the data field of an HTTP response message, which is a registration information packet parsed by a device.

{
    "nonce": "Nonce",
    "responseTime": "Time",
    "appId": "Your AppId",
    "deviceName": "Your DeviceName",
    "signature": "Signature"
}

Note: When you call c_license_analyze_register_rsp, the data for the input parameter must match the format of the sample data.

The following is the debug log:

[LICENSE][I][c_license_gen_register_str]req_str [356][{"appId":"mm_Your AppId","deviceName":"Your DeviceName","nonce":"Nonce","requestTime":"Time","sdkVersion":"1.1.0","signature":"Signature"}]
[LICENSE][I][c_license_analyze_register_rsp]rsp_str [376][{"nonce":"Nonce","responseTime":"Time","appId":"mm_Your AppId","deviceName":"Your DeviceName","signature":"Signature"}]
[LICENSE][I][c_license_analyze_register_rsp]nonce       [Nonce]

The following example shows a complete HTTP request message:

POST /api/device/v1/register HTTP/1.1
Host: bailian.multimodalagent.aliyuncs.com
Accept: */*
Content-Type: application/json
Content-Length: 356

{"appId":"Your AppId","deviceName":"Your DeviceName","nonce":"Nonce","requestTime":"Time","sdkVersion":"1.1.0","signature":"Signature"}
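
A request message of this shape can be assembled from the body string that the SDK generates. The following is a minimal sketch that mirrors the header set of the sample message; the function name `build_http_post` is hypothetical, and opening the TLS connection and writing the bytes to the socket are platform-specific and not shown.

```c
#include <stdio.h>
#include <string.h>

/* Builds an HTTP/1.1 POST message with a JSON body.
 * Returns the message length, or -1 if the output buffer is too small. */
int build_http_post(char *out, size_t out_size,
                    const char *host, const char *api, const char *body)
{
    int n = snprintf(out, out_size,
                     "POST %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Accept: */*\r\n"
                     "Content-Type: application/json\r\n"
                     "Content-Length: %zu\r\n"
                     "\r\n"
                     "%s",
                     api, host, strlen(body), body);
    return (n > 0 && (size_t)n < out_size) ? n : -1;
}
```

Content-Length is computed from the body so that it always matches the request string produced by the SDK.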

The complete received HTTP message is as follows:

HTTP/1.1 200 OK
content-type: application/json
date: Wed, 07 Jan 2026 11:38:25 GMT
req-cost-time: 28
req-arrive-time: 1767785905379
resp-start-time: 1767785905407
x-envoy-upstream-service-time: 27
server: istio-envoy
x-request-id: efa2b37a-4acc-473d-8d86-bfc87ad52821
transfer-encoding: chunked

1ea
{"code":200,"success":true,"message":"success","localizedMsg":null,"data":{"nonce":"Nonce","responseTime":"Time","appId":"Your AppId","deviceName":"Your DeviceName","signature":"Signature"}
0
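
The response above uses chunked transfer encoding: the line 1ea is the chunk size in hexadecimal (490 bytes), and the final 0 marks the end of the body. If your HTTP client does not decode chunks for you, the body must be reassembled before the data field can be extracted. The following sketch decodes a complete chunked body held in memory. The function name `decode_chunked` is hypothetical, chunk extensions and trailers are ignored, and chunk sizes are limited to seven hex digits for simplicity.

```c
#include <stddef.h>
#include <string.h>

static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes a chunked HTTP body from in[0..in_len) into out.
 * Returns the decoded length, or -1 on malformed input or a too-small buffer. */
long decode_chunked(const char *in, size_t in_len, char *out, size_t out_size)
{
    size_t pos = 0, written = 0;

    for (;;) {
        size_t chunk = 0;
        int digits = 0, v;

        /* Parse the hexadecimal chunk size. */
        while (pos < in_len && (v = hex_value(in[pos])) >= 0) {
            if (++digits > 7) return -1;
            chunk = chunk * 16 + (size_t)v;
            pos++;
        }
        if (digits == 0) return -1;

        /* Skip the rest of the size line, including any chunk extension. */
        while (pos + 1 < in_len && !(in[pos] == '\r' && in[pos + 1] == '\n')) pos++;
        if (pos + 1 >= in_len) return -1;
        pos += 2;

        if (chunk == 0) return (long)written;   /* last chunk */

        /* Each chunk is followed by CRLF. */
        if (in_len - pos < chunk + 2) return -1;
        if (out_size - written < chunk) return -1;
        memcpy(out + written, in + pos, chunk);
        written += chunk;
        pos += chunk;
        if (in[pos] != '\r' || in[pos + 1] != '\n') return -1;
        pos += 2;
    }
}
```

After decoding, extract the data object with a JSON parser (the SDK package bundles cJSON) and pass only that object to c_license_analyze_register_rsp.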

4.1.2. Device logon

Each time the device connects to the Alibaba Cloud multimodal interactive AI application, it must first obtain a token. The following code shows an example of how to obtain a token:

int dummy_http_response_for_token(char* rsp_data, int rsp_len)
{
    int err;

    // ... Other business logic

    // Parse the response and obtain the token. The data requirements are the same as above.
    err = c_license_analyze_get_token_rsp(rsp_data);
    
    // ... Other business logic

    return err;
}

int dummy_aliyun_sdk_init(void)
{
    // Initialize the SDK.
    c_mmi_sdk_init();

    if (c_license_device_is_registered() == 0) {
        c_mmi_storage_reset();
        // Pre-configure the AppId.
        c_mmi_storage_set_app_id_str("Your AppId");
        // Pre-configure the AppSecret.
        c_license_set_app_secret_str("Your AppSecret");
        // Pre-configure the DeviceName.
        c_mmi_set_device_name("Your DeviceName");
        // Save the configuration.
        c_mmi_storage_save();
    }
    // Pre-configure the API key. You must configure the API key for the fully managed mode. This is not required for the semi-managed mode.
    // You must reconfigure the API key every time the device starts.
    c_mmi_storage_set_api_key("Your ApiKey");

    mmi_user_config_t mmi_config = C_MMI_CONFIG_DEFAULT();


    // You must configure evt_cb. Otherwise, the SDK may run abnormally.
    mmi_config.evt_cb = _mmi_event_callback;		// Register the event callback function. For more information, see the following sections.
    // Configure the working mode.
    mmi_config.work_mode = C_MMI_MODE_PUSH2TALK;
    mmi_config.text_mode = C_MMI_TEXT_MODE_BOTH;
    // Configure the upstream and downstream audio data formats.
    mmi_config.upstream_mode = C_MMI_STREAM_MODE_PCM;
    mmi_config.downstream_mode = C_MMI_STREAM_MODE_MP3;
    // Configure the buffer size.
    mmi_config.recorder_rb_size = 8 * 1024;
    mmi_config.player_rb_size = 8 * 1024;

    c_mmi_config(&mmi_config);
    
    // Register the device.
    if (c_license_device_is_registered() == 0) {
        char time_ms_str[C_UTIL_TIMESTAMP_MS_LEN + 1];
        snprintf(time_ms_str, sizeof(time_ms_str), "%" PRId64, util_get_timestamp_ms());
        // Generate the registration request string `req` based on time_ms_str (the current timestamp in milliseconds).
        c_device_gen_register_req(req, time_ms_str);
        // Obtain the device information returned by the server.
        dummy_http_request(METHOD, HOST, API, PORT, HEADER, req, dummy_http_response_for_register);
    }

    // Log on to the device.
    if (c_license_is_token_expire() == 0) {
        char time_ms_str[C_UTIL_TIMESTAMP_MS_LEN + 1];
        char api_key[C_MMI_API_KEY_LEN + 1] = { 0 };
        int32_t ret;

        snprintf(time_ms_str, sizeof(time_ms_str), "%" PRId64, util_get_timestamp_ms());
        // Read the stored API key, then generate the logon request string `req_str` based on time_ms_str (the current timestamp in milliseconds).
        ret = c_mmi_storage_get_api_key(api_key);
        if (ret == UTIL_SUCCESS) {
            c_license_gen_get_token_str(req_str, sizeof(req_str), time_ms_str, api_key);
        } else {
            c_license_gen_get_token_str(req_str, sizeof(req_str), time_ms_str, NULL);
        }
        // Obtain the logon information returned by the server.
        dummy_http_request(METHOD, HOST, API, PORT, HEADER, req_str, dummy_http_response_for_token);
    }

    //... Other code. For a complete example, see the following sections.
}

In the preceding code:

  • If you use the semi-managed mode, configure the HTTP communication parameters based on the device logon (getToken) API of the server-side service that you developed.

  • If you use the fully managed mode, Alibaba Cloud provides the HTTP communication parameters. The access information is as follows:

    Host: bailian.multimodalagent.aliyuncs.com

    GetToken API: /api/token/v1/getToken


4.1.3. Sample data

The following code shows a sample request packet generated by the device-side SDK. This packet is the body field of the HTTP request message.

{
    "appId": "Your AppId",
    "deviceName": "Your DeviceName",
    "nonce": "Nonce",
    "requestTime": "Time",
    "sdkVersion": "1.1.0",
    "tokenType": "MMI",
    "signature": "Signature"
}

The following code shows a sample token information packet that the device parses. This packet is the data field of the HTTP response message.

{
    "nonce": "Nonce",
    "responseTime": "Time",
    "appId": "Your AppId",
    "deviceName": "Your DeviceName",
    "requestIp": "Your IP",
    "signature": "Signature"
}

Note: When you call c_license_analyze_get_token_rsp, the input parameter data must have the same format as the sample data.

The following log shows the debugging information:

[LICENSE][I][_gen_get_token_str]plaintext [201][{"apiKey":"Your ApiKey","appId":"Your AppId","deviceName":"Your DeviceName","payMode":"LICENSE","requestTime":"Time","sdkVersion":"1.1.0","tokenType":"MMI"}]
[LICENSE][I][_gen_get_token_str]req_str [462][{"appId":"Your AppId","deviceName":"Your DeviceName","nonce":"Nonce","requestTime":"Time","sdkVersion":"1.1.0","tokenType":"MMI","signature":"Signature"}]
[LICENSE][I][c_license_analyze_get_token_rsp]rsp_str [603][{"nonce":"Nonce","responseTime":"Time","appId":"Your AppId","deviceName":"Your DeviceName","requestIp":"Your IP","signature":"Signature"}]
[LICENSE][I][c_license_analyze_get_token_rsp]nonce      [Nonce]

The following code shows a complete HTTP request message:

POST /api/token/v1/getToken HTTP/1.1
Host: bailian.multimodalagent.aliyuncs.com
Accept: */*
Content-Type: application/json
Content-Length: 462

{"appId":"Your AppId","deviceName":"Your DeviceName","nonce":"Nonce","requestTime":"Time","sdkVersion":"1.1.0","tokenType":"MMI","signature":"Signature"}

The following code shows a complete HTTP response message:

HTTP/1.1 200 OK
content-type: application/json
date: Wed, 07 Jan 2026 11:38:24 GMT
req-cost-time: 54
req-arrive-time: 1767785905624
resp-start-time: 1767785905678
x-envoy-upstream-service-time: 39
server: istio-envoy
x-request-id: da621d59-5911-44ac-a7be-0e5977de40a3
transfer-encoding: chunked

2cd
{"code":200,"success":true,"message":"success","localizedMsg":null,"data":{"nonce":"Nonce","responseTime":"Time","appId":"Your AppId","deviceName":"Your DeviceName","requestIp":"Your IP","signature":"Signature"},"requestId":"RequestId"}
0

4.2. WebSocket communication

After you complete the device logon integration, start debugging the WebSocket communication.

The Alibaba Cloud Model Studio multimodal interactive SDK only processes WebSocket data. It does not handle the sending or receiving of data. We recommend that you create separate threads for sending and receiving WebSocket data. The sample code in this document is based on this interaction method.

WebSocket sample code

// Establish a WebSocket connection.
int dummy_wss_connect(char* host, char* port, char* api, char* header);

// Start the WebSocket sending and receiving threads.
int dummy_wss_thread_start(void* params);

// The actual WebSocket send function.
int dummy_wss_send(int data_type, char* payload_data, int size);

// The actual WebSocket receive function.
int dummy_wss_recv(int* opcode, char* payload_data, int* recv_size);

// This function sends data asynchronously based on the WebSocket protocol.
int dummy_wss_task_send(void);

// This function receives data asynchronously based on the WebSocket protocol.
int dummy_wss_task_recv(void);

4.2.1. Establish a WebSocket connection

The following sample code describes how to establish a WebSocket connection. The SDK provides the host, port, api, and header fields. You must fill in the remaining fields, package the data, and send the request.

int dummy_wss_init(void)
{ 
    // Establish a WebSocket connection and obtain the connection fields.
    char *wss_host = c_mmi_get_wss_host();
    char *wss_port = c_mmi_get_wss_port();
    char *wss_api = c_mmi_get_wss_api();
    char *wss_header = c_mmi_get_wss_header();

    UTIL_LOG_I("work");
    // Establish a WSS connection.
    int ret = dummy_wss_connect(wss_host, wss_port, wss_api, wss_header);
    
    return ret;

}

Note that the WebSocket communication between this SDK and the cloud requires a Transport Layer Security (TLS) tunnel. Configure the following settings:

  • TLS version: TLS 1.2 or later

  • Enable Server Name Indication (SNI).

  • Configure the CA certificate (GlobalSign Root CA - R3). You can also download it from the official GlobalSign website.


The following code shows a sample upgrade request message for establishing a WebSocket connection:

GET <WSS API> HTTP/1.1
Host: <WSS HOST>
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: FhVlQeR4S1N06+1/SU79XA== 
Sec-WebSocket-Version: 13
<WSS HEADER>

The following code shows a sample response message for establishing a WebSocket connection:

HTTP/1.1 101 Switching Protocols
upgrade: websocket
connection: upgrade
sec-websocket-accept: sqchBdVDX8kKBgi90/PFl5+/4VI=
date: Thu, 24 Jul 2025 08:25:24 GMT
server: istio-envoy
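
The upgrade request can be assembled from the four strings that the SDK returns plus a Sec-WebSocket-Key, which per RFC 6455 is the Base64 encoding of a 16-byte random nonce. The following is a sketch under stated assumptions: the function names `ws_make_key` and `ws_build_upgrade_request` are hypothetical, the header string returned by the SDK is assumed to carry no trailing CRLF, and the nonce should come from your platform's random source.

```c
#include <stdint.h>
#include <stdio.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Base64-encodes a 16-byte nonce into a 24-character key plus terminating NUL. */
void ws_make_key(const uint8_t nonce[16], char out[25])
{
    int o = 0;
    for (int i = 0; i < 15; i += 3) {
        uint32_t v = ((uint32_t)nonce[i] << 16) | ((uint32_t)nonce[i + 1] << 8) | nonce[i + 2];
        out[o++] = B64[(v >> 18) & 63];
        out[o++] = B64[(v >> 12) & 63];
        out[o++] = B64[(v >> 6) & 63];
        out[o++] = B64[v & 63];
    }
    /* The last single byte yields two characters and two padding signs. */
    uint32_t last = (uint32_t)nonce[15] << 16;
    out[o++] = B64[(last >> 18) & 63];
    out[o++] = B64[(last >> 12) & 63];
    out[o++] = '=';
    out[o++] = '=';
    out[o] = '\0';
}

/* Builds the HTTP upgrade request. Returns its length, or -1 if out is too small. */
int ws_build_upgrade_request(char *out, size_t out_size, const char *host,
                             const char *api, const char *key, const char *header)
{
    int n = snprintf(out, out_size,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Upgrade: websocket\r\n"
                     "Connection: Upgrade\r\n"
                     "Sec-WebSocket-Key: %s\r\n"
                     "Sec-WebSocket-Version: 13\r\n"
                     "%s\r\n"
                     "\r\n",
                     api, host, key, header);
    return (n > 0 && (size_t)n < out_size) ? n : -1;
}
```

If the header string from the SDK already ends with a line break, drop the extra "\r\n" after it so that the request ends with exactly one empty line.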

Log example

[UT][I][dummy_wss_init]work
[UT][I][dummy_wss_connect]wss update
[UT][I][dummy_wss_connect]request[239][GET <wss_api> HTTP/1.1
Host: <WSS HOST>
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: FhVlQeR4S1N06+1/SU79XA==
Sec-WebSocket-Version: 13
<WSS HEADER>

]
[UT][I][dummy_wss_connect]Reading response...
[UT][I][dummy_wss_connect]response[MMI][187][HTTP/1.1 101 Switching Protocols
upgrade: websocket
connection: upgrade
sec-websocket-accept: sqchBdVDX8kKBgi90/PFl5+/4VI=
date: Thu, 24 Jul 2025 08:25:24 GMT
server: istio-envoy

]
[UT][I][dummy_wss_connect][MMI]done

After you correctly establish the WebSocket connection, you can view the corresponding logs on the platform. In a normal scenario, the connection is maintained for one minute before the server proactively disconnects it.

4.2.2. WebSocket data interaction

When using the Alibaba Cloud Model Studio multimodal interactive SDK, all data exchanged over WebSocket must be processed by the SDK. Otherwise, unexpected issues may occur.

The following sample code describes how to implement WebSocket data interaction by creating threads.

int dummy_wss_task_recv(void)
{
    int opcode;
    char data[8 * 1024];
    int recv_size;

    while(1) {
        // Receive downstream data from the server over WebSocket.
        dummy_wss_recv(&opcode, data, &recv_size);
        if(recv_size) {
            // Send the received downstream data to the SDK for parsing.
            c_mmi_analyze_recv_data(opcode, data, recv_size);
        } else {
            util_msleep(10);
        }
    }
    return 0;
}

int dummy_wss_task_send(void)
{
    uint8_t opcode;
    uint8_t data[8 * 1024];
    uint32_t size;

    while(1) {
        // Obtain the packaged payload data from the SDK.
        size = c_mmi_get_send_data(&opcode, data, sizeof(data));
        if (size == 0) {
            util_msleep(10);
        } else {
            // Pass the payload data to the WebSocket send function and package the frame header for sending.
            dummy_wss_send(opcode, data, size);
        }
    }
    return 0;
}

Notes:

  • The recommended buffer size for the data received over WebSocket varies depending on the downstream data format:

    • For downstream PCM data, we recommend a buffer size of 8 KB or larger.

    • For downstream MP3 data, we recommend a buffer size of 8 KB or larger.

    • For downstream Opus data, we recommend a buffer size of 4 KB or larger.

  • The buffer size for the data sent over WebSocket must not be less than 1.5 KB.

  • When you call the c_mmi_analyze_recv_data or c_mmi_get_send_data function, the opcode is WS_DATA_TYPE_TEXT or WS_DATA_TYPE_BINARY. This value is defined in the SDK header file. To understand this value, refer to the specific platform definitions and the WebSocket protocol.
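
If your platform has no WebSocket library, the send function has to wrap each payload in a frame header itself. The following sketch builds a client-side frame header as defined in RFC 6455: a single unfragmented frame with the mask bit set, because frames sent by a client must be masked. The function names are hypothetical. The opcode values follow the RFC (0x1 for text, 0x2 for binary); how WS_DATA_TYPE_TEXT and WS_DATA_TYPE_BINARY map to them is defined in the SDK header and is not assumed here. Payloads larger than 65535 bytes are not handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes a client frame header (FIN = 1, MASK = 1) into out, which must hold
 * at least 8 bytes. Returns the header length, or 0 if payload_len > 65535. */
size_t ws_build_frame_header(uint8_t opcode, size_t payload_len,
                             const uint8_t mask[4], uint8_t *out)
{
    size_t n = 0;

    out[n++] = (uint8_t)(0x80 | (opcode & 0x0F));
    if (payload_len <= 125) {
        out[n++] = (uint8_t)(0x80 | payload_len);
    } else if (payload_len <= 0xFFFF) {
        out[n++] = (uint8_t)(0x80 | 126);          /* 16-bit extended length follows */
        out[n++] = (uint8_t)(payload_len >> 8);
        out[n++] = (uint8_t)(payload_len & 0xFF);
    } else {
        return 0;                                  /* 64-bit lengths omitted in this sketch */
    }
    for (int i = 0; i < 4; i++) {
        out[n++] = mask[i];
    }
    return n;
}

/* Masks (or unmasks) the payload in place with the 4-byte masking key. */
void ws_mask_payload(uint8_t *payload, size_t len, const uint8_t mask[4])
{
    for (size_t i = 0; i < len; i++) {
        payload[i] ^= mask[i % 4];
    }
}
```

Generate a fresh masking key for every frame, send the header first, and then send the masked payload.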

After you complete the data integration, you can see the following logs:

[UT][I][dummy_wss_thread_start][dummy_wss_task_send] send task [0x135e27180]
[UT][I][dummy_wss_thread_start][dummy_wss_task_recv] recv task [0x135e271a0]
[UT][I][_gen_cmd_start]task_id [<TASK ID>]
[UT][I][_send_cmd_start]send [run-task] [0-2] [0]
[UT][D][c_mmi_analyze_recv_data]recv[109][{"header":{"task_id":"<TASK ID>","event":"task-started","attributes":{}},"payload":{}}]
[UT][I][c_mmi_analyze_recv_data]recv [task-started] [0-2]
[UT][D][c_mmi_analyze_recv_data]recv[192][{"header":{"task_id":"<TASK ID>","event":"result-generated","attributes":{}},"payload":{"output":{"event":"Started","dialog_id":"<DIALOG_ID>"}}}]
[UT][I][_on_payload_event_start]recv [Started] [0-3]
[UT][D][c_mmi_analyze_recv_data]recv[223][{"header":{"task_id":"<TASK ID>","event":"result-generated","attributes":{}},"payload":{"output":{"event":"DialogStateChanged","state":"Listening","dialog_id":"<DIALOG_ID>"}}}]
[UT][I][_on_payload_event_state_change]recv [Listening] [0-4]

5. Develop the voice interaction flow

5.1. Audio data interfaces

The audio data interfaces primarily involve implementing functions for the microphone and audio playback. You must provide these implementations.

The following declarations are for demonstration only. Treat the implementation and business logic as a reference.

// After the recorder starts, the microphone begins to acquire data asynchronously.
int dummy_recorder_async_callback(void);
// Start microphone recording.
int dummy_recorder_start(void);
// Stop microphone recording.
int dummy_recorder_stop(void);
// Get data from the hardware microphone.
int dummy_hw_recorder_get_data(uint8_t* data, uint32_t size);
// Get the working status of the microphone.
int dummy_recoder_is_work(void);

// After the player starts, it begins to play audio asynchronously.
int dummy_player_async_callback(void);
// Start speaker playback.
int dummy_player_start(void);
// Stop speaker playback.
int dummy_player_stop(void);
// Put data into the speaker hardware for playback.
int dummy_hw_player_put_data(uint8_t* data, uint32_t size);
// Get the working status of the speaker.
int dummy_player_is_work(void);

This document uses multiple threads to implement audio playback and microphone recording. The following example is structured accordingly:

void dummy_recorder_task(void)
{
    uint32_t send_size = 0;
    uint32_t size = 640;
    uint8_t* data = (uint8_t*) util_malloc(size);

    // Without a buffer, the task cannot run.
    if (data == NULL) {
        return;
    }

    while(1) {
        if(dummy_recoder_is_work()) {
            send_size = dummy_hw_recorder_get_data(data, size);
            if(send_size){
                // Output the data collected by the audio capture hardware to the SDK ring buffer.
                c_mmi_put_recorder_data(data, send_size);
            } else {
                util_msleep(10);
            }
        
        } else {
            util_msleep(10);
        }
    }
}

void dummy_player_task(void)
{
    uint8_t data[640];
    uint32_t size = 640;
    uint32_t recv_size = 0;

    while(1) {
        if (dummy_player_is_work()) {
            recv_size = c_mmi_get_player_data(data, size);
            if(recv_size){
                // Output the audio data from the SDK ring buffer to the player for playback.
                dummy_hw_player_put_data(data, recv_size);
            } else {
                util_msleep(10);
            }
        } else {
            util_msleep(10);
        }
    }
}
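
How much audio a 640-byte chunk or an 8 KB ring buffer holds depends on the PCM format. The following helper makes that arithmetic explicit. It assumes uncompressed PCM; the sample width and channel count (16-bit mono in the usage note) are assumptions that this document does not state, and the function name `pcm_bytes_to_ms` is hypothetical.

```c
#include <stdint.h>

/* Returns the playback duration in milliseconds (rounded down) of a PCM buffer. */
uint32_t pcm_bytes_to_ms(uint32_t bytes, uint32_t sample_rate_hz,
                         uint32_t channels, uint32_t bytes_per_sample)
{
    uint64_t bytes_per_second = (uint64_t)sample_rate_hz * channels * bytes_per_sample;
    if (bytes_per_second == 0) {
        return 0;
    }
    return (uint32_t)(((uint64_t)bytes * 1000u) / bytes_per_second);
}
```

For example, with 16-bit mono PCM, 640 bytes correspond to 20 ms at 16000 Hz, and an 8192-byte ring buffer holds about 170 ms at the 24000 Hz sample rate shown in the configuration log. Use figures like these to check that the polling interval of your tasks is shorter than the buffer duration.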

5.2. Button interfaces

The example in this document uses a push-to-talk implementation. Therefore, you need to capture two events: button press and button release. This document uses an event callback method and provides only the callback function. You must implement the underlying logic.

The following code is for demonstration only:

// Triggered when the button is released.
int dummy_button_up(void);
// Triggered when the button is pressed.
int dummy_button_down(void);

The following code shows an example of implementation and invocation:

int dummy_button_up(void)
{
    // Turn off the microphone.
    dummy_recorder_stop();
    // Notify the cloud service to start processing audio data.
    c_mmi_stop_speech();

    return 0;
}

int dummy_button_down(void)
{
    // Turn off the speaker.
    dummy_player_stop();
    // Notify the cloud that audio data will be sent soon. The SDK triggers C_MMI_EVENT_SPEECH_START based on the cloud's response to this instruction.
    c_mmi_start_speech();

    return 0;
}
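
How the press and release events are generated depends on your hardware. The following sketch shows one possible approach under stated assumptions: the button level is polled at a fixed interval (1 means pressed), and a change is accepted only after several consecutive identical samples. All `demo_` names are hypothetical. In a real application, the two callbacks would be `dummy_button_down` and `dummy_button_up`.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*demo_button_cb_t)(void);

typedef struct {
    uint8_t stable_level;     /* last debounced level: 1 = pressed */
    uint8_t change_count;     /* consecutive samples that differ from stable_level */
    uint8_t threshold;        /* samples required to accept a level change */
    demo_button_cb_t on_down; /* called once per debounced press */
    demo_button_cb_t on_up;   /* called once per debounced release */
} demo_button_t;

/* Counters and stand-in callbacks so that the sketch can run on its own. */
int demo_down_count = 0;
int demo_up_count = 0;
int demo_on_down(void) { demo_down_count++; return 0; }
int demo_on_up(void)   { demo_up_count++;   return 0; }

void demo_button_init(demo_button_t *b, uint8_t threshold,
                      demo_button_cb_t on_down, demo_button_cb_t on_up)
{
    b->stable_level = 0;
    b->change_count = 0;
    b->threshold = threshold;
    b->on_down = on_down;
    b->on_up = on_up;
}

/* Feed one raw sample. Returns 1 if a press or release event fired, else 0. */
int demo_button_poll(demo_button_t *b, uint8_t raw_level)
{
    raw_level = raw_level ? 1 : 0;

    if (raw_level == b->stable_level) {
        b->change_count = 0;          /* bounce or no change: restart counting */
        return 0;
    }
    b->change_count++;
    if (b->change_count < b->threshold) {
        return 0;                     /* change not yet confirmed */
    }

    /* The new level has been stable long enough: accept it and notify. */
    b->stable_level = raw_level;
    b->change_count = 0;
    if (raw_level) {
        if (b->on_down != NULL) b->on_down();
    } else {
        if (b->on_up != NULL) b->on_up();
    }
    return 1;
}
```

Call `demo_button_poll` from a periodic timer or task with the current GPIO level. Each callback fires once per confirmed transition, so a bouncing contact does not trigger repeated start and stop requests.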

5.3. Event callbacks

The event callbacks related to voice interaction in the Alibaba Cloud Model Studio multimodal interactive SDK are as follows:

enum {
    C_MMI_EVENT_USER_CONFIG,        // User configurations for the SDK, such as audio buffer size, working mode, and timbre, should be implemented in this event callback.
    C_MMI_EVENT_DATA_INIT,	        // This event is triggered after the SDK is initialized. You can start establishing business connections in this event callback.
    C_MMI_EVENT_SPEECH_READY,       // This event is triggered after the WSS connection is established. In push and tap modes, call c_mmi_start_speech() only after this event.
    C_MMI_EVENT_SPEECH_PREPARE,     // This event is triggered when the SDK is ready to start a new round of conversation.
    C_MMI_EVENT_SPEECH_START,       // This event is triggered when the SDK starts audio upstream.
    C_MMI_EVENT_SPEECH_RESTART,     // This event is triggered when the SDK restarts audio upstream.
    C_MMI_EVENT_DATA_DEINIT,	    // This event is triggered after the SDK is unregistered.

    C_MMI_EVENT_ASR_START,	        // This event is triggered when ASR starts returning data.
    C_MMI_EVENT_ASR_INCOMPLETE,	    // This event returns the incomplete ASR text data (full text).
    C_MMI_EVENT_ASR_COMPLETE,	    // This event returns all the completed ASR text data (full text).
    C_MMI_EVENT_ASR_END,		    // This event is triggered when ASR ends.

    C_MMI_EVENT_LLM_INCOMPLETE,	    // This event returns the incomplete LLM text data (full text).
    C_MMI_EVENT_LLM_COMPLETE,	    // This event returns all the completed LLM text data (full text).

    C_MMI_EVENT_TTS_START,	        // This event is triggered when audio downstream starts.
    C_MMI_EVENT_TTS_END,	        // This event is triggered when audio downstream is complete.

    C_MMI_EVENT_HEARTBEAT,	        // This event is triggered when the SDK receives a heartbeat response from the cloud.
};

The following code shows a reference example:

int _mmi_event_callback(uint32_t event, void *param)
{
    char *text;

    text = param;
    switch (event) {
        case C_MMI_EVENT_USER_CONFIG:
            // Start a new conversation.
            c_mmi_reset_dialog_id();
            break;
        case C_MMI_EVENT_DATA_INIT:
            // MMI data is ready. Start the network connection.
            dummy_wss_init();
            break;
        case C_MMI_EVENT_DATA_DEINIT:
            UTIL_LOG_W("will disconnect");
            break;
        case C_MMI_EVENT_SPEECH_START:
            UTIL_LOG_D("enable recorder when send speech");
            dummy_player_stop();
            dummy_recorder_start();
            break;
        case C_MMI_EVENT_ASR_START:
            UTIL_LOG_I("event [C_MMI_EVENT_ASR_START]");
            break;
        case C_MMI_EVENT_ASR_INCOMPLETE:
            UTIL_LOG_D("ASR [%s]", text);
            break;
        case C_MMI_EVENT_ASR_COMPLETE:
            if (text) {
                UTIL_LOG_D("ASR C [%s]", text);
            } else {
                UTIL_LOG_D("ASR C [NULL]");
            }
            break;
        case C_MMI_EVENT_ASR_END:
            UTIL_LOG_D("disable record when ASR complete");
            dummy_recorder_stop();
            break;
        case C_MMI_EVENT_LLM_INCOMPLETE:
            UTIL_LOG_D("LLM [%s]", text);
            break;
        case C_MMI_EVENT_LLM_COMPLETE:
            UTIL_LOG_D("LLM C [%s]", text);
            break;
        case C_MMI_EVENT_TTS_START:
            UTIL_LOG_I("enable player when dialog start");
            dummy_player_start();
            break;
        case C_MMI_EVENT_TTS_END:
            break;
        default:
            break;
    }

    return UTIL_SUCCESS;
}
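
According to the event list, speech start may be called only after `C_MMI_EVENT_SPEECH_READY`. One way to enforce this is a flag that the event callback sets and the button handler checks. The following is a sketch only: the `DEMO_EVENT_*` values stand in for the SDK's `C_MMI_EVENT_*` values, `demo_start_speech` stands in for `c_mmi_start_speech`, and clearing the flag on the deinit event is an assumption rather than documented SDK behavior.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the SDK event IDs so that this sketch compiles on its own. */
enum {
    DEMO_EVENT_SPEECH_READY,   /* stands in for C_MMI_EVENT_SPEECH_READY */
    DEMO_EVENT_DATA_DEINIT,    /* stands in for C_MMI_EVENT_DATA_DEINIT */
    DEMO_EVENT_OTHER
};

/* Set by the event callback, read by the button handler. */
bool demo_speech_ready = false;
int demo_start_speech_calls = 0;

/* Stand-in for c_mmi_start_speech(). */
int demo_start_speech(void)
{
    demo_start_speech_calls++;
    return 0;
}

int demo_event_callback(uint32_t event, void *param)
{
    (void)param;
    switch (event) {
        case DEMO_EVENT_SPEECH_READY:
            demo_speech_ready = true;    /* connection is up: presses are allowed */
            break;
        case DEMO_EVENT_DATA_DEINIT:
            demo_speech_ready = false;   /* assumption: block presses after deinit */
            break;
        default:
            break;
    }
    return 0;
}

int demo_button_down(void)
{
    if (!demo_speech_ready) {
        return -1;                       /* not ready yet: ignore the press */
    }
    return demo_start_speech();
}
```

If the callback and the button handler run in different threads, the plain `bool` may need to be replaced with an atomic or protected by a lock, depending on your platform.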

The following log shows an example:

[UT][I][_on_payload_event_state_change]recv [Listening] [0-4]
[UT][D][dummy_button_down]
[UT][I][dummy_player_stop]
[UT][I][_send_cmd_req2spk]ready to send [1-4]
[UT][I][_send_cmd_speech]send [SendSpeech] [1-5] [0]
[UT][D][dummy_mmi_event_callback]enable recorder when send speech
[UT][I][dummy_recorder_start]
[UT][I][_on_payload_event_speech_start]recv [SpeechStarted][ASR Start] [1-5]
[UT][I][dummy_mmi_event_callback]event [C_MMI_EVENT_ASR_START]
[UT][D][dummy_button_up]
[UT][I][dummy_recorder_stop]
[UT][I][_send_cmd_stop_speech]send [StopSpeech] [1-5] [0]
[UT][I][_on_payload_event_speech_content]recv [SpeechContent][ASR Text] [1-5]
[UT][D][dummy_mmi_event_callback]ASR C [What's the weather like today?]
[UT][I][_on_payload_event_speech_end]recv [SpeechEnded][ASR End] [1-6]
[UT][D][dummy_mmi_event_callback]disable record when ASR complete
[UT][I][dummy_recorder_stop]
[UT][I][_on_payload_event_state_change]recv [Thinking] [1-7]
[UT][D][_on_payload_event_state_change]prepare player rb
[UT][I][_on_payload_event_state_change]recv [Responding][Audio Start] [1-8]
[UT][I][dummy_mmi_event_callback]enable player when dialog start
[UT][I][dummy_player_start]
[UT][I][_on_payload_event_respond_start]recv [RespondingStarted][Audio Start] [1-8]
[UT][I][_on_payload_event_respond_content]recv [RespondingContent][LLM Text] [1-8]
[UT][D][dummy_mmi_event_callback]LLM C [Today in Shanghai, it is cloudy with a high of 33°C. Expect light rain tonight with a low of 27°C. The wind is from the east at Force 1 to 3.]
[UT][I][_on_payload_event_respond_end]recv [RespondingEnded][Audio End] [1-9]
[UT][D][_on_payload_event_respond_end]recv audio data size [371200]

6. Complete SDK build process

6.1. Complete code example

The following code shows a complete example:

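#include <inttypes.h>   // PRId64
#include <stdio.h>      // snprintf
#include "lib_c_license.h"
#include "lib_c_mmi.h"

#define C_SDK_REQ_LEN_REGISTER	500
char req[C_SDK_REQ_LEN_REGISTER];

// Defined later in this example. Declared here because dummy_aliyun_sdk_init() registers it.
int _mmi_event_callback(uint32_t event, void *param);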

int dummy_http_response_for_register(int status, char* rsp_data, int rsp_len)
{
    int err;

    // ... Other business logic

    // Parse the response and complete registration.
    // The parsing function in the SDK requires only the data field of the HTTP response message, so rsp_data must contain only that field.
    err = c_license_analyze_register_rsp(rsp_data);
    c_mmi_storage_save();
    
    // ... Other business logic

    return err;
}

int dummy_http_response_for_token(int status, char* rsp_data, int rsp_len)
{
    int err;

    // ... Other business logic

    // Parse the response and obtain the token. The data requirements are the same as above.
    err = c_license_analyze_get_token_rsp(rsp_data);
    
    // ... Other business logic

    return err;
}


int dummy_aliyun_sdk_init(void)
{
    // Initialize the SDK.
    c_mmi_sdk_init();

    if (c_license_device_is_registered() == 0) {
        c_mmi_storage_reset();
        // Pre-configure the AppId.
        c_mmi_storage_set_app_id_str("Your AppId");
        // Pre-configure the AppSecret.
        c_license_set_app_secret_str("Your AppSecret");
        // Pre-configure the DeviceName.
        c_mmi_set_device_name("Your DeviceName");
        // Save the configuration.
        c_mmi_storage_save();
    }
    // Pre-configure the API key. You must configure the API key for the fully managed mode. This is not required for the semi-managed mode.
    // You must reconfigure the API key every time the device starts.
    c_mmi_storage_set_api_key("Your ApiKey");

    mmi_user_config_t mmi_config = C_MMI_CONFIG_DEFAULT();


    // You must configure evt_cb. Otherwise, the SDK may run abnormally.
    mmi_config.evt_cb = _mmi_event_callback;    // Register the event callback function. For more information, see the following sections.
    // Configure the working mode.
    mmi_config.work_mode = C_MMI_MODE_PUSH2TALK;
    mmi_config.text_mode = C_MMI_TEXT_MODE_BOTH;
    // Configure the upstream and downstream audio data formats.
    mmi_config.upstream_mode = C_MMI_STREAM_MODE_PCM;
    mmi_config.downstream_mode = C_MMI_STREAM_MODE_MP3;
    // Configure the buffer size.
    mmi_config.recorder_rb_size = 8 * 1024;
    mmi_config.player_rb_size = 8 * 1024;

    c_mmi_config(&mmi_config);
    
    // Register the device.
    if (c_license_device_is_registered() == 0) {
        char time_ms_str[C_UTIL_TIMESTAMP_MS_LEN + 1];
        snprintf(time_ms_str, sizeof(time_ms_str), "%" PRId64, util_get_timestamp());
        // Generate the registration request string `req` based on the current timestamp time_ms_str (in milliseconds).
        c_device_gen_register_req(req, time_ms_str);
        // Obtain the device information returned by the server.
        dummy_http_request(METHOD, HOST, API, PORT, HEADER, req, dummy_http_response_for_register);
    }

    // Log on to the device.
    if (c_license_is_token_expire() == 0) {
        char time_ms_str[C_UTIL_TIMESTAMP_MS_LEN + 1];
        char api_key[C_MMI_API_KEY_LEN + 1] = { 0 };
        int32_t ret;

        snprintf(time_ms_str, sizeof(time_ms_str), "%" PRId64, util_get_timestamp());
        // Read the stored API key, if one is configured.
        ret = c_mmi_storage_get_api_key(api_key);
        // Generate the logon request string `req` based on the current timestamp time_ms_str (in milliseconds).
        if (ret == UTIL_SUCCESS) {
            c_license_gen_get_token_str(req, sizeof(req), time_ms_str, api_key);
        } else {
            c_license_gen_get_token_str(req, sizeof(req), time_ms_str, NULL);
        }
        // Obtain the logon information returned by the server.
        dummy_http_request(METHOD, HOST, API, PORT, HEADER, req, dummy_http_response_for_token);
    }

    //... Other code

    return 0;
}

int dummy_wss_init(void)
{ 
    // Establish a WebSocket connection and obtain the connection fields.
    char *wss_host = c_mmi_get_wss_host();
    char *wss_port = c_mmi_get_wss_port();
    char *wss_api = c_mmi_get_wss_api();
    char *wss_header = c_mmi_get_wss_header();
    
    // Establish a WSS connection.
    int ret = dummy_wss_connect(wss_host, wss_port, wss_api, wss_header);
    
    return ret;
}

void dummy_recorder_task(void)
{
    uint32_t send_size = 0;
    uint32_t size = 640;
    uint8_t* data = (uint8_t*) util_malloc(size);

    while(1){
        if(dummy_recoder_is_work()){
            send_size = dummy_hw_recorder_get_data(data, size);
            if(send_size){
                // Output the data collected by the audio capture hardware to the SDK ring buffer.
                c_mmi_put_recorder_data(data, send_size);
            }
            else{
                util_msleep(10);
            }
        
        } else {
            util_msleep(10);
        }
    }
}

void dummy_player_task(void)
{
    uint8_t data[640];
    uint32_t size = 640;
    uint32_t recv_size = 0;

    while(1) {
        if(dummy_player_is_work()){
            recv_size = c_mmi_get_player_data(data, size);
            if(recv_size){
                // Output the audio data from the SDK ring buffer to the player for playback.
                dummy_hw_player_put_data(data, recv_size);
            } else {
                util_msleep(10);
            }
        
        } else {
            util_msleep(10);
        }
    }
}

int dummy_wss_task_recv(void)
{
    int opcode;
    char* payload_data;
    int recv_size;
    while(1){
        dummy_wss_recv(&opcode, &payload_data, &recv_size);
        if(recv_size)
            // Send the received WebSocket data to the SDK for parsing. You only need to send the opcode and payload data.
            c_mmi_analyze_recv_data(opcode, payload_data, recv_size);
        else
            util_msleep(10);
    }
    return 0;
}

int dummy_wss_task_send(void)
{
    uint8_t data_type;
    uint8_t payload_data[1024];
    uint32_t size;

    while(1){
        // Obtain the packaged payload data from the SDK.
        size = c_mmi_get_send_data(&data_type, payload_data, sizeof(payload_data));
        if (size == 0) {
            util_msleep(10);
        } else {
            // Pass the payload data to the WebSocket send function and package the frame header for sending.
            dummy_wss_send(data_type, payload_data, size);
        }
    }
    return 0;
}

int dummy_button_up(void)
{
    // Turn off the microphone.
    dummy_recorder_stop();
    // Notify the cloud service to start processing audio data.
    c_mmi_stop_speech();

    return 0;
}

int dummy_button_down(void)
{
    // Turn off the speaker.
    dummy_player_stop();
    // Notify the cloud that audio data will be sent soon. The SDK triggers C_MMI_EVENT_SPEECH_START based on the cloud's response to this instruction.
    c_mmi_start_speech();

    return 0;
}

int _mmi_event_callback(uint32_t event, void *param)
{
    char *text;

    text = param;
    switch (event) {
        case C_MMI_EVENT_USER_CONFIG:
            // Start a new conversation.
            c_mmi_reset_dialog_id();
            break;
        case C_MMI_EVENT_DATA_INIT:
            // MMI data is ready. Start the network connection.
            dummy_wss_init();
            break;
        case C_MMI_EVENT_DATA_DEINIT:
            UTIL_LOG_W("will disconnect");
            break;
        case C_MMI_EVENT_SPEECH_PREPARE:
            break;
        case C_MMI_EVENT_SPEECH_START:
            UTIL_LOG_D("enable recorder when send speech");
            dummy_player_stop();
            dummy_recorder_start();
            break;
        case C_MMI_EVENT_ASR_START:
            UTIL_LOG_I("event [C_MMI_EVENT_ASR_START]");
            break;
        case C_MMI_EVENT_ASR_INCOMPLETE:
            UTIL_LOG_D("ASR [%s]", text);
            break;
        case C_MMI_EVENT_ASR_COMPLETE:
            if (text) {
                UTIL_LOG_D("ASR C [%s]", text);
            } else {
                UTIL_LOG_D("ASR C [NULL]");
            }
            break;
        case C_MMI_EVENT_ASR_END:
            UTIL_LOG_D("disable record when ASR complete");
            dummy_recorder_stop();
            break;
        case C_MMI_EVENT_LLM_INCOMPLETE:
            UTIL_LOG_D("LLM [%s]", text);
            break;
        case C_MMI_EVENT_LLM_COMPLETE:
            UTIL_LOG_D("LLM C [%s]", text);
            break;
        case C_MMI_EVENT_TTS_START:
            UTIL_LOG_I("enable player when dialog start");
            dummy_player_start();
            break;
        case C_MMI_EVENT_TTS_END:
            break;
        default:
            break;
    }

    return UTIL_SUCCESS;
}

int main(void)
{
    int ret = dummy_aliyun_sdk_init();
    
    // Start the WebSocket sending and receiving threads. You must implement this yourself.
    dummy_wss_thread_start();
    return ret;
}
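
The example leaves `dummy_wss_send` to you, including packaging the WebSocket frame header. The following is a minimal sketch of client-to-server frame encoding as defined in RFC 6455. `demo_ws_encode_frame` is a hypothetical helper, not an SDK function. It assumes that the `data_type` value returned by the SDK can be used directly as the WebSocket opcode, and that the caller supplies a fresh random 4-byte masking key for each frame.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one unfragmented, masked WebSocket frame (client to server).
 * Returns the total frame length written to out, or 0 if out is too small. */
size_t demo_ws_encode_frame(uint8_t opcode,
                            const uint8_t *payload, size_t len,
                            const uint8_t mask_key[4],
                            uint8_t *out, size_t out_cap)
{
    size_t hdr = 2 + 4;                  /* base header + masking key */
    size_t pos = 0;

    if (len > 65535u) {
        hdr += 8;                        /* 64-bit extended payload length */
    } else if (len > 125u) {
        hdr += 2;                        /* 16-bit extended payload length */
    }
    if (out_cap < hdr || out_cap - hdr < len) {
        return 0;
    }

    /* Byte 0: FIN bit set, 4-bit opcode (0x1 = text, 0x2 = binary). */
    out[pos++] = (uint8_t)(0x80u | (opcode & 0x0Fu));

    /* Byte 1: MASK bit set, then the 7-bit length or an extended-length marker. */
    if (len <= 125u) {
        out[pos++] = (uint8_t)(0x80u | len);
    } else if (len <= 65535u) {
        out[pos++] = (uint8_t)(0x80u | 126u);
        out[pos++] = (uint8_t)(len >> 8);
        out[pos++] = (uint8_t)(len & 0xFFu);
    } else {
        out[pos++] = (uint8_t)(0x80u | 127u);
        for (int shift = 56; shift >= 0; shift -= 8) {
            out[pos++] = (uint8_t)(((uint64_t)len >> shift) & 0xFFu);
        }
    }

    /* Masking key, followed by the payload XORed with the key. */
    for (size_t i = 0; i < 4; i++) {
        out[pos++] = mask_key[i];
    }
    for (size_t i = 0; i < len; i++) {
        out[pos++] = (uint8_t)(payload[i] ^ mask_key[i % 4]);
    }
    return pos;
}
```

A `dummy_wss_send` implementation could call this helper and then write the resulting bytes to the TLS socket. RFC 6455 requires frames sent by a client to be masked, which is why the MASK bit is always set here.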

6.2. Complete log example

The following log shows a complete example:

[UT][I][c_storage_init]sdk ver       [0x00000300]
[UT][I][c_storage_init]flag          [0x0000003f]
[UT][I][c_storage_init]app_id        [<APP ID>]
[UT][I][c_storage_init]app_secret    [<APP SECRET>]
[UT][I][c_storage_init]device_name   [<DEVICE NAME>]
[UT][I][c_storage_init]nonce         [<NONCE>]
[UT][I][c_storage_init]dialog_id     [<DIALOG ID>]
[UT][I][c_storage_init]load          [<DATA>]
[UT][I][c_storage_init]device_secret [<DEVICE SECRET>]
[UT][I][c_mmi_register_event_callback]register event callback [<POINT ADDRESS>]
[UT][I][c_mmi_set_work_mode]work_mode[push2talk]
[UT][D][c_mmi_set_text_mode]text_mode[]
[UT][D][c_mmi_set_voice_id]voice_id[longxiaochun_v2]
[UT][D][c_mmi_set_upstream_mode]upstream_mode[pcm]
[UT][D][c_mmi_set_downstream_mode]downstream_mode[pcm]
[UT][I][c_mmi_config]device_name [<DEVICE NAME>]
[UT][I][c_mmi_config]load dialog_id [<DIALOG ID>]
[UT][I][c_mmi_config]done
[UT][I][c_device_gen_register_req]req_str [383][{"appId":"<YOUR APPID>","deviceName":"<YOUR DEVICE NAME>","nonce":"<YOUR NONCE>","requestTime":"1753326620619","sdkVersion":"0.3.2","signature":"<Signature>"}]
[UT][I][c_device_analyze_register_rsp]rsp_str [403][{"nonce":"<YOUR NONCE>","responseTime":"1753326621269","appId":"<YOUR APPID>","deviceName":"<YOUR DEVICE NAME>","signature":"<Signature>"}]
[UT][I][c_device_analyze_register_rsp]nonce  [<NONCE>]
[UT][I][c_dev_gen_get_token_req]plaintext [164][{"appId":"<YOUR APP ID>","deviceName":"<YOUR DEVICE NAME>","payMode":"LICENSE","requestTime":"1753327457730","sdkVersion":"0.3.2","tokenType":"MMI"}]
[UT][I][c_dev_gen_get_token_req]req_str [420][{"appId":"<YOUR APP ID>","deviceName":"<YOUR DEVICE NAME>","nonce":"<NONCE>","requestTime":"1753327457730","sdkVersion":"0.3.2","tokenType":"MMI","signature":"<SIGNATURE>"}]
[UT][I][c_mmi_analyze_get_token_rsp]rsp_str [589][{"nonce":"<NONCE>","responseTime":"1753327458081","appId":"<YOUR APP ID>","deviceName":"<YOUR DEVICE NAME>","requestIp":"YOUR IP","signature":"<SIGNATURE>"}]
[UT][I][c_mmi_analyze_get_token_rsp]nonce    [<NONCE>]
[UT][I][_mmi_event_callback]C_MMI_EVENT_DATA_INIT
[UT][I][dummy_wss_init]work
[UT][I][dummy_wss_connect]wss update
[UT][I][dummy_wss_connect]request[239][GET <wss_api> HTTP/1.1
Host: <WSS HOST>
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: FhVlQeR4S1N06+1/SU79XA==
Sec-WebSocket-Version: 13
<WSS HEADER>

]
[UT][I][dummy_wss_connect]Reading response...
[UT][I][dummy_wss_connect]response[MMI][187][HTTP/1.1 101 Switching Protocols
upgrade: websocket
connection: upgrade
sec-websocket-accept: sqchBdVDX8kKBgi90/PFl5+/4VI=
date: Thu, 24 Jul 2025 08:25:24 GMT
server: istio-envoy

]
[UT][I][dummy_wss_connect][MMI]done
[UT][I][dummy_wss_thread_start][dummy_wss_task_send] send task [0x135e27180]
[UT][I][dummy_wss_thread_start][dummy_wss_task_recv] recv task [0x135e271a0]
[UT][I][_gen_cmd_start]task_id [<TASK ID>]
[UT][I][_send_cmd_start]send [run-task] [0-2] [0]
[UT][D][c_mmi_analyze_recv_data]recv[109][{"header":{"task_id":"<TASK ID>","event":"task-started","attributes":{}},"payload":{}}]
[UT][I][c_mmi_analyze_recv_data]recv [task-started] [0-2]
[UT][D][c_mmi_analyze_recv_data]recv[192][{"header":{"task_id":"<TASK ID>","event":"result-generated","attributes":{}},"payload":{"output":{"event":"Started","dialog_id":"<DIALOG_ID>"}}}]
[UT][I][_on_payload_event_start]recv [Started] [0-3]
[UT][D][c_mmi_analyze_recv_data]recv[223][{"header":{"task_id":"<TASK ID>","event":"result-generated","attributes":{}},"payload":{"output":{"event":"DialogStateChanged","state":"Listening","dialog_id":"<DIALOG_ID>"}}}]
[UT][I][_on_payload_event_state_change]recv [Listening] [0-4]
[UT][D][dummy_button_down]
[UT][I][dummy_player_stop]
[UT][I][_send_cmd_req2spk]ready to send [1-4]
[UT][I][_send_cmd_speech]send [SendSpeech] [1-5] [0]
[UT][D][_mmi_event_callback]enable recorder when send speech
[UT][I][dummy_recorder_start]
[UT][I][_on_payload_event_speech_start]recv [SpeechStarted][ASR Start] [1-5]
[UT][I][_mmi_event_callback]event [C_MMI_EVENT_ASR_START]
[UT][D][dummy_button_up]
[UT][I][dummy_recorder_stop]
[UT][I][_send_cmd_stop_speech]send [StopSpeech] [1-5] [0]
[UT][I][_on_payload_event_speech_content]recv [SpeechContent][ASR Text] [1-5]
[UT][D][_mmi_event_callback]ASR C [What's the weather like today?]
[UT][I][_on_payload_event_speech_end]recv [SpeechEnded][ASR End] [1-6]
[UT][D][_mmi_event_callback]disable record when ASR complete
[UT][I][dummy_recorder_stop]
[UT][I][_on_payload_event_state_change]recv [Thinking] [1-7]
[UT][D][_on_payload_event_state_change]prepare player rb
[UT][W][c_mmi_analyze_recv_data]recv [8000] in thinking
[UT][W][c_mmi_analyze_recv_data]recv [1788] in thinking
[UT][I][_on_payload_event_state_change]recv [Responding][Audio Start] [1-8]
[UT][I][_mmi_event_callback]enable player when dialog start
[UT][I][dummy_player_start]
[UT][I][_on_payload_event_respond_start]recv [RespondingStarted][Audio Start] [1-8]
[UT][I][_on_payload_event_respond_content]recv [RespondingContent][LLM Text] [1-8]
[UT][D][_mmi_event_callback]LLM C [Today in Shanghai, it is cloudy with a high of 33°C. Expect light rain tonight with a low of 27°C. The wind is from the east at Force 1 to 3.]
[UT][I][_on_payload_event_respond_end]recv [RespondingEnded][Audio End] [1-9]
[UT][D][_on_payload_event_respond_end]recv audio data size [371200]