This guide explains how to build and deploy Real-time Conversational AI applications on your smart hardware.
Overview
Real-time Conversational AI provides a comprehensive solution for integrating high-quality, low-latency AI agents into smart hardware such as wearables, companion robots, and smart home devices. This solution enables developers to quickly validate and deploy conversational AI, making always-on AI a reality. Common use cases include AI toys, educational hardware, companion devices, wearable personal assistants, and home voice assistants.
Before you begin
Support is currently provided for four chip models: Jieli AC791, Espressif ESP32-S3/ESP32-P4, and Beken BK7258. To request support for additional chips, submit a ticket.
Activate Real-time Conversational AI and create an agent. For more information, see Quick start for audio/video calls.
Apply for a smart hardware license. For details, contact your business manager.
Procedure
Espressif ESP32-S3/ESP32-P4
How it works
Download SDK and demo source code
Download the source code from the GitHub open-source project.
Environment configuration
To set up the ESP-ADF development environment, see the Espressif Development Board Guide.
Run the demo
Set up the development environment by following the official Espressif documentation. This demo requires 3A audio processing. To prevent an "AUDIO_THREAD: Not found right xTaskCreateRestrictedPinnedToCore" error, apply a patch to the esp-idf directory.

    cd esp-adf/esp-idf   # The esp-idf directory
    git apply ../idf_patches/idf_v5.4_freertos.patch

Clone the source code for this demo to your local machine, open the Kconfig.projbuild project build configuration file, and fill in the Wi-Fi, agent, and license information.

    menu "Example Configuration"

    config WIFI_SSID
        string "WiFi SSID"
        default "xxx"
        help
            SSID (network name) for the example to connect to.

    config WIFI_PASSWORD
        string "WiFi Password"
        default "xxx"
        help
            WiFi password (WPA or WPA2) for the example to use.
            Can be left blank if the network has no security set.

    config AUDIO_PLAY_VOLUME
        int "Audio play volume"
        default 90

    config AUDIO_RECORDER_AEC_ENABLE
        bool
        default y

    config RTC_APP_ID
        string "ARTC AppID associated with the agent (Warning: For development and testing only)"
        default "xxx"

    config RTC_APP_KEY
        string "ARTC AppKey associated with the agent (Warning: For development and testing only)"
        default "xxx"

    config RTC_USER_ID
        string "User ID (A unique ID is recommended for each device)"
        default "123"

    config VOICE_AGENT_ID
        string "Voice agent ID (Create an agent in the console in advance)"
        default "xxx"

    config VOICE_AGENT_REGION
        string "Region of the voice agent"
        default "cn-xxx"

    config LICENSE_PRODUCT_ID
        string "Enter the license product ID. Contact your business manager to obtain it."
        default "xxx"

    config LICENSE_AUTH_CODE
        string "Enter the license authorization code. Contact your business manager to obtain it."
        default "xxx"

    config LICENSE_DEVICE_ID
        string "Device serial number"
        default "xxx"

    endmenu

Important
For information about the agent ID, region, AppID, and AppKey, see Quick start for audio/video calls.
RTC_APP_KEY is used to generate a token locally. This method is intended for development and testing only. For production environments, do not embed the AppKey in your application. Instead, generate the token on your server and send it to the device.
To generate an authentication token, see Generate an ARTC authentication token. For smart hardware, you do not need to Base64-encode the token. Send the raw JSON result directly to the device.
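Because every option in the configuration file defaults to a placeholder, a build can succeed and still fail at runtime. The following is a minimal sketch of a startup check, assuming the options are exposed to C code as CONFIG_* macros through sdkconfig.h, as ESP-IDF normally does for Kconfig options. The helper names config_is_placeholder and count_unset_options are hypothetical and are not part of the demo or the SDK. The fallback defines exist only so the snippet also compiles on a host machine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* On the target these macros come from sdkconfig.h. The fallbacks below
 * only let this sketch compile on a host machine. */
#ifndef CONFIG_WIFI_SSID
#define CONFIG_WIFI_SSID "xxx"
#endif
#ifndef CONFIG_RTC_APP_ID
#define CONFIG_RTC_APP_ID "xxx"
#endif
#ifndef CONFIG_VOICE_AGENT_ID
#define CONFIG_VOICE_AGENT_ID "xxx"
#endif
#ifndef CONFIG_LICENSE_PRODUCT_ID
#define CONFIG_LICENSE_PRODUCT_ID "xxx"
#endif

/* Hypothetical helper: true if a value is missing, empty, or still one of
 * the default placeholders ("xxx" or "cn-xxx"). */
static bool config_is_placeholder(const char *value)
{
    if (value == NULL || value[0] == '\0') {
        return true;
    }
    return strcmp(value, "xxx") == 0 || strcmp(value, "cn-xxx") == 0;
}

/* Hypothetical helper: counts how many of the checked options were left
 * at their default values. */
static int count_unset_options(void)
{
    const char *required[] = {
        CONFIG_WIFI_SSID,
        CONFIG_RTC_APP_ID,
        CONFIG_VOICE_AGENT_ID,
        CONFIG_LICENSE_PRODUCT_ID,
    };
    int unset = 0;
    for (size_t i = 0; i < sizeof(required) / sizeof(required[0]); i++) {
        if (config_is_placeholder(required[i])) {
            unset++;
        }
    }
    return unset;
}
```

Calling count_unset_options() early in app_main() and logging a warning when it returns a non-zero value makes a forgotten setting visible before the device attempts to connect.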
To build and flash the demo project, see the ESP-IDF Programming Guide.
Run the demo. The monitor window displays logs for initialization, key event listener setup, and Wi-Fi connection. When the message "Main: [ 5 ] Initialize finish, listen to events" appears, initialization is complete. If you are using an ESP32-S3-BOX-3B development board, press the Play button (the top-left button) to start a call. The ESP32-P4-Function-EV-Board does not have physical buttons and automatically starts the call after initialization.
I (1309) sleep_gpio: Configure to isolate all GPIO pins in sleep state
I (1315) sleep_gpio: Enable automatic switching of GPIO sleep configuration
I (1322) main_task: Started on CPU0
I (1325) esp_psram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (1333) main_task: Calling app_main()
I (1336) Main: [ 1 ] Initialize start
I (1355) Main: [1.1] Initialize peripherals
I (1356) Main: [ 2 ] Start and wait for Wi-Fi network
W (1368) wifi:Password length matches WPA2 standards, authmode threshold changes from OPEN to WPA2
W (1402) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
W (1410) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
W (1418) PERIPH_WIFI: Wi-Fi disconnected from SSID xxx, auto-reconnect enabled, reconnect after 1000 ms
W (4825) PERIPH_WIFI: Wi-Fi disconnected from SSID xxx, auto-reconnect enabled, reconnect after 1000 ms
W (5894) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:4
I (7426) Main: [2.1] Initializing SNTP
I (9042) Main: [ 3 ] Start codec chip
W (9042) i2c_bus_v2: I2C master handle is NULL, will create new one
W (9082) ES7210: Enable TDM mode. ES7210_SDP_INTERFACE2_REG12: 2
I (9094) Main: [ 4 ] Set up event listener
I (9095) Main: [ 5 ] Initialize finish, listen to events
Troubleshooting
The certificate expired error shown below typically occurs because an unstable Wi-Fi connection prevents successful NTP time synchronization. As a result, the SDK defaults to the epoch time for certificate validation, causing the check to fail.
[1970-01-01 00:00:35.715][ERROR] [license.c:276][hermes_sdk] license response code is 400, dataSize: 321
[1970-01-01 00:00:35.717][ERROR] [license.c:300][hermes_sdk] license refresh fail with code=-3, svcCode=InvalidTimeStamp. Expired, regeustId=c401D243-E831-5AD9-A756-283A94D314FF
[1970-01-01 00:00:35.730][ERROR] [license.c:375][hermes_sdk] parse license response failed with result: -3
[1970-01-01 00:00:35.739][ERROR] [artc_aicall.c:79][hermes_sdk] raise error event code: -10001 msg: action return: -2147481342
To resolve this, ensure your device has a stable Wi-Fi connection, for example, by connecting to a different hotspot. This demo uses simplified Wi-Fi connection logic for demonstration purposes. For production environments, implement a robust mechanism that handles disconnections and re-synchronizes the time after reconnecting.
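One way to avoid validating the license against the epoch time is to delay the call until the system clock looks plausible. The following is a minimal sketch in portable C, assuming that time synchronization updates the system clock in the background. The names clock_is_plausible and wait_for_valid_clock, and the cutoff date, are illustrative assumptions and are not part of the SDK or the demo.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Any timestamp before this cutoff (2024-01-01 00:00:00 UTC) is treated as
 * "clock not yet synchronized". The cutoff itself is an arbitrary assumption. */
#define MIN_VALID_EPOCH ((time_t)1704067200)

/* True if the given time is late enough to be a real, synchronized time
 * rather than a value counted up from the 1970 epoch after boot. */
static bool clock_is_plausible(time_t now)
{
    return now >= MIN_VALID_EPOCH;
}

/* Polls the system clock up to max_attempts times. sleep_fn, if not NULL,
 * is called between attempts so the caller decides how to wait. */
static bool wait_for_valid_clock(int max_attempts, void (*sleep_fn)(void))
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (clock_is_plausible(time(NULL))) {
            return true;
        }
        if (sleep_fn != NULL) {
            sleep_fn();
        }
    }
    return false;
}
```

On the target, sleep_fn could wrap a FreeRTOS delay such as vTaskDelay. If wait_for_valid_clock returns false, retrying the network connection is likely more useful than starting the call, because the license check would fail in the same way as in the log above.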
Beken BK7258
How it works
Download SDK and demo source code
Download the source code from the GitHub open-source project.
Environment configuration
To set up the AVDK development environment, see the Beken ARMINO AVDK Development Framework documentation.
Run the demo
Clone the demo source code to bk_avdk/projects, rename the source project directory to amdemos-bk7258-artc, then open the project build configuration file Kconfig and enter the Wi-Fi, agent, and license information.

    config WIFI_SSID
        string "WiFi SSID"
        default "xxx"

    config WIFI_PASSWORD
        string "WiFi Password"
        default "xxx"

    config ARTC_AICALL_APP_ID
        string "ARTC AppID associated with the agent (Warning: For development and testing only)"
        default "xxx"

    config ARTC_AICALL_APP_KEY
        string "ARTC AppKey associated with the agent (Warning: For development and testing only)"
        default "xxx"

    config ARTC_AICALL_AGENT_ID
        string "Voice agent ID (Create an agent in the console in advance)"
        default "xxx"

    config ARTC_AICALL_AGENT_REGION
        string "Region of the voice agent"
        default "xxx"

    config ARTC_AICALL_LICENSE_PRODUCT_ID
        string "Enter the license product ID. Contact your business manager to obtain it."
        default "xxx"

    config ARTC_AICALL_LICENSE_AUTH_CODE
        string "Enter the license authorization code. Contact your business manager to obtain it."
        default "xxx"

    config ARTC_AICALL_LICENSE_DEVICE_ID
        string "Device serial number"
        default "xxx"

Important
For information about the agent ID, region, AppID, and AppKey, see Quick start for audio/video calls.
The AppKey (ARTC_AICALL_APP_KEY in this project) is used to generate a token locally. This method is intended for development and testing only. For production environments, do not embed the AppKey in your application. Instead, generate the token on your server and send it to the device.
To generate an authentication token, see Generate an ARTC authentication token. For smart hardware, you do not need to Base64-encode the token. Send the raw JSON result directly to the device.
Compile the demo project with the following command. For flashing, see Flash the code.

    make bk7258 PROJECT=amdemos-bk7258-artc

Run the demo. The serial port prints logs for device initialization and the Wi-Fi connection. When the "cli:W(364):BK STA got ip" log appears, initialization and networking are complete. You can then enter artc_aicall start user_id through the serial port to start a call, where user_id is the user ID for the call. To stop the call, enter artc_aicall stop.
[16:00:18] ap:W(106):media_app_mailbox_message_handle media_ap:W(106):mailbox app thread startup complete
[16:00:18] [400] [2208] [2001] [2005] [2200] [2202] [2003] [2205] [2006] [2207]
[16:00:18] cli:W(324):BK STA connected xxx
[16:00:18] cli:W(364):BK STA got ip
[16:00:28] (10254):ntp_sync_to_rtc:cur_time=1772121628,frag=0,us=-1343613349 (10256):datetime_set_nano:sec=1772121628,frag=0,tv_u=-1343613
[16:00:28] 349,us=687165
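The two serial commands have a simple shape: a command name, a subcommand, and, for start, a user ID. The demo's own command handler is not reproduced here. The following is only a sketch of how such a command line could be validated once it has been split into arguments. The type aicall_cmd_t and the function parse_aicall_cmd are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

typedef enum { CMD_INVALID, CMD_START, CMD_STOP } aicall_cmd_t;

/* Hypothetical parser. argv[0] is expected to be "artc_aicall" and argv[1]
 * the subcommand. For "start", the user ID in argv[2] is copied into
 * user_id (truncated to fit user_id_len, always NUL-terminated). */
static aicall_cmd_t parse_aicall_cmd(int argc, char **argv,
                                     char *user_id, size_t user_id_len)
{
    if (argc < 2 || strcmp(argv[0], "artc_aicall") != 0) {
        return CMD_INVALID;
    }
    if (argc == 2 && strcmp(argv[1], "stop") == 0) {
        return CMD_STOP;
    }
    if (argc == 3 && strcmp(argv[1], "start") == 0 && argv[2][0] != '\0') {
        snprintf(user_id, user_id_len, "%s", argv[2]);
        return CMD_START;
    }
    return CMD_INVALID;
}
```

Rejecting a start command without a user ID up front gives a clearer error on the serial console than letting the call fail later.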
Troubleshooting
Ensure your device has a stable Wi-Fi connection when running the demo. If you encounter issues, try connecting to a different hotspot. This demo uses a simplified Wi-Fi connection logic. For production environments, implement a robust mechanism that handles disconnections and re-synchronizes the time after reconnecting.
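A more robust connection mechanism usually needs two things: a bounded retry delay after a disconnect, and a rule that the call may not start until the time has been synchronized again. The following is a minimal sketch of that bookkeeping in portable C. All names (net_state_t, next_backoff_ms, the on_* handlers) and the 1 s to 30 s backoff range are assumptions for illustration. Wiring the handlers to the platform's Wi-Fi and NTP events is left to the application.

```c
#include <stdbool.h>
#include <stdint.h>

#define BACKOFF_BASE_MS 1000u   /* first retry delay (assumed value) */
#define BACKOFF_MAX_MS  30000u  /* upper bound for the delay (assumed value) */

typedef struct {
    bool connected;    /* true while the station has an IP address */
    bool time_synced;  /* false until NTP succeeds after a (re)connect */
    uint32_t retries;  /* consecutive failed reconnect attempts */
} net_state_t;

/* Delay before the next reconnect attempt: 1 s, 2 s, 4 s, ... capped at 30 s. */
static uint32_t next_backoff_ms(const net_state_t *s)
{
    uint32_t shift = s->retries < 5 ? s->retries : 5;
    uint32_t delay = BACKOFF_BASE_MS << shift;
    return delay > BACKOFF_MAX_MS ? BACKOFF_MAX_MS : delay;
}

static void on_wifi_disconnected(net_state_t *s)
{
    s->connected = false;
    s->time_synced = false; /* require a fresh time sync after reconnecting */
}

static void on_reconnect_failed(net_state_t *s)
{
    s->retries++;
}

static void on_wifi_got_ip(net_state_t *s)
{
    s->connected = true;
    s->retries = 0;
}

static void on_ntp_synced(net_state_t *s)
{
    s->time_synced = true;
}

/* The call should only start when the network is up and the clock is valid. */
static bool can_start_call(const net_state_t *s)
{
    return s->connected && s->time_synced;
}
```

Clearing time_synced on every disconnect is deliberately conservative: it costs one extra synchronization after a reconnect, but it prevents a license check from running against a stale or reset clock.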
If you need to port the demo to an existing project, make sure to migrate the following settings from config:
CONFIG_ARTC_IOT_SDK=y // Support artc-iot-sdk
CONFIG_MEDIA=y // Support media service
CONFIG_WIFI_TRANSFER=y // Support Wi-Fi transfer encode frame
CONFIG_AUD_INTF=y // Support audio
CONFIG_FREERTOS_POSIX=y // Support FreeRTOS Posix API
CONFIG_NTP_SYNC_RTC=y // Support NTP sync RTC
JieLi AC791
How it works
Download SDK and demo source code
Download the source code from the GitHub open-source project.
Environment configuration
To set up the development environment for the JieLi AC791 series development board, see the JieLi Development Environment Installation Guide.
Run the demo
Clone the Jieli SDK to your local machine.
Clone the source code for this demo into the apps/demo directory of the Jieli SDK, and rename the directory to demo_artc_aicall.

    - FW-AC79_AIOT_SDK
      - apps
        - demo
          - demo_artc_aicall (Note: The folder name uses underscores.)

In the demo_artc_aicall directory, use the Code::Blocks IDE to open the project file board/wl82/AC791N_DEMO_ARTC_AICALL.cbp.
Open the artc_device_helper.c file and modify the Wi-Fi SSID and PWD.

    // Set the Wi-Fi SSID and password for network connection.
    #define WIFI_STA_SSID "ssid"
    #define WIFI_STA_PWD "pwd"

Open the artc_aicall_demo.c file and modify the license and agent information.

    // License information for the agent call
    #define LICENSE_PRODUCT_ID "xxx" // The product ID of the smart hardware license
    #define LICENSE_AUTH_CODE "xxx"  // The authorization code of the smart hardware license
    #define LICENSE_DEVICE_ID "xxx"  // The unique ID of the device
    // The ID of the user in the agent call
    #define USER_ID "xxxx"
    // The ID of the agent
    #define VOICE_AGENT_ID "xxxx"
    // The region where the agent is located
    #define AGENT_REGION "xxxx"
    // The RTC App ID associated with the agent
    #define RTC_APP_ID "xxxx"
    // The RTC App Key associated with the agent
    #define RTC_APP_KEY "xxxx"

Important
For information about the agent ID, region, AppID, and AppKey, see Quick start for audio/video calls.
RTC_APP_KEY is used to generate a token locally. This method is intended for development and testing only. For production environments, do not embed the AppKey in your application. Instead, generate the token on your server and send it to the device.
To generate an authentication token, see Generate an ARTC authentication token. For smart hardware, you do not need to Base64-encode the token. Send the raw JSON result directly to the device.
Build the demo project. For build instructions, see the official JieLi documentation.
Run the demo. After powering on the device, use the following keys to interact with the agent.
- `K1`: Start the agent call
- `K2`: Hang up the call
- `K3`: Interrupt the agent
- `K4`: Stop sending audio to the agent
- `K5`: Resume sending audio to the agent
To view real-time logs, connect the device's serial port to your computer and use a serial monitoring tool.
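The key assignments listed above amount to a small lookup from key to call action. The following sketch shows that mapping in C. The enum names and the function action_for_key are hypothetical and do not reflect the identifiers used in the demo source or the Jieli SDK key driver.

```c
/* Hypothetical key identifiers, numbered to match the board labels K1..K5. */
typedef enum { KEY_K1 = 1, KEY_K2, KEY_K3, KEY_K4, KEY_K5 } demo_key_t;

/* Hypothetical call actions corresponding to the key list in this guide. */
typedef enum {
    ACTION_NONE,
    ACTION_START_CALL,        /* K1: start the agent call */
    ACTION_HANG_UP,           /* K2: hang up the call */
    ACTION_INTERRUPT_AGENT,   /* K3: interrupt the agent */
    ACTION_STOP_SEND_AUDIO,   /* K4: stop sending audio to the agent */
    ACTION_RESUME_SEND_AUDIO  /* K5: resume sending audio to the agent */
} call_action_t;

/* Maps a key number to its action. Unknown keys map to ACTION_NONE. */
static call_action_t action_for_key(int key)
{
    switch (key) {
    case KEY_K1: return ACTION_START_CALL;
    case KEY_K2: return ACTION_HANG_UP;
    case KEY_K3: return ACTION_INTERRUPT_AGENT;
    case KEY_K4: return ACTION_STOP_SEND_AUDIO;
    case KEY_K5: return ACTION_RESUME_SEND_AUDIO;
    default:     return ACTION_NONE;
    }
}
```

Keeping the mapping in one function makes it straightforward to reassign keys when porting to a board with a different button layout.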