Integrate AICallKit Web SDK

Integrate the AICallKit SDK into your web application to enable AI-powered voice and video agent calls.

Environment requirements

  • Node.js 18.0 or later (latest LTS recommended).

  • Webpack 5 or later, or Vite.

Workflow

Your app uses an AppServer (your business server) to generate an ARTC authentication token, then calls callWithConfig(config) to start a call. During the call, the AICallKit API provides interactive features such as real-time captions and interruptions. AICallKit internally uses the AliVCSDK_ARTC SDK for real-time audio and video.
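The flow above can be sketched as a small helper that fetches the token first and only then starts the call. This is a minimal sketch, not the SDK's own implementation: `CallEngineLike`, `CallConfigLike`, `TokenProvider`, and `startCall` are hypothetical names introduced here, and only `callWithConfig(config)` and the config field names come from this guide. How your AppServer exposes the token is up to you.

```typescript
// Minimal structural types so the sketch compiles without the SDK installed.
// Only callWithConfig(config) and the field names are taken from this guide.
interface CallConfigLike {
  agentId: string;
  agentType: unknown;
  userId: string;
  region: string;
  userJoinToken: string;
}

interface CallEngineLike {
  callWithConfig(config: CallConfigLike): unknown;
}

// Hypothetical AppServer client: resolves to an ARTC token for the given user.
type TokenProvider = (userId: string) => Promise<string>;

async function startCall(
  engine: CallEngineLike,
  getToken: TokenProvider,
  base: Omit<CallConfigLike, 'userJoinToken'>,
): Promise<CallConfigLike> {
  // 1. Ask the AppServer for a token before touching the engine.
  const userJoinToken = await getToken(base.userId);
  if (!userJoinToken) {
    throw new Error('AppServer returned an empty token');
  }

  // 2. Start the call with the complete configuration.
  const config: CallConfigLike = { ...base, userJoinToken };
  engine.callWithConfig(config);
  return config;
}
```

Passing the token provider in as a function keeps the token-signing logic on the server and makes the helper easy to exercise with a stub.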

Integrate the SDK

npm install aliyun-auikit-aicall --save

SDK development guide

Step 1: Create and initialize the engine

Create and initialize an ARTCAICallEngine instance:

// Specify the agent type: audio-only, digital human, or visual understanding.
// Assign a value before the checks below; it is left unset here as a placeholder.
let agentType: AICallAgentType;

// Initialization parameters.
const config = {
  muteMicrophone: false, // Specifies whether to mute the microphone. Set as needed.
  // Other parameters are omitted.
}

// If the agent type is visual understanding, configure the local video preview.
if (agentType === AICallAgentType.VisionAgent) {
  config.previewElement = 'example-preview-element-id';
  config.cameraConfig = {
    width: 450,
    height: 800,
    frameRate: 5,
  };
}

const engine = new ARTCAICallEngine(config);

// If the agent type is digital human, configure the view for displaying the digital human.
if (agentType === AICallAgentType.AvatarAgent) {
  engine.setAgentView('example-agent-element-id');
}

Step 2: Implement callback methods

Implement the engine callback methods as needed. For more information about the engine callback API, see ARTCAICallEngine event details.

// This callback is triggered when an error occurs, for example if the call fails to start.
engine.on('errorOccurred', (code: number) => {
  // An error occurred during the process. Hang up the call.
  engine.hangup();
});

// This callback is triggered if the agent starts the call successfully.
engine.on('agentStarted', () => {
  // The agent instance has started.
});

// This callback is triggered when the current call is connected.
engine.on('callBegin', () => {
  // The call starts.
});

engine.on('callEnd', () => {
  // The call ends.
});

engine.on('agentStateChange', (newState: AICallAgentState) => {
  // The agent state changes.
});

engine.on('agentSubtitleNotify', (subtitle: AICallSubtitleData) => {
  // Notification for the agent's response result.
});

engine.on('userSubtitleNotify', (subtitle: AICallSubtitleData, voiceprintResult: AICallVoiceprintResult) => {
  // Notification for the user's speech recognition result.
});

engine.on('voiceIdChanged', (voiceId: string) => {
  // The voice timbre for the current call has changed.
});

engine.on('voiceInterruptChanged', (enable: boolean) => {
  // The status of the smart voice interruption for the current call has changed.
});
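The two subtitle callbacks above can feed a small caption log for display. The sketch below is illustrative only: the fields of `SubtitleUpdate` (`sentenceId`, `text`, `end`) are assumptions about what a subtitle payload might carry and are not confirmed by this guide, so check them against the AICallSubtitleData reference. It also assumes each update carries the full text of the sentence so far, so a newer update replaces the older one. `CaptionLog` is a hypothetical helper.

```typescript
// Assumed shape of a subtitle update; the real AICallSubtitleData may differ.
interface SubtitleUpdate {
  sentenceId: number; // hypothetical: identifies the sentence being updated
  text: string;       // hypothetical: full text of the sentence so far
  end: boolean;       // hypothetical: true once the sentence is complete
}

type Speaker = 'agent' | 'user';

interface CaptionLine {
  speaker: Speaker;
  sentenceId: number;
  text: string;
  final: boolean;
}

class CaptionLog {
  private readonly lines: CaptionLine[] = [];

  // Replace the line for the same speaker and sentence, or append a new one.
  update(speaker: Speaker, update: SubtitleUpdate): void {
    const existing = this.lines.find(
      (line) => line.speaker === speaker && line.sentenceId === update.sentenceId,
    );
    if (existing) {
      existing.text = update.text;
      existing.final = update.end;
      return;
    }
    this.lines.push({
      speaker,
      sentenceId: update.sentenceId,
      text: update.text,
      final: update.end,
    });
  }

  // One display string per caption line, in arrival order.
  render(): string[] {
    return this.lines.map((line) => `${line.speaker}: ${line.text}`);
  }
}
```

Under these assumptions, the agentSubtitleNotify handler would call `log.update('agent', subtitle)` and the userSubtitleNotify handler would call `log.update('user', subtitle)`, then re-render the caption area.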

Step 3: Create and initialize AICallConfig

For more information about the structure, see AICallConfig.

const callConfig = {
  agentId: "xxx",             // The agent ID.
  agentType: agentType,       // The agent type.
  userId: "xxx",              // Use the user ID from your app logon.
  region: "xx-xxx",           // The region where the agent service is located.
  userJoinToken: "xxxxxxxxx", // The RTC token.
};

Region Name         Region ID
China (Hangzhou)    cn-hangzhou
China (Shanghai)    cn-shanghai
China (Beijing)     cn-beijing
China (Shenzhen)    cn-shenzhen
Singapore           ap-southeast-1
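To catch a mistyped region before starting a call, the regions listed above can be kept in a typed lookup. This is a sketch that covers only the regions named in this guide; `REGION_IDS` and `isListedRegionId` are hypothetical names, and the service may support regions that are not listed here.

```typescript
// Region names and IDs exactly as listed in this guide.
const REGION_IDS = {
  'China (Hangzhou)': 'cn-hangzhou',
  'China (Shanghai)': 'cn-shanghai',
  'China (Beijing)': 'cn-beijing',
  'China (Shenzhen)': 'cn-shenzhen',
  'Singapore': 'ap-southeast-1',
} as const;

type RegionName = keyof typeof REGION_IDS;
type RegionId = (typeof REGION_IDS)[RegionName];

// Type guard: true only for region IDs that appear in the list above.
function isListedRegionId(value: string): value is RegionId {
  return (Object.values(REGION_IDS) as string[]).includes(value);
}
```

A check such as `isListedRegionId(callConfig.region)` before the call gives an early, readable failure instead of a connection error later.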

Set userJoinToken to the ARTC authentication token generated by your AppServer.

Step 4: Initiate an agent call

Call callWithConfig(config) to start the agent call:

// Start the call with the configuration created in Step 3.
engine.callWithConfig(callConfig);

// As in Step 2, this callback is triggered when the current call is connected.
engine.on('callBegin', () => {
  // The call starts.
});
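If your UI needs to await the connection, the callBegin and errorOccurred events can be wrapped in a promise. The sketch below is one possible approach under stated assumptions: `EngineLike` is a hypothetical structural type that relies only on `on` and `callWithConfig` as shown in this guide, and `callAndWaitForConnect` and its timeout are illustrative, not part of the SDK. The handlers are registered before the call starts so that an early event is not missed.

```typescript
// Hypothetical structural type; only on() and callWithConfig() come from this guide.
interface EngineLike {
  on(event: string, handler: (...args: any[]) => void): unknown;
  callWithConfig(config: unknown): unknown;
}

function callAndWaitForConnect(
  engine: EngineLike,
  config: unknown,
  timeoutMs = 15000, // illustrative default, not an SDK value
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    let settled = false;
    let timer: ReturnType<typeof setTimeout> | undefined;

    // Settle at most once and always clear the timeout.
    const settle = (finish: () => void): void => {
      if (settled) return;
      settled = true;
      if (timer !== undefined) clearTimeout(timer);
      finish();
    };

    // Register handlers before starting the call.
    engine.on('callBegin', () => settle(() => resolve()));
    engine.on('errorOccurred', (code: number) =>
      settle(() => reject(new Error(`Call failed with error code ${code}`))),
    );

    timer = setTimeout(
      () => settle(() => reject(new Error(`Call did not connect within ${timeoutMs} ms`))),
      timeoutMs,
    );

    try {
      // Covers both a synchronous throw and a rejected promise, if one is returned.
      Promise.resolve(engine.callWithConfig(config)).catch((err: unknown) =>
        settle(() => reject(err)),
      );
    } catch (err) {
      settle(() => reject(err));
    }
  });
}
```

The handlers stay registered after the promise settles and simply do nothing for later events, so keep the errorOccurred handler from Step 2 to deal with errors that occur during the call.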

Step 5: Implement in-call features

After the call connects, add features such as captions and agent interruption. For details, see Feature implementation.

Step 6: End the call

Call hangup() to end the call:

engine.hangup();
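Several code paths can end a call, such as an error handler, a hang-up button, and the page being closed. The sketch below shows one way to route them through a single guard so that hangup() runs once per call. `HangupCapable`, `EventTargetLike`, `createHangupOnce`, and `hangupOnPageHide` are hypothetical helpers, and this guide does not state whether calling hangup() twice is harmful, so treat the guard as a precaution.

```typescript
// Hypothetical structural types; only hangup() comes from this guide.
interface HangupCapable {
  hangup(): unknown;
}

interface EventTargetLike {
  addEventListener(type: string, listener: () => void): void;
}

// Returns a function that calls engine.hangup() the first time only.
function createHangupOnce(engine: HangupCapable): () => void {
  let done = false;
  return () => {
    if (done) return;
    done = true;
    engine.hangup();
  };
}

// Ends the call when the page is hidden or closed.
function hangupOnPageHide(target: EventTargetLike, hangupOnce: () => void): void {
  target.addEventListener('pagehide', hangupOnce);
}
```

In a browser, `hangupOnPageHide(window, hangupOnce)` would attach the guard to the page lifecycle, and the same `hangupOnce` function can be passed to a button click handler. Create a new guard for each call.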