Integration overview

This topic describes how to integrate an audio and video intelligent agent into your Android app using the AICallKit SDK.

Prerequisites

  • Android Studio 4.1.3 or later

  • Gradle 7.0.2 or later

  • JDK 11, which is bundled with Android Studio

Workflow

Your app obtains an RTC token from your App Server and then calls call(rtcToken) to start a call. During the call, you can use the AICallKit API to implement interactive features such as subtitles and interrupting the agent.

AICallKit relies on real-time audio and video capabilities and already includes the functionality of the AliVCSDK_ARTC SDK. If your business scenario also requires live streaming or video-on-demand, you can use an ApsaraVideo MediaBox SDK, such as AliVCSDK_Standard or AliVCSDK_InteractiveLive. To learn how to combine the SDKs, see Select and download an SDK.
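The token-then-call order described above can be sketched in plain Java. This is only an illustration: CallStarter, the token Supplier, and the call Consumer are hypothetical stand-ins that are not part of AICallKit. In a real app, the supplier would request the token from your App Server and the consumer would invoke engine.call(token).

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper (not part of AICallKit): obtain a token first, then start the call. */
final class CallStarter {
    private final Supplier<String> tokenSource; // Stands in for the request to your App Server.
    private final Consumer<String> callAction;  // Stands in for engine.call(token).

    CallStarter(Supplier<String> tokenSource, Consumer<String> callAction) {
        this.tokenSource = tokenSource;
        this.callAction = callAction;
    }

    /** Returns true if the call action was invoked, false if no usable token was obtained. */
    boolean start() {
        String token = tokenSource.get();
        if (token == null || token.trim().isEmpty()) {
            return false; // Never start a call without a token.
        }
        callAction.accept(token);
        return true;
    }
}
```

Keeping the token source behind an interface makes it easy to move the network request off the main thread without changing the call logic.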

Integrate the SDK

  1. Add the Alibaba Cloud Maven repository to your project-level build.gradle file.

    allprojects {
        repositories {
            google()
            jcenter()
            maven { url 'https://maven.aliyun.com/repository/google' }
            maven { url 'https://maven.aliyun.com/repository/public' }
        }
    }

  2. In your app-level build.gradle file, add the ARTCAICallKit dependency along with the AliVCSDK_ARTC and PluginAEC dependencies.

    dependencies {
        // Replace x.x.x with a version that is compatible with your project.
        implementation 'com.aliyun.aio:AliVCSDK_ARTC:x.x.x'
        implementation 'com.aliyun.auikits.android:ARTCAICallKit:x.x.x'
        implementation 'com.alivc.live.component:PluginAEC:2.0.0'
    }

    Note

    • Latest ARTC SDK version: 7.10.0

    • Latest AICallKit SDK version: 2.11.0

SDK development

Step 1: Request audio and video permissions

Check if your app has microphone and camera permissions. If not, prompt the user to grant them. You must implement this permission request logic in your app. For a code example, see PermissionUtils.java.

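PermissionX.init(this)
        .permissions(PermissionUtils.getPermissions())
        .request((allGranted, grantedList, deniedList) -> {
            // Continue only if allGranted is true. Otherwise, tell the user
            // which permissions in deniedList are still required.
        });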
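The check that precedes the prompt can be sketched in plain Java as follows. CallPermissions is a hypothetical helper, not part of AICallKit or PermissionUtils.java. It assumes the two permissions named above map to the standard Android permission strings android.permission.RECORD_AUDIO and android.permission.CAMERA. On a device, the granted set would be built from checks such as ContextCompat.checkSelfPermission.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: works out which call permissions still have to be requested. */
final class CallPermissions {
    // Assumption: microphone and camera map to these standard Android permission strings.
    static final List<String> REQUIRED = Arrays.asList(
            "android.permission.RECORD_AUDIO",
            "android.permission.CAMERA");

    private CallPermissions() {}

    /** Returns the required permissions that are absent from the granted set, in REQUIRED order. */
    static List<String> missing(Set<String> granted) {
        List<String> result = new ArrayList<>();
        for (String permission : REQUIRED) {
            if (!granted.contains(permission)) {
                result.add(permission);
            }
        }
        return result;
    }
}
```

An empty result means the app can proceed to create the engine. A non-empty result is the list to pass to the permission prompt.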

Step 2: Create and initialize the engine

Create and initialize the ARTCAICallEngine. The following code shows an example:

String userId = "123";  // We recommend using the user ID from your app's login.
ARTCAICallEngineImpl engine = new ARTCAICallEngineImpl(this, userId);

// The R.id.* values below are placeholder IDs. Use the containers from your own layout.
// If the agent is a digital human (AvatarAgent), configure its view container.
if (aiAgentType == AvatarAgent) {
    ViewGroup avatarLayer = findViewById(R.id.avatar_layer);
    engine.setAgentView(
        avatarLayer,
        new ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                                   ViewGroup.LayoutParams.MATCH_PARENT)
    );
}
// If the agent provides visual understanding (VisionAgent), configure the local video preview container.
else if (aiAgentType == VisionAgent) {
    ViewGroup previewLayer = findViewById(R.id.preview_layer);
    engine.setLocalView(
        previewLayer,
        new FrameLayout.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                                     ViewGroup.LayoutParams.MATCH_PARENT)
    );
}
// If the agent supports video calls (VideoAgent), configure both the agent view and the local preview.
else if (aiAgentType == VideoAgent) {
    ARTCAICallEngine.ARTCAICallVideoCanvas remoteCanvas = new ARTCAICallEngine.ARTCAICallVideoCanvas();
    remoteCanvas.zOrderOnTop = false;
    remoteCanvas.zOrderMediaOverlay = false;

    ViewGroup avatarLayer = findViewById(R.id.avatar_layer);
    engine.setAgentView(
        avatarLayer,
        new ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                                   ViewGroup.LayoutParams.MATCH_PARENT),
        remoteCanvas
    );

    ViewGroup previewLayer = findViewById(R.id.preview_layer);
    engine.setLocalView(
        previewLayer,
        new FrameLayout.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                                     ViewGroup.LayoutParams.MATCH_PARENT)
    );
}
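The branching above amounts to a small lookup from agent type to the view containers that must be configured. The sketch below restates it in plain Java. AgentKind is a hypothetical mirror of the agent types, not an AICallKit type, and the voice-only entry is an inference from the fact that the example configures no view for it.

```java
/** Hypothetical mirror of the agent types used above; not an AICallKit type. */
enum AgentKind {
    VOICE(false, false),  // Inferred: the example above configures no view for a voice-only agent.
    AVATAR(true, false),  // Digital human: agent view only.
    VISION(false, true),  // Visual understanding: local preview only.
    VIDEO(true, true);    // Video call: agent view and local preview.

    private final boolean agentView;
    private final boolean localPreview;

    AgentKind(boolean agentView, boolean localPreview) {
        this.agentView = agentView;
        this.localPreview = localPreview;
    }

    /** True if a container for the agent's video must be configured. */
    boolean needsAgentView() {
        return agentView;
    }

    /** True if a container for the local camera preview must be configured. */
    boolean needsLocalPreview() {
        return localPreview;
    }
}
```

A table like this can also help decide whether the camera permission has to be requested at all, since only the types that need a local preview capture video.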

Step 3: Implement callbacks

Implement the necessary engine callbacks to handle various events. For details on the callback interface, see the API reference.

protected ARTCAICallEngine.IARTCAICallEngineCallback mCallEngineCallback = new ARTCAICallEngine.IARTCAICallEngineCallback() {
    @Override
    public void onErrorOccurs(ARTCAICallEngine.AICallErrorCode errorCode) {
        // An error occurred. End the call.
        engine.handup();
    }

    @Override
    public void onCallBegin() {
        // The call starts (the user has joined the session).
    }

    @Override
    public void onCallEnd() {
        // The call ends (the user has left the session).
    }

    @Override
    public void onAICallEngineRobotStateChanged(ARTCAICallEngine.ARTCAICallRobotState oldRobotState, ARTCAICallEngine.ARTCAICallRobotState newRobotState) {
        // The agent's state changes.
    }

    @Override
    public void onUserSpeaking(boolean isSpeaking) {
        // The user's speaking state changes (isSpeaking indicates whether the user is speaking).
    }

    @Override
    public void onUserAsrSubtitleNotify(String text, boolean isSentenceEnd, int sentenceId, VoicePrintStatusCode voicePrintStatusCode) {
        // Received subtitles for the user's speech (ASR result).
    }

    @Override
    public void onAIAgentSubtitleNotify(String text, boolean end, int userAsrSentenceId) {
        // Received subtitles for the agent's response.
    }

    @Override
    public void onNetworkStatusChanged(String uid, ARTCAICallEngine.ARTCAICallNetworkQuality quality) {
        // The network status changes.
    }

    @Override
    public void onVoiceVolumeChanged(String uid, int volume) {
        // A user's voice volume changes.
    }

    @Override
    public void onVoiceIdChanged(String voiceId) {
        // The voice for the current call changes.
    }

    @Override
    public void onVoiceInterrupted(boolean enable) {
        // The voice interruption setting for the current call changes.
    }

    @Override
    public void onAgentVideoAvailable(boolean available) {
        // The availability of the agent's video changes (available is true while the agent is publishing video).
    }

    @Override
    public void onAgentAudioAvailable(boolean available) {
        // The availability of the agent's audio changes (available is true while the agent is publishing audio).
    }

    @Override
    public void onAgentAvatarFirstFrameDrawn() {
        // The first video frame of the digital human is rendered.
    }

    @Override
    public void onUserOnLine(String uid) {
        // A remote user comes online.
    }

};

engine.setEngineCallback(mCallEngineCallback);
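As an illustration of what a subtitle callback body might do, the sketch below keeps one line of text per user sentence. SubtitleTracker is a hypothetical class, not part of AICallKit. It assumes that each notification carries the latest full text for its sentenceId, so a later notification replaces an earlier one. Check this assumption against the API reference before relying on it.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: tracks the latest subtitle text for each user sentence. */
final class SubtitleTracker {
    // Insertion-ordered, so sentences appear in the order they were first seen.
    private final Map<Integer, String> textBySentence = new LinkedHashMap<>();
    private final Set<Integer> completed = new HashSet<>();

    /** Intended to be called from onUserAsrSubtitleNotify with the same arguments. */
    void onUserText(String text, boolean isSentenceEnd, int sentenceId) {
        // Assumption: text is the full sentence so far, not an increment.
        textBySentence.put(sentenceId, text);
        if (isSentenceEnd) {
            completed.add(sentenceId);
        }
    }

    /** True once a notification with isSentenceEnd has been seen for the sentence. */
    boolean isComplete(int sentenceId) {
        return completed.contains(sentenceId);
    }

    /** Returns one line per sentence, in the order the sentences were first seen. */
    String transcript() {
        return String.join("\n", textBySentence.values());
    }
}
```

Because the callbacks may not arrive on the main thread, a real implementation would post the transcript to the UI thread before updating any views.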

Step 4: Create and initialize ARTCAICallConfig

For details, see the ARTCAICallConfig reference.

ARTCAICallEngine.ARTCAICallConfig artcaiCallConfig = new ARTCAICallEngine.ARTCAICallConfig();
artcaiCallConfig.agentId = "XXX";            // Required. The agent ID.
artcaiCallConfig.region = "cn-shanghai";     // Required. The agent region.
artcaiCallConfig.agentType = VoiceAgent;     // Specify the agent type: voice-only, digital human, visual understanding, or video call.
engine.init(artcaiCallConfig);

Region name         Region ID
China (Hangzhou)    cn-hangzhou
China (Shanghai)    cn-shanghai
China (Beijing)     cn-beijing
China (Shenzhen)    cn-shenzhen
Singapore           ap-southeast-1
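A mistyped region ID is easy to miss, so it can help to validate the value before assigning it to the config. The sketch below hard-codes the region IDs listed above. AgentRegions is a hypothetical helper, not part of AICallKit, and the list reflects only this page, so treat it as a convenience check rather than an authoritative list.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: looks up the region IDs listed on this page. */
final class AgentRegions {
    private static final Map<String, String> NAMES_BY_ID = new LinkedHashMap<>();

    static {
        NAMES_BY_ID.put("cn-hangzhou", "China (Hangzhou)");
        NAMES_BY_ID.put("cn-shanghai", "China (Shanghai)");
        NAMES_BY_ID.put("cn-beijing", "China (Beijing)");
        NAMES_BY_ID.put("cn-shenzhen", "China (Shenzhen)");
        NAMES_BY_ID.put("ap-southeast-1", "Singapore");
    }

    private AgentRegions() {}

    /** True if the region ID appears in the list above. */
    static boolean isListed(String regionId) {
        return regionId != null && NAMES_BY_ID.containsKey(regionId);
    }

    /** Returns the region name for the ID, or null if the ID is not listed. */
    static String displayName(String regionId) {
        return regionId == null ? null : NAMES_BY_ID.get(regionId);
    }
}
```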

Step 5: Start a call with the agent

Call call(token) to start a call with the agent. To obtain an authentication token, see Generate an ARTC authentication token.

engine.call(token);

// After the call is connected, the onCallBegin callback implemented in Step 3 is triggered.
@Override
public void onCallBegin() {
    // The call starts (the user has joined the session).
}
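Because the call connects asynchronously, an app usually needs to know whether it is idle, connecting, or in a call, for example to decide whether a hang-up button should be enabled. The sketch below is one possible way to track this from the callbacks. CallStateTracker and its states are hypothetical and not part of AICallKit.

```java
/** Hypothetical helper: tracks the call lifecycle from the engine callbacks. */
final class CallStateTracker {
    enum State { IDLE, CONNECTING, IN_CALL, ENDED }

    private State state = State.IDLE;

    /** Call right after engine.call(token) has been invoked. */
    void onCallRequested() {
        if (state == State.IDLE || state == State.ENDED) {
            state = State.CONNECTING;
        }
    }

    /** Call from the onCallBegin callback. Ignored unless a call was requested. */
    void onCallBegin() {
        if (state == State.CONNECTING) {
            state = State.IN_CALL;
        }
    }

    /** Call from the onCallEnd callback. Ignored if no call is active. */
    void onCallEnd() {
        if (canHangUp()) {
            state = State.ENDED;
        }
    }

    /** True while a call is being set up or is in progress. */
    boolean canHangUp() {
        return state == State.CONNECTING || state == State.IN_CALL;
    }

    State state() {
        return state;
    }
}
```

Out-of-order events are ignored rather than thrown, which keeps the tracker safe to call from callbacks whose ordering the app does not control.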

Step 6: Implement in-call features

During the call, you can implement features such as subtitles and interrupting the agent. For more information, see Implement features.

Step 7: End the call

Call handup() to end the call.

engine.handup();