During a voice call between a user and an AI agent, you can hand off the conversation to a human agent for further assistance.
What is human takeover?
Human takeover seamlessly transfers a conversation from an AI agent to a human agent when the AI agent cannot resolve a user's issue. The handoff can be triggered by predefined rules, a user request, or LLM-based intent detection.
Limitations
- To use the human takeover feature, you must submit a ticket with your workflow ID. The feature will then be enabled for your workflow.
- After a human agent takes over a conversation, you cannot switch it back to the AI agent.
How it works
The workflow has three key stages: triggering the human takeover, notifying the human agent, and the human agent joining the conversation.
- Trigger human takeover (Step 5):
  - When a user requests a human takeover, implement the notification logic for the human agent.
  - To enable a large language model (LLM) to determine when a human takeover is needed, create a custom plugin in Alibaba Cloud Model Studio.
- Notify the human agent: This step notifies the web client to initiate the human takeover. Implement the notification mechanism that connects the LLM, your App Server, and the web client.
- Human agent joins the conversation: After the human agent takes over, implement a method for them to receive the transcript from the AI agent. For details, see Join the channel from the agent-side web client.
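The trigger stage can be sketched as a small rule check on your App Server. This is a minimal sketch, not part of any SDK: `TAKEOVER_KEYWORDS`, `shouldTriggerTakeover`, the `failedTurns` counter, and the `maxFailedTurns` threshold are hypothetical names chosen for illustration. The LLM-based path is represented only by a boolean flag that your own plugin would have to supply.

```javascript
// Hypothetical rule set: phrases that count as an explicit request for a human.
const TAKEOVER_KEYWORDS = ['human agent', 'real person', 'talk to a human'];

/**
 * Decide whether to hand the call off to a human agent.
 * @param {object} turn
 * @param {string} turn.userText - Latest transcribed user utterance.
 * @param {number} turn.failedTurns - Consecutive turns the AI agent could not resolve (assumption).
 * @param {boolean} [turn.llmRequestedTakeover] - Set by your own LLM plugin (assumption).
 * @param {number} [maxFailedTurns=3] - Hypothetical threshold for the predefined rule.
 * @returns {{ takeover: boolean, reason: string | null }}
 */
function shouldTriggerTakeover(turn, maxFailedTurns = 3) {
  const text = turn.userText.toLowerCase();

  // 1. Explicit user request.
  if (TAKEOVER_KEYWORDS.some((keyword) => text.includes(keyword))) {
    return { takeover: true, reason: 'user_request' };
  }
  // 2. LLM-based intent detection, reported by your plugin.
  if (turn.llmRequestedTakeover === true) {
    return { takeover: true, reason: 'llm_intent' };
  }
  // 3. Predefined rule: too many unresolved turns in a row.
  if (turn.failedTurns >= maxFailedTurns) {
    return { takeover: true, reason: 'predefined_rule' };
  }
  return { takeover: false, reason: null };
}
```

When the function returns `takeover: true`, your App Server would then run whatever notification mechanism you have built to alert the human agent's web client.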
Implement human takeover
Web client takeover workflow
Joining the channel (agent client)
- The agent client calls the Enable human takeover mode API.
- The agent client obtains information for joining the channel, including an RTC token.
- The agent client uses the channel information to join the RTC channel, publish and subscribe to streams, and receive the transcript from the AI agent. For more information about RTC integration, see Get started with ARTC SDK for Web.
After the human takeover, you can choose to use the AI agent's voice or the human agent's own voice. To ensure the voice remains consistent with the original AI agent, you must perform voice cloning beforehand. For details, see Personalized Voice.
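The joining steps above can be sketched as a single orchestration function. This is only an outline under stated assumptions: `joinAsHumanAgent` and its three injected callbacks are hypothetical, and the `token` field name on the returned channel information is assumed for illustration. Replace the callbacks with your own App Server request and your actual ARTC SDK calls.

```javascript
/**
 * Run the joining steps in order. All callbacks are hypothetical and
 * supplied by the caller:
 *  - enableTakeover(): wraps your request to the Enable human takeover mode API
 *    and resolves with the channel information.
 *  - joinRtc(channelInfo): wraps the ARTC SDK call that joins the channel.
 *  - onJoined(channelInfo): optional hook for publishing and subscribing.
 */
async function joinAsHumanAgent({ enableTakeover, joinRtc, onJoined }) {
  // Enable takeover mode and obtain the channel information.
  const channelInfo = await enableTakeover();

  // Assumption: the RTC token is exposed as `channelInfo.token`.
  if (!channelInfo || !channelInfo.token) {
    throw new Error('Takeover response did not include an RTC token');
  }

  // Join the RTC channel with the returned information.
  await joinRtc(channelInfo);

  // Let the caller start publishing and subscribing to streams.
  if (onJoined) {
    await onJoined(channelInfo);
  }
  return channelInfo;
}
```

Injecting the callbacks keeps the ordering logic testable without a live channel and avoids hard-coding any request or SDK signature.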
The following sample code shows how to receive the transcript from the AI agent:
- Listen for the transcript event.
  aliRtcEngine.on('dataChannelMsg', (uid, msg) => {
    try {
      const dataChannelMsg = JSON.parse(new TextDecoder().decode(msg.data));
      if (dataChannelMsg.type === 1002) {
        // Received transcript from the AI agent. Data format:
        // text: The transcribed text.
        // sentenceId: The ID of the sentence. Used to group multiple results for the same sentence in a stream.
        // end: Indicates if this is the end of the current sentence.
        console.log('agentSubtitleNotify', dataChannelMsg.data);
      } else if (dataChannelMsg.type === 1003) {
        // Received transcript from the user.
        // text: The transcribed text.
        // sentenceId: The ID of the sentence. Used to group multiple results for the same sentence in a stream.
        // end: Indicates if this is the end of the current sentence.
        console.log('userSubtitleNotify', dataChannelMsg.data);
      } else {
        // Other types can be ignored.
      }
    } catch (error) {
      console.error(error);
    }
  });
- Before joining the channel, call the following method to receive transcripts.
  aliRtcEngine.setParameter(
    JSON.stringify({
      data: {
        enableSubDataChannel: true,
      },
    }),
  );
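Because several results can arrive for the same `sentenceId`, the agent client usually needs to group them before display. The following is a minimal sketch of one way to do that. `TranscriptAssembler` and `speakerForType` are hypothetical helpers, not SDK APIs. The sketch also assumes that each result carries the full text of the sentence so far, so a newer result replaces the older one; if your stream delivers incremental fragments instead, concatenate them.

```javascript
/**
 * Collects streamed transcript results into finished sentences.
 * Assumption (not confirmed by the SDK documentation): each result carries
 * the full text of the sentence so far, so the latest result wins.
 */
class TranscriptAssembler {
  constructor() {
    this.pending = new Map(); // `${speaker}:${sentenceId}` -> latest text
    this.finished = [];       // completed sentences in arrival order
  }

  /**
   * @param {'agent'|'user'} speaker
   * @param {{ text: string, sentenceId: number|string, end: boolean }} data
   * @returns {object|null} The finished sentence when `end` is true, otherwise null.
   */
  push(speaker, data) {
    const key = `${speaker}:${data.sentenceId}`;
    this.pending.set(key, data.text);
    if (!data.end) {
      return null;
    }
    const sentence = { speaker, sentenceId: data.sentenceId, text: this.pending.get(key) };
    this.pending.delete(key);
    this.finished.push(sentence);
    return sentence;
  }
}

// Map the data channel message types to a speaker label.
function speakerForType(type) {
  if (type === 1002) return 'agent'; // transcript from the AI agent
  if (type === 1003) return 'user';  // transcript from the user
  return null;                       // other types can be ignored
}
```

Inside the `dataChannelMsg` handler, you could call `speakerForType(dataChannelMsg.type)` and, when the result is not `null`, pass it together with `dataChannelMsg.data` to `push`.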
Advanced implementation
On-demand subscription for human agents
By default, a human agent subscribes to all users in the channel, including the end user and the AI agent. This means the human agent receives the user's audio and video, as well as the speech-to-text transcript from the AI agent.
If you do not need the AI agent's audio or transcript, use on-demand subscription to subscribe only to the user's audio and video streams.
- Before calling joinChannel, call methods such as setDefaultSubscribeAllRemoteAudioStreams to disable automatic subscription.
  // Disable automatic subscription to audio streams.
  aliRtcEngine.setDefaultSubscribeAllRemoteAudioStreams(false);
  // Disable automatic subscription to video and screen sharing streams.
  aliRtcEngine.setDefaultSubscribeAllRemoteVideoStreams(false);
- After disabling automatic subscription, use the subscribeRemoteMediaStream method to subscribe to a specific user.
  // Import the AliRtcVideoTrack enum by using npm.
  import { AliRtcVideoTrack } from 'aliyun-rtc-sdk';
  // Alternatively, import it from the window object.
  // const AliRtcVideoTrack = window.AliRtcEngine.AliRtcVideoTrack;

  // The ID of the user to subscribe to.
  const subscribeUserId = 'xxxx';

  // This example subscribes to a user by listening for the remoteUserOnLineNotify event.
  // In a real-world scenario, you might decide whether to subscribe based on your business logic.
  aliRtcEngine.on('remoteUserOnLineNotify', (userId) => {
    if (userId === subscribeUserId) {
      // Subscribe to all streams (video, audio, screen) of this user. No error occurs if the user is not publishing certain streams.
      // If the user starts publishing a new stream later, it will be subscribed to automatically.
      aliRtcEngine.subscribeRemoteMediaStream(
        userId,
        AliRtcVideoTrack.AliRtcVideoTrackBoth,
        true,
        true
      );
    }
  });

  // Handle cases where the user is already in the channel.
  if (aliRtcEngine.getOnlineRemoteUsers()?.includes(subscribeUserId)) {
    // Same subscription as above. Use subscribeUserId here, because userId
    // exists only inside the event callback.
    aliRtcEngine.subscribeRemoteMediaStream(
      subscribeUserId,
      AliRtcVideoTrack.AliRtcVideoTrackBoth,
      true,
      true
    );
  }
Inform users about the takeover process
Implement the onHumanTakeoverWillStart and onHumanTakeoverConnected callbacks to monitor the human agent's status during the takeover and give the user real-time feedback.
The client must use AICallKit SDK v1.5.0 or later.
The following sample code shows how to handle both callbacks:
Android
// Callback handling (example of core callback operations)
ARTCAICallEngine.IARTCAICallEngineCallback mCallEngineCallbackWrapper = new ARTCAICallEngine.IARTCAICallEngineCallback() {
    @Override
    public void onHumanTakeoverWillStart(String takeoverUid, int takeoverMode) {
        // A human agent is about to take over the current AI agent.
    }

    @Override
    public void onHumanTakeoverConnected(String takeoverUid) {
        // The human agent has successfully connected.
    }

    // Other callbacks.
    ...
};
iOS
extension AUIAICallStandardController: ARTCAICallEngineDelegate {
    public func onHumanTakeoverWillStart(takeoverUid: String, takeoverMode: Int) {
        // A human agent is about to take over the current AI agent.
        debugPrint("AUIAICallStandardController onHumanTakeoverWillStart:\(takeoverUid) , takeoverMode:\(takeoverMode)")
    }

    public func onHumanTakeoverConnected(takeoverUid: String) {
        // The human agent has successfully connected.
        debugPrint("AUIAICallStandardController onHumanTakeoverConnected:\(takeoverUid)")
    }

    // Other callbacks.
    ...
}
Web
engine.on('humanTakeoverWillStart', (uid: string, mode: number) => {
  // A human agent is about to take over the current AI agent.
  console.log('AICallHumanTakeoverWillStart', uid, mode);
});

engine.on('humanTakeoverConnected', (uid: string) => {
  // The human agent has successfully connected.
  console.log('AICallHumanTakeoverConnected', uid);
});
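To turn these two callbacks into user-facing feedback, one option is a small status tracker that your UI can observe. The sketch below is illustrative only: `TakeoverStatus` and `createTakeoverStatusTracker` are hypothetical names, not part of AICallKit. It never transitions back to the AI state, because a conversation cannot be returned to the AI agent after a human takes over.

```javascript
// Hypothetical UI states for the end user's client.
const TakeoverStatus = Object.freeze({
  AI: 'ai',                 // talking to the AI agent
  CONNECTING: 'connecting', // a human agent is about to take over
  HUMAN: 'human',           // the human agent has connected
});

/**
 * Track the takeover status and report each change once.
 * @param {(status: string, uid: string) => void} onChange - UI update hook.
 */
function createTakeoverStatusTracker(onChange) {
  let status = TakeoverStatus.AI;

  const setStatus = (next, uid) => {
    if (next === status) return; // ignore duplicate notifications
    status = next;
    onChange(status, uid);
  };

  return {
    getStatus: () => status,
    // Wire to the "takeover will start" callback.
    handleWillStart: (uid) => {
      // Only meaningful while the AI agent is still handling the call.
      if (status === TakeoverStatus.AI) {
        setStatus(TakeoverStatus.CONNECTING, uid);
      }
    },
    // Wire to the "takeover connected" callback.
    handleConnected: (uid) => setStatus(TakeoverStatus.HUMAN, uid),
  };
}
```

On the web, for example, you could call `handleWillStart(uid)` from the `humanTakeoverWillStart` listener and `handleConnected(uid)` from the `humanTakeoverConnected` listener, and have `onChange` display a message such as "Connecting you to an agent".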