Best practices for voice announcements


This topic provides best practices for voice announcement scenarios, such as payment confirmations and real-time activity notifications.

Prerequisites

Voice announcement methods and limits by operating system

System: Android

Method: Pass-through message + Text-to-Speech (TTS) synthesis

Limits:

  • This method works only with Alibaba Cloud proprietary channels. Third-party vendor channels are not supported.

  • This method works only when the device is online. If the device is offline, the announcement is delivered when the device reconnects. To ensure a good user experience, add a validity check and a playback interval for multiple messages that arrive at the same time.

Method: Alibaba Cloud proprietary channel notification + TTS speech synthesis

System: iOS

Method: Notification extension + audio splicing

Limits:

  • You must embed basic audio files, such as "payment received", "0" to "9", "CNY", and "point", in the bundle.

  • You must use an App Group to share data.

Method: Pass-through message + AVSpeechSynthesizer speech synthesis

Limits:

  • This method works only when the device is online. If the device is offline, the announcement is delivered when the device reconnects. To ensure a good user experience, add a validity check and a playback interval for multiple messages that arrive at the same time.

Method: Silent notification + AVSpeechSynthesizer speech synthesis (Not recommended)

Limits:

  • Frequency limit: Apple recommends sending no more than two or three silent pushes per hour. Otherwise, pushes may be throttled.

  • Running time: After the app is woken up in the background, it has only 30 seconds to process the task. If the timeout period is exceeded, the task is stopped.

  • Unreliable delivery: The system prioritizes user-visible notifications. Silent pushes may be delayed or discarded.

Note

For more information about the limits of silent notifications, see the official documentation for silent notifications.

System: HarmonyOS

Method: Notification extension + TTS speech synthesis

Android voice announcements

On Android, you can implement voice announcements by pushing notifications or messages through Alibaba Cloud proprietary channels. The client retrieves the text from the corresponding callback and uses the native Text-to-Speech (TTS) API to convert the text into speech for playback.

Method 1: Pass-through message + TTS speech synthesis

Server-side push parameter settings

When sending a pass-through message from the server, specify the Alibaba Cloud proprietary channel and pass through the voice announcement content:

PushRequest pushRequest = new PushRequest();
...
pushRequest.setSendChannels("accs");
pushRequest.setPushType("MESSAGE");
pushRequest.setBody("${voice_announcement_content}");
...

Client-side implementation of voice announcements

After the client receives the pass-through message, you can intercept the message in the callback to retrieve the voice announcement content. Then, use a TTS engine for the voice announcement. You can use the native TTS API or a third-party TTS engine. The steps are as follows:

1. Encapsulate and initialize the TTS engine
  • First, encapsulate the TTS engine and provide an initialization method and a voice announcement method:

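// Kotlin
import android.content.Context
import android.os.Build
import android.speech.tts.TextToSpeech
import java.util.Locale

object TTSManager {

    private var mTextToSpeech: TextToSpeech? = null

    fun init(context: Context) {
        // The four-argument speak() used below requires API level 21 or later.
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
            mTextToSpeech = TextToSpeech(context) {
                if (it == TextToSpeech.SUCCESS) {
                    val languageCode = mTextToSpeech?.setLanguage(Locale.CHINA)
                    if (languageCode == TextToSpeech.LANG_NOT_SUPPORTED || languageCode == TextToSpeech.LANG_MISSING_DATA) {
                        // The Chinese voice data is missing or not supported. Fall back to English.
                        mTextToSpeech?.language = Locale.US
                    }
                    mTextToSpeech?.setPitch(1.0f)
                    mTextToSpeech?.setSpeechRate(1.0f)
                }
            }
        }
    }

    fun speak(text: String) {
        // Stop the announcement in progress, if any, before playing the new one.
        if (mTextToSpeech?.isSpeaking == true) {
            mTextToSpeech?.stop()
        }
        // QUEUE_FLUSH drops queued entries. Use QUEUE_ADD to play announcements one after another.
        mTextToSpeech?.speak(text, TextToSpeech.QUEUE_FLUSH, null, "")
    }

}

// Java
import android.content.Context;
import android.os.Build;
import android.speech.tts.TextToSpeech;
import java.util.Locale;

public class TTSManager {

    private TextToSpeech mTextToSpeech;
    private TTSManager(){}

    private static class SingletonHolder{
        private static final TTSManager INSTANCE = new TTSManager();
    }

    public static TTSManager getInstance(){
        return SingletonHolder.INSTANCE;
    }

    public void init(Context context) {
        // The four-argument speak() used below requires API level 21 or later.
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
            mTextToSpeech = new TextToSpeech(context, status -> {
                if (status == TextToSpeech.SUCCESS) {
                    int result = mTextToSpeech.setLanguage(Locale.CHINA);
                    if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                        // The Chinese voice data is missing or not supported. Fall back to English.
                        mTextToSpeech.setLanguage(Locale.US);
                    }
                    mTextToSpeech.setPitch(1.0f);
                    mTextToSpeech.setSpeechRate(1.0f);
                }
            });
        }
    }

    public void speak(String text) {
        if (mTextToSpeech != null) {
            // Stop the announcement in progress, if any, before playing the new one.
            if (mTextToSpeech.isSpeaking()) {
                mTextToSpeech.stop();
            }
            // QUEUE_FLUSH drops queued entries. Use QUEUE_ADD to play announcements one after another.
            mTextToSpeech.speak(text, TextToSpeech.QUEUE_FLUSH, null, "");
        }
    }

}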
  • Call the TTS engine initialization method in the Application class:

// Kotlin
class MyApplication : Application() {
    override fun onCreate() {
        super.onCreate()
        // Initialize the TTS engine.
        TTSManager.init(this)
    }
}

// Java
public class MyApplication extends Application {
    @Override
    public void onCreate() {
        super.onCreate();
        // Initialize the TTS engine.
        TTSManager.getInstance().init(this);
    }
}
  • Register the MyApplication class in the AndroidManifest.xml file:

<application
    android:name=".MyApplication">
</application>
2. Make the voice announcement in the pass-through message callback

For more information about integrating MessageReceiver or AliyunMessageIntentService, see Message and Notification Processing Interfaces. For example, with MessageReceiver, you must obtain the voice announcement content from the pass-through message in the onMessage callback, and then call the voice announcement method of the TTS engine:

// Kotlin
class MyMessageReceiver: MessageReceiver() {
    override fun onMessage(context: Context?, cPushMessage: CPushMessage?) {
        cPushMessage?.let {
            TTSManager.speak(it.content)
        }
    }
}

// Java
public class MyMessageReceiver extends MessageReceiver {
    @Override
    protected void onMessage(Context context, CPushMessage cPushMessage) {
        if (cPushMessage != null) {
            TTSManager.getInstance().speak(cPushMessage.getContent());
        }
    }
}
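
The limits for pass-through messages recommend a validity check and a playback interval, because messages queued while the device was offline can arrive together after it reconnects. The following is a minimal sketch of that logic in plain Java, not part of the push SDK. The class name AnnouncementScheduler is hypothetical, and the sketch assumes that the server includes a send timestamp (sentAtMillis) with each message, which the samples in this topic do not do. The interval is measured from the moment the previous text was handed to the TTS engine, which is a simplification.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: skips stale announcements and spaces out playback. */
final class AnnouncementScheduler {
    private final long maxAgeMillis;   // Messages older than this are skipped.
    private final long minGapMillis;   // Minimum pause between two announcements.
    private final Deque<String> pending = new ArrayDeque<>();
    private long nextAllowedAtMillis = 0L;

    AnnouncementScheduler(long maxAgeMillis, long minGapMillis) {
        this.maxAgeMillis = maxAgeMillis;
        this.minGapMillis = minGapMillis;
    }

    /** Validity check: queues the text only if the message is still fresh. */
    boolean offer(String text, long sentAtMillis, long nowMillis) {
        if (text == null || text.isEmpty()) {
            return false;
        }
        if (nowMillis - sentAtMillis > maxAgeMillis) {
            return false; // Too old to be worth announcing.
        }
        pending.addLast(text);
        return true;
    }

    /** Returns the next text to speak, or null if nothing is due yet. */
    String poll(long nowMillis) {
        if (pending.isEmpty() || nowMillis < nextAllowedAtMillis) {
            return null;
        }
        nextAllowedAtMillis = nowMillis + minGapMillis;
        return pending.pollFirst();
    }
}
```

In the onMessage callback, the app could call offer with the parsed timestamp and then call poll from a periodic task, passing each returned text to the speak method of TTSManager.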

Method 2: Alibaba Cloud proprietary channel notification + TTS speech synthesis

Server-side push parameter settings

When sending a notification from the server, specify the Alibaba Cloud proprietary channel and use the AndroidExtParameters field to pass the voice announcement content:

PushRequest pushRequest = new PushRequest();
...
pushRequest.setSendChannels("accs");
pushRequest.setPushType("NOTICE");
pushRequest.setAndroidExtParameters("{\"ttsContent\":\"${voice_announcement_content}\"}");
...
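
The sample above embeds the announcement text directly in a JSON string literal. If the text can contain quotation marks, backslashes, or line breaks, the resulting JSON is invalid. The following sketch shows one way to escape the text first. ExtParametersBuilder and its methods are hypothetical helper names, not part of the push SDK, and a JSON library would serve equally well in a real project.

```java
/** Hypothetical helper that builds the ext-parameters JSON for a TTS announcement. */
final class ExtParametersBuilder {
    private ExtParametersBuilder() {}

    /** Escapes a string so that it can be placed inside a JSON string literal. */
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Other control characters must use the \ u XXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds {"ttsContent":"..."} with the content safely escaped. */
    static String ttsExtParameters(String content) {
        return "{\"ttsContent\":\"" + escapeJson(content) + "\"}";
    }
}
```

The result can then be passed to the setter used above, for example pushRequest.setAndroidExtParameters(ExtParametersBuilder.ttsExtParameters(text)).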

Client-side implementation of voice announcements

After the client receives the notification, you can intercept it in the callback to retrieve the voice announcement content. Then, use a TTS engine for the voice announcement. You can use the native TTS API or a third-party TTS engine. The steps are as follows:

1. Encapsulate and initialize the TTS engine

For the steps, see step 1, "Encapsulate and initialize the TTS engine", in Method 1.

2. Make the voice announcement in the notification callback

For more information about integrating MessageReceiver or AliyunMessageIntentService, see Message and Notification Processing Interfaces. For example, with MessageReceiver, obtain the voice announcement content from the extra parameters of the notification in the onNotification callback, and then call the voice announcement method of the TTS engine:

// Kotlin
class MyMessageReceiver: MessageReceiver() {
    override fun onNotification(
        context: Context?,
        title: String?,
        content: String?,
        extra: MutableMap<String, String>?
    ) {
        // Announce the text only if the ttsContent key is present and not empty.
        val ttsContent = extra?.get("ttsContent")
        if (!ttsContent.isNullOrEmpty()) {
            TTSManager.speak(ttsContent)
        }
    }
}

// Java
public class MyMessageReceiver extends MessageReceiver {
    @Override
    protected void onNotification(Context context, String title, String content, Map<String, String> map) {
        if (map != null && map.containsKey("ttsContent")){
            String ttsContent = map.get("ttsContent");
            if (!TextUtils.isEmpty(ttsContent)) {
                TTSManager.getInstance().speak(ttsContent);
            }
        }
    }
}

iOS voice announcements

On iOS, you can implement the voice announcement feature in three ways: notification extensions, pass-through messages, or silent notifications (not recommended).

Method 1: Notification extension + audio splicing

Server-side push parameter settings

When sending a notification from the server, use the iOSExtParameters field to pass the voice announcement content and set iOSMutableContent to true:

PushRequest pushRequest = new PushRequest();
...
pushRequest.setPushType("NOTICE");
pushRequest.setIOSExtParameters("{\"playVoiceText\":\"${voice_announcement_content}\"}");
pushRequest.setIOSMutableContent(true);
...

Client-side implementation of voice announcements

The client must embed basic audio files in the bundle. When a notification is received, you can intercept it in the notification extension callback to retrieve the voice announcement content. Then, splice the corresponding audio files based on the content and output the result to the App Group shared directory. To make the voice announcement, set the sound identifier for the push to the spliced audio file. The steps are as follows:

1. Integrate the Notification Service Extension
Note

The Notification Service Extension is a feature introduced in iOS 10.0.

  • Open Xcode and choose File -> New -> Target -> Notification Service Extension from the menu.

  • Enter a name and click the Finish button to create the extension.

  • A NotificationService.m file is automatically generated after the extension is created.

2. Set up the App Group

To set up the App Group, see Configuring app groups. The steps are as follows:

  • Open Xcode and go to Project -> Targets -> Signing & Capabilities. Then, click + Capability in the upper-left corner.

  • In the Capabilities section, search for and add App Groups.

  • In the App Groups list, click the + button, enter a name for your App Group, and then click OK.

3. Embed audio files in the bundle

You must embed the basic audio files for playback in the bundle, such as "payment received", "0" to "9", "CNY", and "point".

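Open Xcode, select the Notification Service Extension target, and go to Build Phases -> Copy Bundle Resources. Click the + button, click Add Other, and then select the audio files. The files must belong to the extension target because the splicing code runs inside the extension, where [NSBundle mainBundle] refers to the extension's bundle.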

4. Splice audio files based on the voice announcement content

In the following sample code, the makeMp3FromExt method splices files from the bundle based on the input number. For example, if the input parameter cnt is 15, the method splices "1.mp3" and "5.mp3" from the bundle and writes the resulting audio file to the App Group shared directory.

#import "ApnsHelper.h"

static NSString * const GroupName = @"group.com.example.mygroup"; // Replace with your App Group identifier.

@implementation ApnsHelper

+ (NSString *)makeMp3FromExt:(double)cnt {
    NSURL *containerURL = [[NSFileManager defaultManager] containerURLForSecurityApplicationGroupIdentifier:GroupName];
    // Use the file system path of the container URL. absoluteString is percent-encoded and is not a valid path.
    NSString *basePath = [[containerURL path] stringByAppendingPathComponent:@"Library/Sounds/"];
    return [self mergeVoiceWithLibPath:basePath count:cnt];
}

+ (NSString *)mergeVoiceWithLibPath:(NSString *)libPath count:(double)cnt {
    [self clearFiles:libPath];

    NSMutableArray *nums = [NSMutableArray array];
    int tmp = (int)cnt;
    while (tmp > 0) {
        [nums insertObject:[NSString stringWithFormat:@"%d", tmp % 10] atIndex:0];
        tmp /= 10;
    }

    NSMutableData *mergeData = [NSMutableData data];
    for (NSString *num in nums) {
        NSURL *mp3Url = [[NSBundle mainBundle] URLForResource:num withExtension:@"mp3"];
        if (mp3Url) {
            NSData *data = [NSData dataWithContentsOfURL:mp3Url];
            if (data) {
                [mergeData appendData:data];
            }
        }
    }

    if ([mergeData length] == 0) {
        return @"";
    }

    if (![[NSFileManager defaultManager] fileExistsAtPath:libPath]) {
        NSError *error = nil;
        [[NSFileManager defaultManager] createDirectoryAtPath:libPath withIntermediateDirectories:YES attributes:nil error:&error];
        if (error) {
            NSLog(@"Failed to create the Sounds file: %@", libPath);
        }
    }

    // Name the file after a 64-bit millisecond timestamp. A 32-bit int cannot hold this value.
    NSString *fileName = [NSString stringWithFormat:@"%lld.mp3", [self now]];
    NSURL *fileUrl = [NSURL fileURLWithPath:[libPath stringByAppendingPathComponent:fileName]];
    NSError *writeError = nil;
    if (![mergeData writeToURL:fileUrl options:NSDataWritingAtomic error:&writeError]) {
        NSLog(@"Failed to synthesize the mp3 file: %@, error: %@", fileUrl, writeError);
        return @"";
    }
    return fileName;
}

// Delete spliced files that are more than 12 hours old.
+ (void)clearFiles:(NSString *)libPath {
    BOOL isDir = NO;
    if ([[NSFileManager defaultManager] fileExistsAtPath:libPath isDirectory:&isDir] && isDir) {
        NSError *error = nil;
        NSArray *list = [[NSFileManager defaultManager] contentsOfDirectoryAtPath:libPath error:&error];
        if (error) {
            NSLog(@"Failed to get directory content: %@", error.localizedDescription);
            return;
        }
        long long before = [self now] - 12LL * 60 * 60 * 1000; // 12 hours ago
        for (NSString *file in list) {
            // The file name without the extension is the creation time in milliseconds.
            NSString *timeStr = [file stringByReplacingOccurrencesOfString:@".mp3" withString:@""];
            long long time = [timeStr longLongValue];
            if (time < before) {
                NSURL *fileUrl = [NSURL fileURLWithPath:[libPath stringByAppendingPathComponent:file]];
                NSError *removeError = nil;
                [[NSFileManager defaultManager] removeItemAtURL:fileUrl error:&removeError];
                if (removeError) {
                    NSLog(@"Failed to delete the expired mp3 file.");
                }
            }
        }
    }
}

// Current time in milliseconds since 1970.
+ (long long)now {
    return (long long)([[NSDate date] timeIntervalSince1970] * 1000);
}

@end
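
The Objective-C sample above handles only the integer part of the amount, and its digit loop produces no clips for the value 0. The following sketch illustrates how a full amount string could be mapped to an ordered list of clips before splicing. It is written in Java only to stay consistent with the server-side samples in this topic and is not iOS code. The class name AudioClipPlanner and the clip names payment_received, point, and cny are assumptions that stand for the embedded audio files.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: maps an amount such as "15.5" to an ordered list of clip names. */
final class AudioClipPlanner {
    private AudioClipPlanner() {}

    static List<String> clipsFor(String amount) {
        List<String> clips = new ArrayList<>();
        clips.add("payment_received");            // Leading phrase.
        for (char c : amount.toCharArray()) {
            if (c >= '0' && c <= '9') {
                clips.add(String.valueOf(c));     // One clip per digit, read digit by digit.
            } else if (c == '.') {
                clips.add("point");               // Decimal separator.
            } else {
                throw new IllegalArgumentException("Unsupported character: " + c);
            }
        }
        clips.add("cny");                         // Trailing currency unit.
        return clips;
    }
}
```

With this mapping, the input "15.5" yields the clips payment_received, 1, 5, point, 5, and cny, which would then be concatenated in that order.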
5. Make the voice announcement in the notification extension callback

In the didReceiveNotificationRequest callback of NotificationService.m, you can intercept the notification to retrieve the voice announcement content. Call the makeMp3FromExt audio splicing method. Then, set the sound identifier for the push to the spliced audio file to make the voice announcement:

@implementation NotificationService

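- (void)didReceiveNotificationRequest:(UNNotificationRequest *)request 
                   withContentHandler:(void (^)(UNNotificationContent * _Nonnull))contentHandler {
    // Keep the handler and a mutable copy of the content, as in the Xcode template.
    // Both properties are assumed to be declared in the class extension that Xcode generates.
    self.contentHandler = contentHandler;
    self.bestAttemptContent = [request.content mutableCopy];

    // Get the content to be announced.
    NSString *text = self.bestAttemptContent.userInfo[@"playVoiceText"];
    double cnt = [text doubleValue];
    NSString *soundName = [ApnsHelper makeMp3FromExt:cnt];
    // Replace the sound only if splicing produced a file. Otherwise, keep the original sound.
    if (soundName.length > 0) {
        self.bestAttemptContent.sound = [UNNotificationSound soundNamed:soundName];
    }
    self.contentHandler(self.bestAttemptContent);
}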

@end

Method 2: Pass-through message + AVSpeechSynthesizer speech synthesis

Server-side push parameter settings

When sending a pass-through message from the server, pass through the voice announcement content:

PushRequest pushRequest = new PushRequest();
...
pushRequest.setPushType("MESSAGE");
pushRequest.setBody("${voice_announcement_content}");
...

Client-side implementation of voice announcements

After the client receives the pass-through message, you can intercept the message in the callback to retrieve the voice announcement content. Then, use AVSpeechSynthesizer to make the voice announcement. The steps are as follows:

1. Encapsulate AVSpeechSynthesizer
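#import <AVFoundation/AVFoundation.h>
#import "Tool.h" // Assumed header that declares Tool, +sharedManager, and -enqueueTextForSpeech:.

@interface Tool()<AVSpeechSynthesizerDelegate>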

@property (nonatomic, strong) AVSpeechSynthesizer *synthesizer;
@property (nonatomic, strong) NSMutableArray<NSString *> *textQueue;

@end

@implementation Tool

+ (instancetype)sharedManager {
    static Tool *sharedInstance = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        sharedInstance = [[self alloc] init];
    });
    return sharedInstance;
}

- (instancetype)init {
    self = [super init];
    if (self) {
        _synthesizer = [[AVSpeechSynthesizer alloc] init];
        _synthesizer.delegate = self;
        _textQueue = [NSMutableArray array];
    }
    return self;
}

- (void)enqueueTextForSpeech:(NSString *)text {
    [self.textQueue addObject:text];
    [self playNextTextIfAvailable];
}

- (void)playNextTextIfAvailable {
    if (!self.synthesizer.isSpeaking && self.textQueue.count > 0) {
        NSString *nextText = [self.textQueue firstObject];
        [self.textQueue removeObjectAtIndex:0];

        AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:nextText];
        utterance.voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"zh-CN"];
        utterance.rate = 0.5f;
        utterance.pitchMultiplier = 1.0;

        [self.synthesizer speakUtterance:utterance];
    }
}

#pragma mark - AVSpeechSynthesizerDelegate

- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer didFinishSpeechUtterance:(AVSpeechUtterance *)utterance {
    [self playNextTextIfAvailable];
}

@end
2. Make the voice announcement in the pass-through message callback

For more information, see Message handling interfaces. In the onMessageReceived callback, retrieve the voice announcement content from the pass-through message, and then call the voice announcement method of AVSpeechSynthesizer:

#pragma mark Receive Message
/**
 *    @brief    Register a listener for incoming push messages.
 */
- (void)registerMessageReceive {
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(onMessageReceived:)
                                                 name:@"CCPDidReceiveMessageNotification"
                                               object:nil];
}

/**
 *    Handle incoming push messages.
 */
- (void)onMessageReceived:(NSNotification *)notification {
    NSLog(@"Receive one message!");

    CCPSysMessage *message = [notification object];
    NSString *title = [[NSString alloc] initWithData:message.title encoding:NSUTF8StringEncoding];
    NSString *body = [[NSString alloc] initWithData:message.body encoding:NSUTF8StringEncoding];
    NSLog(@"Receive message title: %@, content: %@.", title, body);
    
    [[Tool sharedManager] enqueueTextForSpeech:body];
}

Method 3: Silent notification + AVSpeechSynthesizer speech synthesis (Not recommended)

Server-side push parameter settings

When sending a silent notification from the server, use the iOSExtParameters field to pass the voice announcement content and set iOSSilentNotification to true:

PushRequest pushRequest = new PushRequest();
...
pushRequest.setPushType("NOTICE");
pushRequest.setIOSExtParameters("{\"playVoiceText\":\"${voice_announcement_content}\"}");
pushRequest.setIOSSilentNotification(true);
...
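
Because Apple recommends no more than two or three silent pushes per hour, the server may want to throttle silent notifications per device before it sends them. The following is a minimal in-memory sketch of a sliding one-hour window. SilentPushThrottle is a hypothetical class, not part of the push SDK, and a deployment with several server instances would need shared storage instead of a local map.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-device limiter for silent pushes, using a sliding one-hour window. */
final class SilentPushThrottle {
    private static final long WINDOW_MILLIS = 60L * 60L * 1000L;
    private final int maxPerHour;
    private final Map<String, Deque<Long>> sentByDevice = new HashMap<>();

    SilentPushThrottle(int maxPerHour) {
        this.maxPerHour = maxPerHour;
    }

    /** Returns true and records the send if the device is still under its hourly limit. */
    synchronized boolean tryAcquire(String deviceId, long nowMillis) {
        Deque<Long> sent = sentByDevice.computeIfAbsent(deviceId, k -> new ArrayDeque<>());
        // Forget sends that fell out of the one-hour window.
        while (!sent.isEmpty() && nowMillis - sent.peekFirst() >= WINDOW_MILLIS) {
            sent.pollFirst();
        }
        if (sent.size() >= maxPerHour) {
            return false;
        }
        sent.addLast(nowMillis);
        return true;
    }
}
```

A caller could check tryAcquire(deviceId, System.currentTimeMillis()) before it builds the silent notification, and fall back to one of the other methods when the call returns false.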

Client-side implementation of voice announcements

After the client receives the silent notification, you can intercept it in the callback to retrieve the voice announcement content. Then, use AVSpeechSynthesizer to make the voice announcement. The steps are as follows:

1. Select Remote notifications

For more information about adding Background Modes and selecting Remote notifications, see iOS silent notifications.

2. Encapsulate AVSpeechSynthesizer

For the steps, see step 1, "Encapsulate AVSpeechSynthesizer", in Method 2.

3. Make the voice announcement in the silent notification callback

In the didReceiveRemoteNotification callback, retrieve the voice announcement content from the silent notification, and then call the voice announcement method of AVSpeechSynthesizer:

@implementation AppDelegate

/// Silent notification callback method.
- (void)application:(UIApplication *)application didReceiveRemoteNotification:(NSDictionary *)userInfo fetchCompletionHandler:(void (^)(UIBackgroundFetchResult))completionHandler {
    NSLog(@"Receive one notification.");

    NSString *text = userInfo[@"playVoiceText"];
    if (text && text.length > 0) {
        NSLog(@"Content to be announced: %@", text);
        [[Tool sharedManager] enqueueTextForSpeech:text];
    }

    completionHandler(UIBackgroundFetchResultNewData);
}

@end

HarmonyOS voice announcements

Method 1: Notification extension + TTS speech synthesis

On HarmonyOS, you can use notification extension messages to deliver voice announcement content and the native Text-to-Speech (TTS) API on the client to synthesize it into speech.

Server-side push parameter settings

When sending a notification from the server, specify the voice announcement content in the HarmonyExtensionExtraData field and set HarmonyExtensionPush to true:

PushRequest pushRequest = new PushRequest();
...
pushRequest.setPushType("NOTICE");
pushRequest.setHarmonyExtensionExtraData("${voice_announcement_content}");
pushRequest.setHarmonyExtensionPush(true);
...

Client-side implementation of voice announcements

After the client receives a notification, intercept it in the notification extension message callback. In the callback, retrieve the voice announcement content. Then, use TTS to play the announcement. The steps are as follows:

1. Create a TTS engine instance

Create a Text-to-Speech (TTS) engine instance. For more information, see Text to speech.

import { textToSpeech } from '@kit.CoreSpeechKit';
import { BusinessError } from '@kit.BasicServicesKit';

let ttsEngine: textToSpeech.TextToSpeechEngine;

// Set the parameters for creating the engine.
let extraParam: Record<string, Object> = {"style": 'interaction-broadcast', "locate": 'CN', "name": 'EngineName'};
let initParamsInfo: textToSpeech.CreateEngineParams = {
  language: 'zh-CN',
  person: 0,
  online: 1,
  extraParams: extraParam
};

// Call the createEngine method.
textToSpeech.createEngine(initParamsInfo, (err: BusinessError, textToSpeechEngine: textToSpeech.TextToSpeechEngine) => {
  if (!err) {
    console.info('Succeeded in creating engine');
    // Receive the instance of the created engine.
    ttsEngine = textToSpeechEngine;
  } else {
    console.error(`Failed to create engine. Code: ${err.code}, message: ${err.message}.`);
  }
});
2. Make the voice announcement in the notification extension message callback

For more information, see Notification extension messages. In the notification extension message callback, parse the message to obtain an instance of the ExtensionNotification class. The extensionExtraData field in this instance contains the voice announcement content from the server-side HarmonyExtensionExtraData parameter. Then, call the voice announcement API of the Text-to-Speech (TTS) engine:

// Set the announcement-related parameters.
let extraParam: Record<string, Object> = {"queueMode": 0, "speed": 1, "volume": 2, "pitch": 1, "languageContext": 'zh-CN',  
"audioType": "pcm", "soundChannel": 3, "playType": 1 };
let speakParams: textToSpeech.SpeakParams = {
  requestId: '123456', // The requestId can be used only once within the same instance. Do not set it repeatedly.
  extraParams: extraParam
};

// Call the announcement method.
// Developers can actively set the announcement policy by modifying speakParams.
// Assume that extensionNotification is an instance of the ExtensionNotification class parsed in the notification extension message callback.
ttsEngine.speak(extensionNotification.extensionExtraData, speakParams);