This guide helps you get started with the speech services.
1. Register an account
Go to the Alibaba Cloud website. In the upper-right corner, click Register Now and follow the instructions to complete your account registration.
2. Activate the service
- Sign in to the Alibaba Cloud website.
- In the upper-right corner, click Console.
- In the search box at the top of the console, enter Intelligent Speech Interaction and select it from the search results to go to the product page.
- If a message at the top of the page indicates that you have not activated the Intelligent Speech Interaction service, click Go to Activation to open the service activation page.
- On the product activation page, select a service type for each service. The page includes services such as Audio File Recognition, Audio File Recognition (Off-peak), Audio File Recognition (Express), Sound Event Detection, Language Identification, Gender Recognition, and Speaker Recognition.
  - Trial: New users who select Trial for all services receive a three-month free trial. The trial version is limited to two concurrent streams.
  - Commercial: After activation, you are billed on a pay-as-you-go basis. Intelligent Speech Interaction deducts charges from your Alibaba Cloud account balance based on your actual usage. Some services, such as Audio File Recognition (Off-peak) and Audio File Recognition (Express Edition), do not offer a trial and are available only as a commercial option.
- Select the check box to agree to the terms of service at the bottom of the page, and then click Activate Now.
3. Billing
Trial
After you select and activate the trial services on the product activation page, you receive a three-month free trial. No charges are incurred while the service status is Free Trial Edition. To view the service status, go to the Service Management and Activation page in the Intelligent Speech Interaction console and click the Speech Synthesis tab. After activation, the speech synthesis service defaults to the Free Trial Edition, which offers unlimited usage with a concurrency limit of two requests. To switch to paid usage, click Upgrade to Commercial Edition.
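Because the trial allows only two concurrent requests, a client may want to cap its own concurrency rather than rely on the service to reject the excess. The following is a minimal sketch of one way to do that with an `asyncio.Semaphore`. The names `run_with_limit` and `fake_request` are hypothetical, and `fake_request` only sleeps; it does not call any Intelligent Speech Interaction API.

```python
import asyncio

# Concurrency limit of the trial edition, as stated above.
TRIAL_CONCURRENCY_LIMIT = 2


async def run_with_limit(jobs, limit=TRIAL_CONCURRENCY_LIMIT):
    """Run coroutine factories with at most `limit` of them in flight.

    Returns the results in submission order and the peak number of
    jobs that were running at the same time.
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:  # waits here while `limit` jobs are running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_request(i):
    # Hypothetical placeholder for a real speech request.
    await asyncio.sleep(0.01)
    return f"result-{i}"


def demo():
    # Five queued requests, never more than two running at once.
    jobs = [lambda i=i: fake_request(i) for i in range(5)]
    return asyncio.run(run_with_limit(jobs))
```

Calling `demo()` returns the five results together with the observed peak concurrency, which stays at the configured limit.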
Commercial
On the product activation page, if you select and activate a commercial service or click Upgrade to Commercial Edition, the service status changes to Commercial Edition and billing begins. You will be charged for using speech recognition and speech synthesis in the console, and for API calls.
Intelligent Speech Interaction bills you daily based on actual usage, with charges deducted from your Alibaba Cloud account balance.
Resource plan deduction rules
If you have purchased a resource plan, you can use it directly in the Intelligent Speech Interaction console. For more information about resource plan pricing, see Billing.
On the product specifications page, the Audio File Recognition service offers four resource plan options: 40 hours (from ¥100.00/year), 1,000 hours (from ¥1,200.00/year), 20,000 hours (from ¥20,000.00/year), and 100,000 hours (from ¥90,000.00/year). Click Buy Now to make a purchase.
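To compare these plans, it can help to work out the effective price per hour. The sketch below uses only the starting prices quoted above. The helper names are hypothetical, and actual prices may differ because the listed figures are "from" prices.

```python
# Starting prices quoted above: (hours included, price in CNY per year).
PLANS = [
    (40, 100.00),
    (1_000, 1_200.00),
    (20_000, 20_000.00),
    (100_000, 90_000.00),
]


def price_per_hour(hours, price):
    """Effective price in CNY per hour of recognized audio."""
    return price / hours


def cheapest_plan_covering(expected_hours, plans=PLANS):
    """Return the lowest-priced plan that covers the expected usage.

    Returns None when no single plan is large enough.
    """
    candidates = [plan for plan in plans if plan[0] >= expected_hours]
    return min(candidates, key=lambda plan: plan[1]) if candidates else None


if __name__ == "__main__":
    for hours, price in PLANS:
        print(f"{hours:>7} h: {price_per_hour(hours, price):.2f} CNY/hour")
```

At these starting prices, the per-hour price falls from ¥2.50 for the 40-hour plan to ¥0.90 for the 100,000-hour plan.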
How to use a resource plan:
- New users: When your service is in the Free Trial Edition, usage is covered by the free trial first and does not consume your resource plan.
- Existing users: After the Free Trial Edition expires, the service status shows Trial Expired. To continue using the service, click Upgrade to Commercial Edition in the Actions column for the corresponding service. After the upgrade, usage is deducted from your resource plan first.
- When a resource plan expires or is depleted: If you continue to use the Commercial Edition, Intelligent Speech Interaction switches to the pay-as-you-go billing method and charges your Alibaba Cloud account for your actual usage.
For example, if the status of the Audio File Recognition service is Commercial Edition and the usage limit is Unlimited, no resource plan is active and you are billed on a pay-as-you-go basis. If the status of Audio File Recognition (Express Edition) is Commercial Edition and the usage limit is 40/40 hours, 40 hours remain in your resource plan. Usage is deducted from the resource plan first. After the plan is depleted, billing switches to pay-as-you-go.
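The deduction order described above (free trial first, then resource plan, then pay-as-you-go) can be sketched as a small function. This is an illustration of the rules as stated in this guide, not the actual billing implementation. The function name and its parameters are hypothetical.

```python
def split_usage(status, usage_hours, plan_hours_remaining=0.0):
    """Split usage into (free_trial, resource_plan, pay_as_you_go) hours.

    `status` is the service status shown in the console.
    """
    if status == "Free Trial Edition":
        # Trial usage is free and does not consume the resource plan.
        return usage_hours, 0.0, 0.0
    if status == "Commercial Edition":
        # The resource plan is used first; any overflow is pay-as-you-go.
        from_plan = min(usage_hours, plan_hours_remaining)
        return 0.0, from_plan, usage_hours - from_plan
    # For example Trial Expired: upgrade before using the service again.
    raise ValueError(f"service not usable in status: {status}")
```

For instance, 50 hours of usage on a Commercial Edition service with 40 plan hours remaining splits into 40 hours from the plan and 10 hours billed as pay-as-you-go.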
Disable the service
If you want to stop using a service and avoid further charges, click Disable Commercial Edition in the Actions column for that service. This action switches the service back to Trial status, and no further fees will be incurred.
Note: Reverting from the Commercial Edition to the Free Trial Edition immediately disables the production service.
4. Use the console
- Create a project
  - Log on to the Intelligent Speech Interaction console.
  - The Overview page opens by default. In the left-side navigation pane, click All Projects.
  - On the My Projects page, click Create Project. In the dialog box that appears, set a Project Name (for example, Test001), select a Project Type (options include Speech Recognition + Speech Synthesis + Speech Analysis, Speech Recognition Only, Speech Synthesis Only, or On-device Solution), enter a Project Scenario Description, and then click OK.
- Configure a project
  On the All Projects page of the Intelligent Speech Interaction console, click Configure Project Features in the Actions column for your project.
  - Speech Recognition (Speech-to-Text)
    In the Speech Recognition section, click Configure. On the model selection page, select a language and a target model, such as Chinese Mandarin (Shiyinshi V1 - End-to-End Model). Click the microphone button in the lower-right corner of the test panel on the right to run a speech test. After you confirm the results, click Confirm and Use to complete the model configuration.
  - Speech Synthesis (Text-to-Speech)
    In the Speech Synthesis section, click Configure. On the Select a Speech Synthesis Model page, select a category from the voice category list on the left (such as General, Customer Service, or Livestreaming), and then select a target voice from the voice grid on the right (such as Aiqi). Adjust the basic parameters (Speech Rate, Pitch, Volume, Sample Rate, and Format) as needed. In the test text input area, enter text, and then click the speaker icon in the lower-right corner to preview the audio, or click the download icon to save the audio to your local computer. After you confirm that the settings are correct, click Confirm to complete the configuration.
5. Free trial features
The following services are available only in the Commercial Edition and do not offer a free trial: long text speech synthesis, audio file recognition (off-peak), audio file recognition (express), streaming text-to-speech (CosyVoice), and VoiceChat. To use these services, go to the Service Management page and upgrade the required services to the Commercial Edition.
| Service | Trial benefits | Actions |
| --- | --- | --- |
| Speech recognition | | Upgrade microphone and audio file services to the Commercial Edition |
| Speech synthesis | Synthesis and download: unlimited daily use. | Upgrade speech synthesis to the Commercial Edition |
6. View billing details
- Sign in to the Alibaba Cloud website.
- In the upper-right corner, click Console.
- In the console, click Billing in the top menu bar.
- In the left-side navigation pane, choose Billing Management > Bill Details. You can view your spending details on the Transactions, Detailed Bills, Usage Details, and Product Pricing Summary tabs.
- (Optional) View the cost details for a specific service.
  - Select the Detailed Bills tab.
  - For Statistic Item, select Billing Item. For Billing Cycle, select Details. You can then view the cost details for specific services by using the Unit Price Factor and Billing Factor fields.
- (Optional) Export a bill.
  Select the Detailed Bills tab, click Export Bill (CSV), and choose to export all bill content or only the filtered content. Enter the verification code and click Confirm. You can open the exported CSV file in Excel to view detailed fields such as Unit Price Factor and Billing Formula.
- (Optional) Defer service suspension.
  Services are suspended if your Alibaba Cloud account balance is insufficient. To prevent immediate suspension, enable the Overdue Payment Service on the Alibaba Cloud Billing home page to delay suspension for up to seven days. At the bottom of the Billing Management home page, turn on the Available Credit Alert and Enable Service Suspension Deferral switches, and set the Deferral Quota (the default is ¥10.00).
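
As an alternative to Excel, an exported bill can be summarized with a short script. The sketch below totals charges per billing item. The column names `Billing Item` and `Amount` are assumptions made for illustration, and the sample rows are made up. Check the header row of your own export and adjust the names accordingly.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def total_by_billing_item(csv_text, item_col="Billing Item", amount_col="Amount"):
    """Sum the amount column per billing item.

    Both column names are assumed; they are not taken from a real export.
    """
    totals = defaultdict(Decimal)  # missing keys start at Decimal("0")
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[item_col]] += Decimal(row[amount_col])
    return dict(totals)


# Made-up sample rows, for illustration only.
SAMPLE = """Billing Item,Amount
Audio File Recognition,1.20
Speech Synthesis,0.30
Audio File Recognition,2.50
"""

if __name__ == "__main__":
    for item, total in total_by_billing_item(SAMPLE).items():
        print(f"{item}: {total}")
```

`Decimal` is used instead of `float` so that currency amounts add up without binary rounding errors. To process a real export, read the file contents into a string and pass it to `total_by_billing_item`.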