The growing demand for generative AI and real-time data analytics requires a high-performance, low-cost, and fully managed retrieval service. Alibaba Cloud Elasticsearch Serverless retrieval-enhanced applications (version 8.17) are built on Elasticsearch version 8.17. As a fully managed service, they combine a serverless architecture with layered scaling capabilities to support common scenarios, such as information retrieval, vector search, and semantic analysis. This topic describes the latest features, billing details, service activation, and application creation for retrieval-enhanced applications (version 8.17).
Commercialization notice
Retrieval-enhanced applications (version 8.17) will become a paid service at 18:00 on June 27, 2025 (UTC+8). After this time, charges will apply for using the service.
If you no longer need the service, disable it before commercial billing starts to avoid charges. If you have any questions, submit a ticket or contact technical support through the DingTalk user group (Group ID: 72335013004).
Latest features
Retrieval-enhanced applications (version 8.17) improve on the previous General-purpose retrieval applications (version 7.10) in three areas: features, elastic performance, and cost efficiency.
New features: Retrieval-enhanced applications (version 8.17) are fully optimized for vector search scenarios and support features such as sparse vectors and dense vectors. For more information, see Appendix 1: Supported open source APIs.
Elastic performance: The upgraded architecture schedules elastic resources more flexibly within the application quota and delivers significantly faster request response times.
Cost optimization: Compared with the pay-as-you-go model of version 7.10, version 8.17 introduces a billing method that combines reserved fixed quotas with on-demand elastic calls. You can flexibly combine these two modes to use resources more efficiently and reduce your overall costs.
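To make the vector search support mentioned under "New features" more concrete, the following is a minimal sketch of request bodies that use the open source Elasticsearch 8.x `dense_vector` and `sparse_vector` field types and the top-level `knn` search option. The field names (`title`, `title_dense`, `title_sparse`), the dimension, and the similarity metric are illustrative assumptions and do not come from this topic. Check Appendix 1 to confirm which APIs and options a retrieval-enhanced application actually accepts.

```python
import json
from typing import Sequence


def build_vector_mapping(dims: int) -> dict:
    """Build an index-creation body with one dense and one sparse vector field.

    Field names are hypothetical; the field types follow open source
    Elasticsearch 8.x conventions.
    """
    if dims <= 0:
        raise ValueError("dims must be positive")
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "title_dense": {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",  # assumed metric
                },
                "title_sparse": {"type": "sparse_vector"},
            }
        }
    }


def build_knn_search(query_vector: Sequence[float], k: int = 10,
                     num_candidates: int = 100) -> dict:
    """Build a search body that runs an approximate kNN query on the dense field."""
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "knn": {
            "field": "title_dense",
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        },
        "_source": ["title"],
    }


if __name__ == "__main__":
    # Print the bodies so they can be pasted into the Kibana Dev Tools console.
    print(json.dumps(build_vector_mapping(dims=3), indent=2))
    print(json.dumps(build_knn_search([0.1, 0.2, 0.3], k=2, num_candidates=10), indent=2))
```

The sketch only builds JSON bodies and sends nothing over the network, so it can be adapted to whichever client or console you use.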
Billing details
Retrieval-enhanced applications (version 8.17) are billed based on reserved fixed Compute Unit (CU) quotas, on-demand elastic CUs, and storage space consumption. For more information about billing, see Billing details.
CU selection reference
You can select specifications such as 2 CU, 4 CU, 6 CU, and 8 CU. Select a specification based on your maximum CU usage.
Retrieval-enhanced applications (version 8.17) use a read/write splitting architecture. By default, query and write CUs are allocated in a 1:1 ratio. After you enable the elastic computing feature, the maximum query CUs and the maximum write CUs are each 1.5 times the fixed CU quota. The fixed CU quota determines the application's usage limits, as shown in the following table. Adjust the quota based on your actual usage.
The unit for the CU limit is CU/s.
Specification | Elastic computing feature | Elastic CU limit | Total application CU limit | Query CU limit | Write CU limit |
Fixed CU quota = X CU | Enabled | 2X | 3X | 1.5X | 1.5X |
Fixed CU quota = Y CU | Disabled | 0 | Y | 0.5Y | 0.5Y |
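The table above can be expressed as a small calculation. The following sketch is an illustration derived only from the table rows: the elastic limit is twice the fixed quota when the elastic computing feature is enabled, and the total is split 1:1 between queries and writes. The names `CuLimits` and `cu_limits` are hypothetical helpers, not part of any Alibaba Cloud SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CuLimits:
    """CU limits in CU/s, mirroring the columns of the table."""
    elastic: float
    total: float
    query: float
    write: float


def cu_limits(fixed_quota: float, elastic_enabled: bool) -> CuLimits:
    """Derive the limits for a fixed CU quota, following the table rows."""
    if fixed_quota <= 0:
        raise ValueError("fixed_quota must be positive")
    # Elastic CUs add up to twice the fixed quota when the feature is enabled.
    elastic = 2 * fixed_quota if elastic_enabled else 0
    total = fixed_quota + elastic
    # Query and write CUs share the total in a 1:1 ratio.
    query = write = total / 2
    return CuLimits(elastic=elastic, total=total, query=query, write=write)


if __name__ == "__main__":
    for quota in (2, 4, 6, 8):
        print(quota, cu_limits(quota, True), cu_limits(quota, False))
```

For example, a 4 CU fixed quota yields a total limit of 12 CU/s (6 for queries, 6 for writes) with the elastic computing feature enabled, and 4 CU/s (2 and 2) with it disabled.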
Create a retrieval-enhanced application (version 8.17)
When you create an ES Serverless application for the first time, the service is automatically enabled. By enabling the service, you agree to the relevant Terms of Service. After the application is created, it is billed according to the billing rules described in Billing details. If the application fails to be created, no fees are incurred.
Create an application.
Go to the Serverless application creation page and select a region for the application.
On the Application Management page, click Create Application.
Configure the basic information for the application.
Enter an application name, select Retrieval-Augmented as the application type, and use the default values for the other parameters or customize them.
Configure the access information for the application.
In this example, Network Access Method is set to Public Access, and the IP address of your local device is added to the Public Access Whitelist. This lets you access Kibana for the Serverless application from your local device. To configure network access for the application, see Configure public or internal-facing access for a Serverless application.
Enter a User Password to log on to Kibana.
Click Create Now.
You can view your applications on the Application Management page. Wait for the application status to change to Running, which indicates that the application has been created. You can then explore its features.
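If Kibana is unreachable after the application is running, one thing to check is whether the address of your local device is covered by the Public Access Whitelist configured in the steps above. The following is a minimal local sketch of that check. It assumes whitelist entries are single IP addresses or CIDR blocks, which this topic does not state, and `is_whitelisted` is a hypothetical helper that runs entirely on your machine. The addresses shown are reserved documentation ranges.

```python
import ipaddress
from typing import Iterable


def is_whitelisted(client_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if client_ip matches any whitelist entry.

    Entries are assumed to be single addresses or CIDR blocks.
    """
    addr = ipaddress.ip_address(client_ip)
    for entry in whitelist:
        # strict=False accepts entries such as 203.0.113.7/24 with host bits set.
        network = ipaddress.ip_network(entry, strict=False)
        if addr.version == network.version and addr in network:
            return True
    return False


if __name__ == "__main__":
    entries = ["203.0.113.0/24", "198.51.100.10"]
    print(is_whitelisted("203.0.113.7", entries))
    print(is_whitelisted("192.0.2.1", entries))
```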
Open APIs
For more information about the APIs supported by retrieval-enhanced applications (version 8.17), see Appendix 1: Supported open source APIs.
Supported cluster and index configurations
For more information about the cluster and index configurations supported by retrieval-enhanced applications (version 8.17), see Appendix 2: Supported whitelists.
Supported plug-ins
For more information about the plug-ins supported by retrieval-enhanced applications (version 8.17), see Appendix 3: Supported plug-ins.
Monitoring Center
For detailed information about configuring monitoring and alerting services and for explanations of the metrics, see Monitoring metrics and alert configuration.