Retrieval-enhanced applications (version 8.17)

The growing demand for generative AI and real-time data analytics requires a high-performance, low-cost, and fully managed retrieval service. Alibaba Cloud Elasticsearch Serverless retrieval-enhanced applications (version 8.17) are built on Elasticsearch version 8.17. As a fully managed service, they combine a serverless architecture with layered scaling capabilities to support common scenarios, such as information retrieval, vector search, and semantic analysis. This topic describes the latest features, billing details, service activation, and application creation for retrieval-enhanced applications (version 8.17).

Commercialization notice

Retrieval-enhanced applications (version 8.17) will become a paid service at 18:00 on June 27, 2025 (UTC+8). After this time, charges will apply for using the service.

Note

If you no longer need the service, disable it before commercial billing starts to avoid charges. If you have any questions, submit a ticket or contact technical support through the DingTalk user group (Group ID: 72335013004).

Latest features

Compared with the previous General-purpose retrieval applications (version 7.10), retrieval-enhanced applications (version 8.17) improve on features, elastic performance, and cost efficiency:

  • New features: Retrieval-enhanced applications (version 8.17) are fully optimized for vector search scenarios and support features such as sparse vectors and dense vectors. For more information, see Appendix 1: Supported open source APIs.

  • Elastic performance: The upgraded architecture schedules elastic resources more flexibly within the application quota and responds to requests significantly faster.

  • Cost optimization: Compared with the pay-as-you-go model of version 7.10, version 8.17 introduces a billing method that combines reserved fixed quotas with on-demand elastic calls. You can flexibly combine these two modes to use resources more efficiently and reduce your overall costs.
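To make the dense vector support mentioned above more concrete, the following is a minimal sketch of the request bodies that open source Elasticsearch 8.x accepts for a `dense_vector` mapping and a kNN search. The field names `embedding` and `title`, the helper function names, and the parameter defaults are assumptions chosen for illustration. Whether each option is available in a retrieval-enhanced application depends on the APIs listed in Appendix 1.

```python
import json


def dense_vector_mapping(field: str, dims: int, similarity: str = "cosine") -> dict:
    """Build an index-creation body with one dense_vector field (hypothetical helper)."""
    if dims <= 0:
        raise ValueError("dims must be a positive integer")
    return {
        "mappings": {
            "properties": {
                # Vector field used for approximate kNN search.
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": similarity,
                },
                # Illustrative text field stored next to the vector.
                "title": {"type": "text"},
            }
        }
    }


def knn_search_body(field: str, query_vector, k: int = 10, num_candidates: int = 100) -> dict:
    """Build a search body that uses the top-level knn option (hypothetical helper)."""
    if num_candidates < k:
        raise ValueError("num_candidates must be greater than or equal to k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        }
    }


if __name__ == "__main__":
    # Print the bodies so they can be pasted into a request to the application.
    print(json.dumps(dense_vector_mapping("embedding", 3), indent=2))
    print(json.dumps(knn_search_body("embedding", [0.1, 0.2, 0.3], k=5, num_candidates=50), indent=2))
```

The first body would typically be sent when the index is created and the second to the index's `_search` endpoint. The sketch only builds the JSON and does not contact any service.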

Billing details

Retrieval-enhanced applications (version 8.17) are billed based on reserved fixed Compute Unit (CU) quotas, on-demand elastic CUs, and storage space consumption. For more information about billing, see Billing details.

CU selection reference

You can select specifications such as 2 CU, 4 CU, 6 CU, and 8 CU. Select a specification based on your maximum CU usage.

Retrieval-enhanced applications (version 8.17) use a read/write splitting architecture. By default, query and write CUs are allocated in a 1:1 ratio, so each receives half of the fixed CU quota. After you enable the elastic computing feature, the maximum number of query CUs and the maximum number of write CUs each rise to 1.5 times the fixed CU quota. The fixed CU quota therefore determines the application's usage limits, as shown in the following table. Adjust the quota based on your actual usage.

Note

The unit for the CU limit is CU/s.

| Specification | Elastic computing feature | Elastic CU limit | Total application CU limit | Query CU limit | Write CU limit |
| --- | --- | --- | --- | --- | --- |
| Fixed CU quota = X CU | Enabled | 2X | 3X | 1.5X | 1.5X |
| Fixed CU quota = Y CU | Disabled | 0 | Y | 0.5Y | 0.5Y |
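The limits above follow simple arithmetic, which the following sketch encodes. It also picks the smallest specification whose query and write limits cover a given peak load. The names `CuLimits`, `cu_limits`, and `smallest_spec` are hypothetical, and the candidate list contains only the example specifications named in this topic (2, 4, 6, and 8 CU), which may not be the complete set.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Example specifications named in this topic; the real list may be longer.
EXAMPLE_SPECS = (2, 4, 6, 8)


@dataclass(frozen=True)
class CuLimits:
    elastic: float  # Elastic CU limit (CU/s)
    total: float    # Total application CU limit (CU/s)
    query: float    # Query CU limit (CU/s)
    write: float    # Write CU limit (CU/s)


def cu_limits(fixed_cu: float, elastic_enabled: bool) -> CuLimits:
    """Return the limits for a fixed CU quota, following the table in this topic."""
    if fixed_cu <= 0:
        raise ValueError("fixed_cu must be positive")
    if elastic_enabled:
        # Enabled: elastic 2X, total 3X, query and write 1.5X each.
        return CuLimits(2 * fixed_cu, 3 * fixed_cu, 1.5 * fixed_cu, 1.5 * fixed_cu)
    # Disabled: no elastic CUs, total Y, query and write 0.5Y each.
    return CuLimits(0, fixed_cu, 0.5 * fixed_cu, 0.5 * fixed_cu)


def smallest_spec(peak_query: float, peak_write: float, elastic_enabled: bool,
                  specs: Sequence[float] = EXAMPLE_SPECS) -> Optional[float]:
    """Return the smallest spec whose limits cover both peaks, or None if none does."""
    for spec in sorted(specs):
        limits = cu_limits(spec, elastic_enabled)
        if peak_query <= limits.query and peak_write <= limits.write:
            return spec
    return None


if __name__ == "__main__":
    print(cu_limits(4, elastic_enabled=True))
    print(smallest_spec(2.5, 1.0, elastic_enabled=False))
```

For a peak of 2.5 query CU/s and 1.0 write CU/s, the sketch selects 6 CU with the elastic computing feature disabled (query limit 3 CU/s) and 2 CU with it enabled (query limit 3 CU/s). The comparison covers limits only and does not take cost into account.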

Create a retrieval-enhanced application (version 8.17)

Note

When you create an ES Serverless application for the first time, the service is automatically enabled. By enabling the service, you agree to the relevant Terms of Service. After the application is created, it is billed according to the billing standards. If application creation fails, no fees are incurred.

  1. Create an application.

    1. Go to the Serverless application creation page and select a region for the application.

    2. On the Application Management page, click Create Application.

  2. Configure the basic information for the application.

    Enter an application name, select Retrieval-Augmented as the application type, and use the default values for the other parameters or customize them.

  3. Configure the access information for the application.

    • In this example, Network Access Method is set to Public Access, and the IP address of your local device is added to the Public Access Whitelist. This lets you access Kibana for the Serverless application from your local device. For other network access options, see Configure public or internal-facing access for a Serverless application.

    • Enter a User Password to log on to Kibana.

  4. Click Create Now.

You can view your applications on the Application Management page. Wait for the application status to change to Running, which indicates that the application has been created. You can then explore its features.
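If Kibana is unreachable after the application is running, a common cause is that the client IP address is not covered by the Public Access Whitelist configured in step 3. The following sketch checks an address against a list of entries locally. It assumes that entries are single IP addresses or CIDR blocks, which is an assumption about the whitelist format and is not confirmed by this topic. The function name `ip_in_whitelist` and the sample addresses are hypothetical.

```python
import ipaddress


def ip_in_whitelist(ip: str, whitelist: list[str]) -> bool:
    """Return True if ip matches any entry (single address or CIDR block)."""
    addr = ipaddress.ip_address(ip)
    for entry in whitelist:
        # strict=False accepts entries such as 203.0.113.7/24 that have host bits set.
        network = ipaddress.ip_network(entry.strip(), strict=False)
        # Compare only addresses and networks of the same IP version.
        if addr.version == network.version and addr in network:
            return True
    return False


if __name__ == "__main__":
    # Addresses from the documentation ranges reserved by RFC 5737.
    entries = ["203.0.113.0/24", "192.0.2.10"]
    print(ip_in_whitelist("203.0.113.7", entries))
    print(ip_in_whitelist("198.51.100.1", entries))
```

The check runs entirely on the local device and does not read the whitelist from the console, so the entries must be copied in by hand.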

Open APIs

For more information about the APIs supported by retrieval-enhanced applications (version 8.17), see Appendix 1: Supported open source APIs.

Supported cluster and index configurations

For more information about the cluster and index configurations supported by retrieval-enhanced applications (version 8.17), see Appendix 2: Supported whitelists.

Supported plug-ins

For more information about the plug-ins supported by retrieval-enhanced applications (version 8.17), see Appendix 3: Supported plug-ins.

Monitoring Center

For detailed information about configuring monitoring and alerting services and for explanations of the metrics, see Monitoring metrics and alert configuration.