To use an AI inference model in an Elasticsearch (ES) instance, you must first create the model within the service. The creation process differs for built-in and self-deployed models. This topic describes both procedures.
Billing
This is a pay-as-you-go service. You are billed based on the number of API calls. No fees are incurred if no calls are made.
Create a model
Built-in models
Log on to the Alibaba Cloud Elasticsearch console.
In the left navigation menu, choose Elasticsearch Clusters.
Navigate to the target cluster.
In the top navigation bar, select the resource group to which the cluster belongs and the region where the cluster resides.
On the Elasticsearch Clusters page, find the cluster and click its ID.
In the navigation pane on the left, choose .
Click Initialize Model. The system automatically checks whether the AI Search Open Platform service is activated for your account.
If the service is activated, the model initialization is completed automatically.
If the service is not activated, you are redirected to the AI Search Open Platform activation page. After you activate the service, you are returned to the ES console, and the initialization is completed.
If the message "Models that are not created exist." appears after initialization, click Create Model next to any model. The system then automatically creates all AI model services.

Parameter: Description
Model service namespace: A unit for data isolation. When you activate AI Search Open Platform for the first time, a default namespace named default is automatically created. You can also create custom namespaces. For more information, see Create a namespace.
API key: The credential used for identity authentication when you make calls. For more information about how to obtain an API key, see Manage API keys.
Endpoint: ES instances reach AI model endpoints primarily over a Virtual Private Cloud (VPC). ES instances in the China (Hangzhou), China (Shenzhen), China (Beijing), China (Zhangjiakou), and China (Qingdao) regions can call AI Search Open Platform services across regions over a VPC. To call the services over the public network, see Obtain a public endpoint.
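The three parameters above are the values an inference endpoint needs in order to reach AI Search Open Platform. As a rough sketch only, assuming the open-source Elasticsearch `alibabacloud-ai-search` inference service and its `api_key`, `host`, `workspace`, and `service_id` settings (the console may configure models differently), the mapping could be expressed in Python as follows. The helper name and the `ops-text-embedding-001` service ID are illustrative assumptions, not values taken from this topic.

```python
import json


def build_endpoint_definition(api_key, host, workspace="default",
                              service_id="ops-text-embedding-001"):
    """Build a body for PUT _inference/<task_type>/<inference_id>.

    Assumption: field names follow the open-source Elasticsearch
    `alibabacloud-ai-search` service; verify them for your ES version.
    """
    if not api_key or not host:
        raise ValueError("api_key and host are required")
    return {
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": api_key,        # API key from AI Search Open Platform
            "host": host,              # endpoint (VPC or public)
            "workspace": workspace,    # model service namespace
            "service_id": service_id,  # hypothetical model choice
        },
    }


# Placeholders stand in for real credentials and endpoints.
definition = build_endpoint_definition("<api_key>", "<endpoint_host>")
print(json.dumps(definition, indent=2))
```

Keeping the namespace as a defaulted argument mirrors the fact that a namespace named default is created on first activation.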
Self-deployed models
Log on to the Alibaba Cloud Elasticsearch console.
In the left navigation menu, choose Elasticsearch Clusters.
Navigate to the target cluster.
In the top navigation bar, select the resource group to which the cluster belongs and the region where the cluster resides.
On the Elasticsearch Clusters page, find the cluster and click its ID.
In the navigation pane on the left, choose .
Activate AI Search Open Platform, deploy the model service, and create an API key in the AI Search Open Platform console.
On the Model Management page, choose AI Search Open Platform Models > Self-deployed Models, and then click Create Model.
Review the parameter information required to call the AI model.
Note: Obtain an internal-facing access token for the corresponding region.
For more information about how to call a self-deployed model from AI Search Open Platform in an ES instance, see Create a custom model inference service using the Inference API.
Call examples
Notes
Do not delete or disable the model service in AI Search Open Platform. Otherwise, the corresponding model in the ES console will become invalid.
Cross-border data compliance
Data flow: When you call the service, your business data is transferred to the China (Shanghai) region for processing.
Your responsibilities:
Ensure that the data transfer complies with local laws and regulations.
Implement data protection measures, provide privacy statements, and obtain necessary authorizations.
Ensure that the data content is legal and compliant.
Disclaimer: If you violate the preceding statements and warranties, causing any loss to Alibaba Cloud and/or its affiliates, you are liable for the corresponding compensation.
Data parsing models
Parse document content
POST _inference/doc_analyze/<inference_id>
{
  "input": ["http://opensearch-shanghai.oss-cn-shanghai.aliyuncs.com/chatos/rag/file-parser/samples/GB10767.pdf"], # A URL, file content, or task_id.
  "task_settings": {
    "document": {
      "input_type": "url", # Optional. Valid values: url, content, and task_id. Default: url. If you specify task_id, the system queries the result of an asynchronous task.
      "file_name": "<file_name>", # Optional. Specify this if the file name cannot be inferred from the URL.
      "file_type": "<file_type>" # Optional. Specify this if the file type cannot be inferred from the file name.
    },
    "output": {
      "image_storage": "<image_storage>" # Optional. Default: base64.
    },
    "is_async": "<true or false>" # Default: false.
  }
}
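The request above can also be assembled programmatically. The following Python sketch builds the path and body for a document parsing call and shows how an asynchronous submission might be followed by a query that passes the `task_id`. The function name, the `my_doc_parser` inference ID, and the example URL are hypothetical, and sending `is_async` as a JSON boolean is an assumption.

```python
import json

VALID_INPUT_TYPES = {"url", "content", "task_id"}


def build_doc_analyze_request(inference_id, value, input_type="url",
                              is_async=False, file_name=None, file_type=None):
    """Return (path, body) for a document parsing call."""
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError(f"unsupported input_type: {input_type!r}")
    document = {"input_type": input_type}
    # Only include the optional hints when the caller supplies them.
    if file_name:
        document["file_name"] = file_name
    if file_type:
        document["file_type"] = file_type
    body = {
        "input": [value],
        "task_settings": {
            "document": document,
            # Assumption: is_async is sent as a JSON boolean.
            "is_async": is_async,
        },
    }
    return f"_inference/doc_analyze/{inference_id}", body


# Submit an asynchronous task, then query its result later by task_id.
submit_path, submit_body = build_doc_analyze_request(
    "my_doc_parser", "http://example.com/report.pdf", is_async=True)
poll_path, poll_body = build_doc_analyze_request(
    "my_doc_parser", "<task_id>", input_type="task_id")

print("POST", submit_path)
print(json.dumps(submit_body, indent=2))
```

Validating `input_type` before the call keeps a typo from reaching the service as a failed request.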
Extract image content
POST _inference/img_analyze/<inference_id>
{
  "input": ["https://img.alicdn.com/imgextra/i1/O1CN01WksnF41hlhBFsXDNB_!!6000000004318-0-tps-1000-1400.jpg"], # A URL, file content, or task_id.
  "task_settings": {
    "document": {
      "input_type": "url", # Optional. Valid values: url, content, and task_id. Default: url.
      "file_name": "<file_name>", # Optional. Specify this if the file name cannot be inferred from the URL.
      "file_type": "<file_type>" # Optional. Specify this if the file type cannot be inferred from the file name.
    },
    "is_async": "<true or false>" # Default: false.
  }
}
Segment a document
POST _inference/doc_split/<inference_id>
{
  "input": "<input>"
}
Embedding models
Text embedding
POST _inference/text_embedding/<inference_id>
{
  "input": ["<input>"]
}
Sparse text embedding
POST _inference/sparse_embedding/<inference_id>
{
  "input": ["<input>"]
}
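Because both embedding calls accept a list in `input`, several texts can be sent per request. The Python sketch below splits a list of texts into batches and builds one request per batch for either task type. The helper names, the `my_embedding_model` inference ID, and the batch size of 32 are assumptions for illustration; this topic does not state a batch limit.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_embedding_requests(inference_id, texts,
                             task_type="text_embedding", batch_size=32):
    """Return a list of (path, body) pairs, one per batch of texts."""
    if task_type not in ("text_embedding", "sparse_embedding"):
        raise ValueError(f"unsupported task_type: {task_type!r}")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    path = f"_inference/{task_type}/{inference_id}"
    return [(path, {"input": batch}) for batch in chunked(texts, batch_size)]


texts = ["first passage", "second passage", "third passage"]
for path, body in build_embedding_requests(
        "my_embedding_model", texts, batch_size=2):
    print("POST", path, body)
```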
Reranking
Rerank
POST _inference/rerank/<inference_id>
{
  "input": [<input_list>],
  "query": "<query>"
}
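A rerank call scores each entry of `input` against `query`. The Python sketch below builds the request body and then reorders the original documents from a response. The response shape (a `rerank` list whose items carry `index` and `relevance_score`) is an assumption based on the open-source Elasticsearch Inference API, and the sample response is invented purely to exercise the code.

```python
def build_rerank_body(query, documents):
    """Build the JSON body for POST _inference/rerank/<inference_id>."""
    return {"input": list(documents), "query": query}


def order_by_relevance(documents, response):
    """Return `documents` sorted from most to least relevant.

    Assumption: `response["rerank"]` is a list of objects with
    `index` (position in the request input) and `relevance_score`.
    """
    ranked = sorted(response["rerank"],
                    key=lambda item: item["relevance_score"],
                    reverse=True)
    return [documents[item["index"]] for item in ranked]


docs = ["Bananas are yellow.", "Elasticsearch is a search engine."]
body = build_rerank_body("What is Elasticsearch?", docs)

# Made-up response used only to demonstrate the reordering step.
sample_response = {
    "rerank": [
        {"index": 0, "relevance_score": 0.02},
        {"index": 1, "relevance_score": 0.97},
    ]
}
print(order_by_relevance(docs, sample_response))
```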
Large language models
Query analysis
POST _inference/query_analyze/<inference_id>
{
  "input": "<input>",
  "task_settings": {
    "history": [
      {
        "content": "<history.content>",
        "role": "<history.role>"
      },
      {
        "content": "<history.content>",
        "role": "<history.role>"
      }
    ]
  }
}
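The `history` array carries earlier conversation turns alongside the current query. As a minimal Python sketch, the body can be built from a list of (role, content) pairs. The function name is hypothetical, and the role values `user` and `assistant` are assumptions, because this topic does not list the accepted roles.

```python
def build_query_analyze_body(query, history=()):
    """Build the JSON body for POST _inference/query_analyze/<inference_id>.

    `history` is an iterable of (role, content) pairs, oldest turn first.
    """
    body = {"input": query}
    turns = [{"content": content, "role": role} for role, content in history]
    if turns:
        # task_settings is only attached when there is prior context.
        body["task_settings"] = {"history": turns}
    return body


conversation = [
    ("user", "Which regions support cross-region VPC calls?"),   # assumed role
    ("assistant", "Hangzhou, Shenzhen, Beijing, and others."),   # assumed role
]
print(build_query_analyze_body("And over the public network?", conversation))
```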
Content generation
POST _inference/completion/<inference_id>
{
  "input": ["<input>"]
}
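Any of the calls in this section can also be issued from application code over HTTP. The Python sketch below prepares, but does not send, a POST request with the standard library. The cluster URL, the `my_llm` inference ID, and the use of HTTP basic authentication with the `elastic` user are assumptions; substitute the address and credentials of your own instance.

```python
import base64
import json
import urllib.request


def build_inference_request(base_url, task_type, inference_id, body,
                            username, password):
    """Prepare a POST request for <base_url>/_inference/<task_type>/<id>."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_inference/{task_type}/{inference_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_inference_request(
    "http://localhost:9200", "completion", "my_llm",
    {"input": ["Summarize what an inference endpoint is."]},
    "elastic", "<password>")
print(request.get_method(), request.full_url)

# To send the request against a reachable cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```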
For more examples of Inference API calls, see Call built-in model services of AI Search Open Platform.