The AI Search Open Platform is a service platform from Alibaba Cloud OpenSearch that connects modular service components to build custom Retrieval-Augmented Generation (RAG) pipelines for your business. You can select different models or policies and replace individual components as needed. The platform also supports model deployment, which lets you deploy models from various sources. This topic describes how to use models deployed on the AI Search Open Platform as model inference services in Alibaba Cloud Elasticsearch (ES).
Prerequisites
You have created an Alibaba Cloud ES instance of V8.15 or later (kernel version 2.1.2 or later).
You have activated the AI Search Open Platform. If the service is not activated, see the instructions in Activate Service.
Step 1. Deploy a model on the AI Search Open Platform
Go to the AI Search Open Platform. In the navigation pane, open the Service Deployment page.
Click Deploy Service. The service deployment page appears.
Configure the Basic Information for the service, and then click Deploy.
Note: Deploy the service in the same region as your ES instance. This allows ES to access the service over a private network, which provides lower latency and a more stable connection.
| Parameter | Description |
| --- | --- |
| Service Name | A custom name for the service. |
| Deployment Region | The region in which to deploy the service. |
| Model Category | The model category. Supported categories are text-embedding and text-reranker. In this example, text-embedding is selected. |
| Model Source | The source of the model. Only AI Search Open Platform is supported. |
| Select Model | The model to use. All models for the selected Model Category are supported. In this example, ops-text-embedding-002 is selected. |
Wait for the service status to become Normal. Then, click Call Information for the target service to view the service endpoint and the access token.
Note: ES can access services in the same region using a private endpoint. To access services in a different region, you must use a public endpoint.
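The endpoint rule in the note above can be sketched as a small helper. This is only an illustration: the function name, the region IDs, and the endpoint URLs are hypothetical placeholders, not values returned by the platform.

```python
def choose_endpoint(es_region: str, service_region: str,
                    private_endpoint: str, public_endpoint: str) -> str:
    """Return the private endpoint when ES and the service share a region."""
    if es_region == service_region:
        return private_endpoint
    # Cross-region access has to go through the public endpoint.
    return public_endpoint


# Hypothetical region IDs and endpoints, for illustration only.
print(choose_endpoint("cn-hangzhou", "cn-hangzhou",
                      "http://private.example.internal",
                      "http://public.example.com"))
```

Use the returned value as the url setting when you create the inference service in Step 2.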
Step 2. Create a model inference service for the AI Search Open Platform in Alibaba Cloud ES
Run the following code in Kibana on your Alibaba Cloud ES instance to create the model inference service.
The request for each task type is as follows:
text_embedding type
Syntax template for model creation:
PUT _inference/text_embedding/os_deployment_emb
{
  "service": "alibaba-cloud-custom-model",
  "service_settings": {
    "secret_parameters": {
      "api_key": "<your_api_key>",
      "Token": "<your_token>"
    },
    "url": "<your_url>",
    "path": {
      "<your_path>": {
        "POST": {
          "headers": {
            "Authorization": "Bearer ${api_key}",
            "Token": "${Token}"
          },
          "request": {
            "format": "string",
            "content": """
              {"input":${input},"input_type":"${input_type}"}
              """
          },
          "response": {
            "json_parser": {
              "text_embeddings": "$.embeddings[*].embedding"
            }
          }
        }
      }
    }
  },
  "task_settings": {
    "parameters": {
      "input_type": "document"
    }
  }
}
Example:
PUT _inference/text_embedding/os_deployment_emb
{
  "service": "alibaba-cloud-custom-model",
  "service_settings": {
    "secret_parameters": {
      "api_key": "OS-xxx",
      "Token": "xxx"
    },
    "url": "http://xxx.platform-pre-hangzhou.opensearch.aliyuncs.com",
    "path": {
      "/v3/openapi/deployments/xxx/predict": {
        "POST": {
          "headers": {
            "Authorization": "Bearer ${api_key}",
            "Token": "${Token}"
          },
          "request": {
            "format": "string",
            "content": """
              {"input":${input},"input_type":"${input_type}"}
              """
          },
          "response": {
            "json_parser": {
              "text_embeddings": "$.embeddings[*].embedding"
            }
          }
        }
      }
    }
  },
  "task_settings": {
    "parameters": {
      "input_type": "document"
    }
  }
}
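To make the json_parser setting concrete, the following sketch reproduces in plain Python what the JSONPath expression $.embeddings[*].embedding selects. The sample response is a hypothetical shape inferred from that path only; the actual service response may contain additional fields.

```python
from typing import Any


def extract_embeddings(response: dict[str, Any]) -> list[list[float]]:
    """Mimic the JSONPath $.embeddings[*].embedding with plain Python."""
    return [item["embedding"] for item in response["embeddings"]]


# Hypothetical response body, shaped only after the JSONPath above.
sample_response = {
    "embeddings": [
        {"embedding": [0.1, 0.2, 0.3]},
        {"embedding": [0.4, 0.5, 0.6]},
    ]
}

vectors = extract_embeddings(sample_response)
print(len(vectors), "vectors of dimension", len(vectors[0]))
```

Each element of embeddings yields one vector, so a request with two input strings would be expected to produce two vectors.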
Call the model:
POST _inference/text_embedding/os_deployment_emb
{
  "input": "hello"
}

POST _inference/text_embedding/os_deployment_emb
{
  "input": ["hello", "world"]
}

POST _inference/text_embedding/os_deployment_emb
{
  "input": "hello",
  "task_settings": {
    "parameters": {
      "input_type": "query"
    }
  }
}
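The third request above overrides the default input_type for a single call. The following sketch illustrates how the ${...} placeholders in the request content could be filled. It is an assumption about the rendering behavior, not the service's actual implementation: it uses Python's string.Template, which happens to share the ${name} syntax, and the function name render_body is hypothetical.

```python
import json
from string import Template

# Request content copied from the text_embedding template in this topic.
CONTENT = Template('{"input":${input},"input_type":"${input_type}"}')


def render_body(user_input, default_params, override_params=None):
    """Fill the placeholders; per-request parameters win over the defaults."""
    params = {**default_params, **(override_params or {})}
    # Assumption: ${input} receives the JSON-serialized input value.
    return CONTENT.substitute(input=json.dumps(user_input), **params)


defaults = {"input_type": "document"}  # from task_settings at creation time
print(render_body(["hello", "world"], defaults))
print(render_body("hello", defaults, {"input_type": "query"}))
```

Note that ${input} is not quoted in the template because it is replaced by a complete JSON value, whereas ${input_type} sits inside quotes because it is replaced by bare text.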
rerank type
Syntax template for model creation:
PUT _inference/rerank/os_deployment_custom_rerank
{
  "service": "alibaba-cloud-custom-model",
  "service_settings": {
    "secret_parameters": {
      "api_key": "<your_api_key>",
      "Token": "<your_token>"
    },
    "url": "<your_url>",
    "path": {
      "<your_path>": {
        "POST": {
          "headers": {
            "Authorization": "Bearer ${api_key}",
            "Token": "${Token}"
          },
          "request": {
            "format": "string",
            "content": """
              {"docs":${input},"query":"${query}"}
              """
          },
          "response": {
            "json_parser": {
              "relevance_score": "$.scores[*]"
            }
          }
        }
      }
    }
  }
}
Example:
PUT _inference/rerank/os_deployment_custom_rerank
{
  "service": "alibaba-cloud-custom-model",
  "service_settings": {
    "secret_parameters": {
      "api_key": "OS-xxx",
      "Token": "xxx"
    },
    "url": "http://xxx.platform-pre-hangzhou.opensearch.aliyuncs.com",
    "path": {
      "/v3/openapi/deployments/xxx/predict": {
        "POST": {
          "headers": {
            "Authorization": "Bearer ${api_key}",
            "Token": "${Token}"
          },
          "request": {
            "format": "string",
            "content": """
              {"docs":${input},"query":"${query}"}
              """
          },
          "response": {
            "json_parser": {
              "relevance_score": "$.scores[*]"
            }
          }
        }
      }
    }
  }
}
Call the model:
POST _inference/rerank/os_deployment_custom_rerank
{
  "input": [
    "What is OpenSearch",
    "What is the AI Chat edition",
    "What are the advantages of the AI Chat edition"
  ],
  "query": "OpenSearch product documentation"
}
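As a sketch of how the rerank result might be consumed, the following helper pairs each input document with its relevance score and sorts by descending score. It assumes that the scores matched by $.scores[*] are returned in the same order as the input documents; the score values shown are hypothetical, not real model output.

```python
def rank_documents(docs: list[str], scores: list[float]) -> list[tuple[str, float]]:
    """Pair each document with its score and sort by descending relevance."""
    if len(docs) != len(scores):
        raise ValueError("docs and scores must have the same length")
    return sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)


docs = [
    "What is OpenSearch",
    "What is the AI Chat edition",
    "What are the advantages of the AI Chat edition",
]
# Hypothetical scores standing in for the values matched by $.scores[*].
scores = [0.82, 0.35, 0.41]

for doc, score in rank_documents(docs, scores):
    print(f"{score:.2f}  {doc}")
```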
custom type
Syntax template for model creation:
PUT _inference/custom/os_deployment_custom
{
  "service": "alibaba-cloud-custom-model",
  "service_settings": {
    "secret_parameters": {
      "api_key": "<your_api_key>",
      "Token": "<your_token>"
    },
    "url": "<your_url>",
    "path": {
      "<your_path>": {
        "POST": {
          "headers": {
            "Authorization": "Bearer ${api_key}",
            "Token": "${Token}"
          },
          "request": {
            "format": "string",
            "content": """
              {"input":${input},"input_type":"${input_type}"}
              """
          }
        }
      }
    }
  },
  "task_settings": {
    "parameters": {
      "input_type": "document"
    }
  }
}
Example:
PUT _inference/custom/os_deployment_custom
{
  "service": "alibaba-cloud-custom-model",
  "service_settings": {
    "secret_parameters": {
      "api_key": "OS-xxx",
      "Token": "xxx"
    },
    "url": "http://xxx.platform-pre-hangzhou.opensearch.aliyuncs.com",
    "path": {
      "/v3/openapi/deployments/xxx/predict": {
        "POST": {
          "headers": {
            "Authorization": "Bearer ${api_key}",
            "Token": "${Token}"
          },
          "request": {
            "format": "string",
            "content": """
              {"input":${input},"input_type":"${input_type}"}
              """
          }
        }
      }
    }
  },
  "task_settings": {
    "parameters": {
      "input_type": "document"
    }
  }
}
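The three creation requests in this topic share the same structure and differ only in the request content, the optional response parser, and the optional default parameters. The following sketch generates the request body from those pieces. The function name and its arguments are hypothetical; the field names are copied from the templates above. In plain JSON the content is an ordinary string, because the triple-quote syntax is specific to the Kibana console.

```python
import json


def build_inference_config(url, path, api_key, token, content,
                           json_parser=None, parameters=None):
    """Assemble a creation request body that mirrors the templates in this topic."""
    post = {
        "headers": {"Authorization": "Bearer ${api_key}", "Token": "${Token}"},
        "request": {"format": "string", "content": content},
    }
    if json_parser:  # omitted for the custom type
        post["response"] = {"json_parser": json_parser}
    body = {
        "service": "alibaba-cloud-custom-model",
        "service_settings": {
            "secret_parameters": {"api_key": api_key, "Token": token},
            "url": url,
            "path": {path: {"POST": post}},
        },
    }
    if parameters:  # omitted for the rerank type
        body["task_settings"] = {"parameters": parameters}
    return body


# Placeholder values; replace them with your own call information.
config = build_inference_config(
    url="<your_url>",
    path="<your_path>",
    api_key="<your_api_key>",
    token="<your_token>",
    content='{"input":${input},"input_type":"${input_type}"}',
    parameters={"input_type": "document"},
)
print(json.dumps(config, indent=2))
```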
Call the model:
POST _inference/custom/os_deployment_custom
{
  "input": "hello"
}

POST _inference/custom/os_deployment_custom
{
  "input": ["hello", "world"]
}

POST _inference/custom/os_deployment_custom
{
  "input": "hello",
  "task_settings": {
    "parameters": {
      "input_type": "query"
    }
  }
}
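Outside Kibana, the same calls can be issued over HTTP. The following sketch builds the request with Python's standard library without sending it. The ES address http://localhost:9200 and the helper name are hypothetical placeholders, and the authentication header depends on how your instance is configured.

```python
import json
import urllib.request


def build_inference_request(es_url, task_type, inference_id, payload,
                            auth_header=None):
    """Build a POST request for the _inference/<task_type>/<inference_id> path."""
    url = f"{es_url.rstrip('/')}/_inference/{task_type}/{inference_id}"
    headers = {"Content-Type": "application/json"}
    if auth_header:
        headers["Authorization"] = auth_header
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Hypothetical ES address; the inference ID matches the custom example above.
req = build_inference_request(
    "http://localhost:9200", "custom", "os_deployment_custom",
    {"input": ["hello", "world"]},
)
print(req.get_method(), req.full_url)
# To send the request: urllib.request.urlopen(req)
```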