How it works
A prediction query uses the built-in embedding model in Vector Search Edition to convert content such as text, images, or videos into a vector, and then retrieves results based on that vector.
If you have existing vectors and want to query them directly in your Vector Search Edition instance, see vector query.
Endpoint
/vector-service/inference-query
The URL above omits request headers, encoding, and other elements.
You must prepend your instance's host address to this path.
For details about each parameter, see the "Request body parameters" section below.
Request protocol
HTTP
Request method
POST
Supported format
JSON
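The endpoint, protocol, method, and format above can be combined into a single request. The following is a minimal sketch using only the JDK (Java 11 or later), not an official client: the class and method names are hypothetical, the host value is a placeholder you must supply, and sending a Content-Type header of application/json is an assumption based on the supported format. The Authorization value follows the Basic scheme described in the Authentication section.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class; the names are illustrative and not part of the product API.
final class InferenceQueryRequest {
    static final String PATH = "/vector-service/inference-query";

    /** Base64-encodes "userName:password" (UTF-8) and prefixes the result with "Basic ". */
    static String basicAuthHeader(String userName, String password) {
        byte[] credentials = (userName + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(credentials);
    }

    /** Builds, but does not send, a POST request that carries the JSON body. */
    static HttpRequest build(String host, String userName, String password, String jsonBody) {
        return HttpRequest.newBuilder()
                // The instance host address is prepended to the endpoint path.
                .uri(URI.create("http://" + host + PATH))
                // Assumption: the service expects a JSON content type.
                .header("Content-Type", "application/json")
                .header("Authorization", basicAuthHeader(userName, password))
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
                .build();
    }
}
```

To send the request, pass the result of InferenceQueryRequest.build(...) to java.net.http.HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).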
Authentication
Use the following method to compute the authorization value:
Parameter | Type | Description |
accessUserName | string | The username. You can find this in the Details > Network Information section of your instance. |
accessPassWord | string | The password. You can set or change this in the Details > Network Information section of your instance. |
import com.aliyun.darabonba.encode.Encoder;
import com.aliyun.darabonbastring.Client;

public class GenerateAuthorization {
    public static void main(String[] args) throws Exception {
        String accessUserName = "username";
        String accessPassWord = "password";
        // The credential string has the form "username:password".
        String realmStr = accessUserName + ":" + accessPassWord;
        // Base64-encode the UTF-8 bytes of the credential string.
        String authorization = Encoder.base64EncodeToString(Client.toBytes(realmStr, "UTF-8"));
        System.out.println(authorization);
    }
}

Example of a correctly formatted authorization value:
cm9vdDp******mdhbA==
When making an HTTP request, prefix the value with Basic and provide it in the Authorization header.
Example (in the request header):
Authorization: Basic cm9vdDp******mdhbA==
Request body parameters
Parameter | Description | Default | Type | Required |
tableName | The name of the table to query. | — | string | Yes |
indexName | The name of the index to query. | The first configured index | string | No |
content | The query content. Use this for queries that do not involve fusion vector retrieval. | — | string | Yes (for non-fusion vector retrieval) |
contents | A list of query content items. Use this parameter for fusion vector retrieval, which supports content from multiple modalities. | — | list[string] | Yes (for fusion vector retrieval) |
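contentType | The data type of the content. Values used in the examples in this document: text, image_encode, video_uri, and video_encode. For fusion vector retrieval, provide a comma-separated list of types, such as text,image_encode. | — | string | No |
modal | The modality of the embedding model. Values used in the examples in this document: text, image, video, and fusion. | — | string | Yes |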
videoFrameTopK | The number of frames to retrieve for a video query. | 100 | int | No |
namespace | The namespace to query. | "" | string | No |
topK | The number of results to return. | 100 | int | No |
includeVector | Specifies whether to include the vector in the response. | false | bool | No |
outputFields | A list of fields to include in the response. | [] | list[string] | No |
order | The sort order for the results. Valid values: ASC for ascending, DESC for descending. | ASC | string | No |
searchParams | Algorithm-specific query parameters, passed as a JSON-formatted string. For example: "{\"qc.searcher.scan_ratio\":0.01}". | "" | string | No |
filter | A filter expression to apply to the search. | "" | string | No |
scoreThreshold | Filters results based on their score. When using Euclidean distance, returns results with a score less than this threshold. | No filtering by default | float | No |
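Because searchParams is a string that itself contains JSON, its inner quotes must be escaped when the request body is serialized. The following is a minimal sketch of assembling a flat request body with only the JDK. The class and method names are hypothetical, the escaping handles only backslashes and double quotes, and list-valued parameters such as outputFields are not covered; a production client would normally use a JSON library instead.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper class; the names are illustrative and not part of the product API.
final class InferenceQueryBody {
    /** Escapes backslashes and double quotes so the value can sit inside a JSON string literal. */
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Serializes a flat map: String values are quoted, numbers and booleans are written as-is. */
    static String toJson(Map<String, Object> params) {
        StringJoiner json = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            Object value = entry.getValue();
            String rendered = (value instanceof String)
                    ? "\"" + escape((String) value) + "\""
                    : String.valueOf(value);
            json.add("\"" + escape(entry.getKey()) + "\": " + rendered);
        }
        return json.toString();
    }
}
```

Passing a LinkedHashMap keeps the parameters in insertion order. A searchParams value of {"qc.searcher.scan_ratio":0.01} is written to the body as "{\"qc.searcher.scan_ratio\":0.01}", which matches the form used in the examples in this document.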
Response parameters
Field | Description | Type |
result | A list of matching items. | list[Item] |
totalCount | The number of items in the result list. | int |
totalTime | The engine processing time, in milliseconds (ms). | float |
errorCode | The error code. This field appears only when an error occurs. | int |
errorMsg | The error message. This field appears only when an error occurs. | string |
Item object definition
Field | Description | Type |
score | The distance score. | float |
fields | A map of field names and their corresponding values. | map<string, FieldType> |
vector | The vector value. | list[float] |
id | The primary key value. The type matches the defined field type. | FieldType |
namespace | The namespace of the vector. This field is returned only if a namespace is set. | string |
The API response may include additional fields, such as __source__ and coveredPercent, for internal debugging purposes. These fields do not affect business logic and can be safely ignored.
Examples
Text-to-text retrieval
Request body:
{
  "tableName": "gist",
  "indexName": "test",
  "content": "hello",
  "modal": "text",
  "topK": 3,
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}",
  "includeVector": true
}
Response:
{
  "result": [
    { "id": 1, "score": 1.0508723258972169, "vector": [0.1, 0.2, 0.3] },
    { "id": 2, "score": 1.0329746007919312, "vector": [0.2, 0.2, 0.3] },
    { "id": 3, "score": 0.980593204498291, "vector": [0.3, 0.2, 0.3] }
  ],
  "totalCount": 3,
  "totalTime": 2.943
}
Image retrieval
Text-to-image retrieval:
Request body:
{
  "tableName": "gist",
  "indexName": "test",
  "content": "Bicycle",
  "modal": "text",
  "topK": 3,
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}",
  "includeVector": true
}
Response:
{
  "result": [
    { "id": 1, "score": 1.0508723258972169, "vector": [0.1, 0.2, 0.3] },
    { "id": 2, "score": 1.0329746007919312, "vector": [0.2, 0.2, 0.3] },
    { "id": 3, "score": 0.980593204498291, "vector": [0.3, 0.2, 0.3] }
  ],
  "totalCount": 3,
  "totalTime": 2.943
}
Search by image:
Request body:
{
  "tableName": "gist",
  "indexName": "test",
  "content": "base64-encoded image data",
  "modal": "image",
  "topK": 3,
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}",
  "includeVector": true
}
Response:
{
  "totalCount": 5,
  "result": [
    { "id": 5, "score": 1.103209137916565 },
    { "id": 3, "score": 1.1278988122940064 },
    { "id": 2, "score": 1.1326735019683838 }
  ],
  "totalTime": 242.615
}
Subject identification
Request body:
Without the range parameter:
{
  "tableName": "gist",
  "indexName": "test",
  "content": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQ",
  "modal": "image",
  "searchParams": "{\"crop\": true}",
  "topK": 3,
  "includeVector": true
}
Note: "crop": true enables subject identification. If the range parameter is not provided, the model automatically detects the subject.
With the range parameter:
{
  "tableName": "gist",
  "indexName": "test",
  "content": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQ",
  "modal": "image",
  "searchParams": "{\"crop\": true, \"range\": \"100,100,60,70\"}",
  "topK": 3,
  "includeVector": true
}
Note: "crop": true, "range": "100,100,60,70" enables subject identification within a specified region of the image. The four numbers in range represent the (x, y) coordinates of the top-left corner of the region, its width, and its height.
Response:
{
  "result": [
    { "id": 1, "score": 1.0508723258972169, "vector": [0.1, 0.2, 0.3] }
  ],
  "__meta__": {
    "__range__": "100,100,60,70;"
  },
  "totalCount": 1,
  "totalTime": 2.943
}
When you perform subject identification with modal=image, the response includes the __range__ field.
The __range__ field indicates the detected region of the subject in x,y,width,height format.
If the model identifies multiple subjects, their regions are listed in the __range__ field, sorted by score in descending order. The query returns results for the first (highest-scoring) subject by default.
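The x,y,width,height region format is used both in the range search parameter and in the __range__ response field. The following is a minimal sketch of converting between that string form and a typed region. The class and method names are hypothetical, and treating ";" as the separator between multiple regions is an assumption inferred from the trailing semicolon in the example response.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type; the names are illustrative and not part of the product API.
final class SubjectRange {
    final int x;
    final int y;
    final int width;
    final int height;

    SubjectRange(int x, int y, int width, int height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    /** Formats the region as the "x,y,width,height" string used by the range search parameter. */
    String toParam() {
        return x + "," + y + "," + width + "," + height;
    }

    /** Parses a __range__ value such as "100,100,60,70;" into regions, in the order listed. */
    static List<SubjectRange> parseAll(String rangeField) {
        List<SubjectRange> regions = new ArrayList<>();
        // Assumption: regions are separated by ';' and a trailing ';' may be present.
        for (String part : rangeField.split(";")) {
            String trimmed = part.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] n = trimmed.split(",");
            if (n.length != 4) {
                throw new IllegalArgumentException("Expected x,y,width,height but got: " + trimmed);
            }
            regions.add(new SubjectRange(
                    Integer.parseInt(n[0].trim()),
                    Integer.parseInt(n[1].trim()),
                    Integer.parseInt(n[2].trim()),
                    Integer.parseInt(n[3].trim())));
        }
        return regions;
    }
}
```

Because the regions are listed by descending score, the first element of the parsed list corresponds to the subject that the query used by default.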
Text-to-video retrieval
Request body:
{
  "tableName": "video",
  "content": "hello",
  "modal": "video",
  "topK": 3,
  "videoFrameTopK": 100,
  "contentType": "text",
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}"
}
Response:
{
  "result": [
    {
      "videoId": 1,
      "videoUri": "oss://...",
      "fields": { "tag": "demo" },
      "clips": [
        {
          "queryStartTime": 5,
          "startTime": 5,
          "duration": 5,
          "queryStartFrameIndex": 150,
          "queryEndFrameIndex": 300,
          "startFrameIndex": 150,
          "endFrameIndex": 300,
          "sim": 0.8
        }
      ]
    }
  ],
  "totalCount": 1,
  "totalTime": 2.943
}
Video-to-video retrieval
Supported video formats include MP4, AVI, MKV, MOV, FLV, and WebM.
Request body:
Using an OSS URI:
{
  "tableName": "video",
  "content": "oss://...",
  "modal": "video",
  "topK": 3,
  "videoFrameTopK": 100,
  "contentType": "video_uri",
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}"
}
Using Base64-encoded video data:
{
  "tableName": "video",
  "content": "data:video/mp4;base64,AAAAIGZ0eXBtcDQyAAABAGlxxxxxxx",
  "modal": "video",
  "topK": 3,
  "videoFrameTopK": 100,
  "contentType": "video_encode",
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}"
}
The format is data:video/{format};base64,{base64_video}, where:
  video/{format}: The format of the video. For example, use video/mp4 for an MP4 file.
  base64_video: The Base64-encoded video data.
Response:
{
  "result": [
    {
      "videoId": 1,
      "videoUri": "oss://...",
      "fields": { "tag": "demo" },
      "clips": [
        {
          "queryStartTime": 5,
          "startTime": 5,
          "duration": 5,
          "queryStartFrameIndex": 150,
          "queryEndFrameIndex": 300,
          "startFrameIndex": 150,
          "endFrameIndex": 300,
          "sim": 0.8
        }
      ]
    }
  ],
  "totalCount": 1,
  "totalTime": 2.943
}
Image-to-video retrieval
Supported image formats include PNG, JPEG, and JPG.
Request body:
{
  "tableName": "video",
  "content": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEAxxxxxx",
  "modal": "video",
  "topK": 3,
  "videoFrameTopK": 100,
  "contentType": "image_encode",
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}"
}
The image is provided as Base64 data. Pass the encoded data to the content parameter in the format data:image/{format};base64,{base64_image}, where:
  image/{format}: The format of the image. For example, use image/jpeg for a JPG file.
  base64_image: The Base64-encoded image data.
Response:
{
  "result": [
    {
      "videoId": 1,
      "videoUri": "oss://...",
      "fields": { "tag": "demo" },
      "clips": [
        {
          "queryStartTime": 5,
          "startTime": 5,
          "duration": 5,
          "queryStartFrameIndex": 150,
          "queryEndFrameIndex": 300,
          "startFrameIndex": 150,
          "endFrameIndex": 300,
          "sim": 0.8
        }
      ]
    }
  ],
  "totalCount": 3,
  "totalTime": 2.943
}
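The video_encode and image_encode content values in the video examples above share the same data:{type};base64,{data} shape. The following is a minimal sketch of producing such a value with only the JDK; the class and method names are hypothetical, and the caller is responsible for passing a media type (for example video/mp4 or image/jpeg) that matches the file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper class; the names are illustrative and not part of the product API.
final class DataUriContent {
    /** Wraps raw bytes as "data:{mediaType};base64,{encoded}". */
    static String dataUri(String mediaType, byte[] data) {
        return "data:" + mediaType + ";base64," + Base64.getEncoder().encodeToString(data);
    }

    /** Reads a local file fully into memory and wraps it as a data URI. */
    static String fromFile(Path file, String mediaType) throws IOException {
        return dataUri(mediaType, Files.readAllBytes(file));
    }
}
```

For example, DataUriContent.fromFile(Path.of("query.jpg"), "image/jpeg") yields a string that can be passed as content together with a contentType of image_encode. The file name here is only a placeholder.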
Fusion vector retrieval
Fusion vector retrieval encodes content from multiple modalities, such as text and images, into a single fusion vector for cross-modal retrieval. Before using this feature, you must configure the fusion vector fields in your table settings. For more information, see Configure fusion vectors.
Fusion vector retrieval differs from other prediction queries in the following ways:
The modal parameter is set to fusion.
The contents parameter (a list) is used instead of the content parameter to pass content from multiple modalities.
The contentType parameter contains a comma-separated list of types corresponding to the items in the contents list.
Request body:
{
  "tableName": "gist",
  "indexName": "test",
  "contents": ["hello", "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEAxxxxxx"],
  "modal": "fusion",
  "contentType": "text,image_encode",
  "topK": 3,
  "searchParams": "{\"qc.searcher.scan_ratio\":0.01}",
  "includeVector": true
}
The first element in contents is the text "hello", and the second element is Base64-encoded image data. In contentType, text and image_encode correspond to the types of the two elements in contents, respectively.
Response:
{
  "result": [
    { "id": 1, "score": 1.0508723258972169, "vector": [0.1, 0.2, 0.3] },
    { "id": 2, "score": 1.0329746007919312, "vector": [0.2, 0.2, 0.3] },
    { "id": 3, "score": 0.980593204498291, "vector": [0.3, 0.2, 0.3] }
  ],
  "totalCount": 3,
  "totalTime": 2.943
}
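A fusion request pairs each entry in contents with a type in contentType. The following is a minimal sketch of assembling such a body with only the JDK (Java 9 or later for List.of in the usage note). The class and method names are hypothetical, and rejecting lists of different lengths is an assumption drawn from the one-to-one correspondence described above rather than documented server behavior.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class; the names are illustrative and not part of the product API.
final class FusionQueryBody {
    /** Escapes backslashes and double quotes so the value can sit inside a JSON string literal. */
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds a fusion request body; contents.get(i) is described by contentTypes.get(i). */
    static String build(String tableName, List<String> contents, List<String> contentTypes, int topK) {
        // Assumption: every content item needs exactly one matching type.
        if (contents.size() != contentTypes.size()) {
            throw new IllegalArgumentException("contents and contentTypes must have the same length");
        }
        String contentsJson = contents.stream()
                .map(c -> "\"" + escape(c) + "\"")
                .collect(Collectors.joining(", ", "[", "]"));
        // contentType is a single comma-separated string, not a JSON array.
        String contentType = String.join(",", contentTypes);
        return "{\"tableName\": \"" + escape(tableName) + "\""
                + ", \"contents\": " + contentsJson
                + ", \"modal\": \"fusion\""
                + ", \"contentType\": \"" + escape(contentType) + "\""
                + ", \"topK\": " + topK + "}";
    }
}
```

Calling FusionQueryBody.build("gist", List.of("hello", imageDataUri), List.of("text", "image_encode"), 3) produces a body with the same contents and contentType layout as the request shown above, where imageDataUri stands for a Base64 data URI string.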