How to perform a similarity search in a collection using the HTTP API

This topic describes how to perform a similarity search in a collection using the HTTP API.

Prerequisites

A cluster has been created. For more information, see Create a cluster.
An API key has been obtained. For more information, see API key management.

Method and URL

HTTP

POST https://{Endpoint}/v1/collections/{CollectionName}/query

Examples

Note

In the examples, replace YOUR_API_KEY and YOUR_CLUSTER_ENDPOINT with your API key and cluster endpoint.
The following examples require a collection named quickstart. For more information, see Create a collection - Examples.
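
To avoid pasting the API key and endpoint into every command, the two placeholders can be read from environment variables. The following is a minimal sketch, not part of DashVector: the variable names DASHVECTOR_API_KEY and DASHVECTOR_ENDPOINT, the helper name dv_query, and the DRY_RUN switch are all assumptions made for illustration.

```shell
# Hypothetical environment variable names; fall back to the placeholders.
: "${DASHVECTOR_API_KEY:=YOUR_API_KEY}"
: "${DASHVECTOR_ENDPOINT:=YOUR_CLUSTER_ENDPOINT}"

# dv_query COLLECTION JSON_BODY
# Hypothetical helper: sends JSON_BODY to the query endpoint of COLLECTION.
# When DRY_RUN is 1, it prints the request instead of sending it.
dv_query() {
  collection=$1
  body=$2
  url="https://${DASHVECTOR_ENDPOINT}/v1/collections/${collection}/query"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'POST %s\n%s\n' "$url" "$body"
    return 0
  fi
  curl -sS -XPOST \
    -H "dashvector-auth-token: ${DASHVECTOR_API_KEY}" \
    -H 'Content-Type: application/json' \
    -d "$body" "$url"
}
```

For example, after you set DRY_RUN=1, running dv_query quickstart '{"vector": [0.1, 0.2, 0.3, 0.4], "topk": 10}' prints the URL and body that would be sent.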

Perform a similarity search based on a vector

Bash

curl -XPOST \
  -H 'dashvector-auth-token: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "vector": [0.1, 0.2, 0.3, 0.4],
    "topk": 10,
    "include_vector": true
  }' https://YOUR_CLUSTER_ENDPOINT/v1/collections/quickstart/query

# example output:
# {
#   "code": 0,
#   "request_id": "2cd1cac7-f1ee-4d15-82a8-b65e75d8fd13",
#   "message": "Success",
#   "output": [
#     {
#       "id": "1",
#       "vector":[
#         0.10000000149011612,
#         0.20000000298023224,
#         0.30000001192092896,
#         0.4000000059604645
#       ],
#       "fields": {
#         "name": "zhangshan",
#         "weight": null,
#         "age": 20,
#         "anykey": "anyvalue"
#       },
#       "score": 0.3
#     }
#   ]
# }

Perform a similarity search based on a primary key

Bash

curl -XPOST \
  -H 'dashvector-auth-token: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "id": "1",
    "topk": 1,
    "include_vector": true
  }' https://YOUR_CLUSTER_ENDPOINT/v1/collections/quickstart/query

# example output:
# {
#   "code":0,
#   "request_id":"fab4e8a2-15e4-4b55-816f-3b66b7a44962",
#   "message":"Success",
#   "output":[
#     {
#       "id":"1",
#       "vector":[
#         0.10000000149011612,
#         0.20000000298023224,
#         0.30000001192092896,
#         0.4000000059604645
#       ],
#        "fields": {
#         "name": "zhangshan",
#         "weight": null,
#         "age": 20,
#         "anykey": "anyvalue"
#       },
#       "score": 0.3
#     }
#   ]
# }

Perform a similarity search with a filter condition

Bash

curl -XPOST \
  -H 'dashvector-auth-token: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "filter": "age > 18",
    "topk": 1,
    "include_vector": true
  }' https://YOUR_CLUSTER_ENDPOINT/v1/collections/quickstart/query
  
# example output:
# {
#   "code":0,
#   "request_id":"4c7331d8-fba1-4c3a-8673-124568670de7",
#   "message":"Success",
#   "output":[
#     {
#       "id":"1",
#       "vector":[
#         0.10000000149011612,
#         0.20000000298023224,
#         0.30000001192092896,
#         0.4000000059604645
#       ],
#        "fields": {
#         "name": "zhangshan",
#         "weight": null,
#         "age": 20,
#         "anykey": "anyvalue"
#       },
#       "score": 0.0
#     }
#   ]
# }

Perform a vector search with a sparse vector

Bash

curl -XPOST \
  -H 'dashvector-auth-token: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "vector": [0.1, 0.2, 0.3, 0.4],
    "sparse_vector":{"1":0.4, "10000":0.6, "222222":0.8},
    "topk": 1,
    "include_vector": true
  }' https://YOUR_CLUSTER_ENDPOINT/v1/collections/quickstart/query

# example output:
# {
#   "code":0,
#   "request_id":"ad84f7a0-b4b2-4023-ae80-b6f092609a53",
#   "message":"Success",
#   "output":[
#     {
#       "id":"2",
#       "vector":[
#         0.10000000149011612,
#         0.20000000298023224,
#         0.30000001192092896,
#         0.4000000059604645
#       ],
#       "fields":{"name":null,"weight":null,"age":null},
#       "score":1.46,
#       "sparse_vector":{
#         "10000":0.6,
#         "1":0.4,
#         "222222":0.8
#       }
#     }
#   ]
# }
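
The score of 1.46 in the preceding output can be reproduced by hand under two assumptions that this topic does not state: the quickstart collection uses the dot-product metric, and the dense and sparse inner products are added together. The following sketch checks that arithmetic for the document with ID 2, whose dense and sparse vectors equal the query. The helper name dot is illustrative only.

```shell
# dot "V1" "V2": inner product of two space-separated vectors of equal length.
# Illustrative helper; it only checks the arithmetic of the sample output.
dot() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    NR == 1 { n = split($0, a, " ") }
    NR == 2 { split($0, b, " "); for (i = 1; i <= n; i++) s += a[i] * b[i] }
    END { printf "%.2f\n", s }'
}

# Dense part: the query vector against the identical stored vector.
dense=$(dot "0.1 0.2 0.3 0.4" "0.1 0.2 0.3 0.4")
# Sparse part: the values at the shared indices 1, 10000, and 222222.
sparse=$(dot "0.4 0.6 0.8" "0.4 0.6 0.8")
total=$(awk -v d="$dense" -v s="$sparse" 'BEGIN { printf "%.2f\n", d + s }')
echo "dense=$dense sparse=$sparse total=$total"
```

The dense part alone evaluates to 0.30, which also matches the score of 0.3 in the first example in this topic.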

Advanced parameters for a single-vector search

Note

For more information, see Advanced parameters for vector search.
For a single-vector search, pass the advanced parameters in the top-level vector_param object, as the following example shows. Check each parameter name and its position carefully.

curl -XPOST \
  -H 'dashvector-auth-token: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "vector": [0.1, 0.2, 0.3, 0.4], 
    "vector_param":{ "radius": 0.53, "is_linear": false, "ef": 1000 },
    "topk": 10,
    "include_vector": true
}' https://YOUR_CLUSTER_ENDPOINT/v1/collections/quickstart_euclidean/query

# example output:
#{
#    "code": 0,
#    "request_id": "59df860b-7d29-466b-a345-0bfe9e27329e",
#    "message": "Success",
#    "output": [
#        {
#            "id": "2",
#            "vector": [
#                0.20000000298023224,
#                0.30000001192092896,
#                0.4000000059604645,
#                0.5
#            ],
#            "fields": {
#                "anykey1": "str-value",
#                "anykey2": 1,
#                "name": "zhangshan",
#                "weight": null,
#                "anykey3": true,
#                "anykey4": 3.1415925,
#                "age": 70
#            },
#            "score": 0.04
#        },
#        {
#            "id": "3",
#            "vector": [
#                0.30000001192092896,
#                0.4000000059604645,
#                0.5,
#                0.6000000238418579
#            ],
#            "fields": {
#                "name": null,
#                "weight": null,
#                "age": null
#            },
#            "score": 0.16000001
#        },
#        {
#            "id": "4",
#            "vector": [
#                0.4000000059604645,
#                0.5,
#                0.6000000238418579,
#                0.699999988079071
#            ],
#            "fields": {
#                "name": "zhangsan",
#                "weight": null,
#                "age": 20
#            },
#            "score": 0.36
#        }
#    ]
#}
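
The three scores in the preceding output equal the squared Euclidean distance between the query vector and each returned vector. This is an observation about the sample numbers, not a statement of how DashVector computes scores. The following sketch reproduces the values. The helper name sq_euclid is illustrative only.

```shell
# sq_euclid "V1" "V2": squared Euclidean distance between two
# space-separated vectors of equal length. Illustrative helper.
sq_euclid() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    NR == 1 { n = split($0, a, " ") }
    NR == 2 {
      split($0, b, " ")
      for (i = 1; i <= n; i++) { d = a[i] - b[i]; s += d * d }
    }
    END { printf "%.2f\n", s }'
}

query="0.1 0.2 0.3 0.4"
sq_euclid "$query" "0.2 0.3 0.4 0.5"   # document with ID 2
sq_euclid "$query" "0.3 0.4 0.5 0.6"   # document with ID 3
sq_euclid "$query" "0.4 0.5 0.6 0.7"   # document with ID 4
```

All three values (0.04, 0.16, and 0.36) are below the radius of 0.53 in the request, which is consistent with radius acting as an upper bound on the distance. For the authoritative definition, see Advanced parameters for vector search.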

Advanced parameters for a multi-vector search

Note

For more information, see Multi-vector search.
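For a multi-vector search, pass the advanced parameters for each vector in the param object nested under that vector's entry in vectors, as the following examples show. Check each parameter name and its position carefully.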

RrfRanker example

curl -XPOST \
 -H 'dashvector-auth-token: YOUR_API_KEY' \
 -H 'Content-Type: application/json' \
 -d '{
   "vectors": {"title": {"vector": [0.1, 0.2, 0.3, 0.4]}, "content": {"vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6], "param": {"num_candidates": 10}}},
   "sparse_vectors": {"abstruct": {"sparse_vector": {"0": 0.1, "1": 0.32, "2": 0.482}}, "keywords": {"sparse_vector": {"12": 0.5, "15": 0.6, "19": 0.7}}},
   "topk": 20,
   "rerank": {"ranker_name": "weighted", "ranker_params": {"weights": "{\"title\":0.2, \"content\":0.3, \"abstruct\": 0.4, \"keywords\": 0.1}" }}
}' https://YOUR_CLUSTER_ENDPOINT/v1/collections/multi_vector_demo/query

# example output:
# {
#   "code": 0,
#   "request_id": "a931b032-60a5-4b8d-8d37-26ccaa19b649",
#   "message": "Success",
#   "output": [
#     {
#       "id": "4",
#       "fields": {
#         "author": null
#       },
#       "score": 0.019609727
#     },
#     {
#       "id": "1",
#       "fields": {
#         "author": null
#       },
#       "score": 0.019607844
#     },
#     {
#       "id": "2",
#       "fields": {
#         "author": "zhangsan"
#       },
#       "score": 0.00990099
#     },
#     {
#       "id": "3",
#       "fields": {
#         "author": null,
#         "anykey": "anyvalue"
#       },
#       "score": 0.009708738
#     }
#   ]
# }

WeightedRanker example

curl -XPOST \
  -H 'dashvector-auth-token: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "vectors": {"title": {"vector": [0.1, 0.2, 0.3, 0.4]}, "content": {"vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6], "param": {"num_candidates": 10}}},
    "sparse_vectors": {"abstruct": {"sparse_vector": {"0": 0.1, "1": 0.32, "2": 0.482}}, "keywords": {"sparse_vector": {"12": 0.5, "15": 0.6, "19": 0.7}}},
    "topk": 20,
    "rerank": {"ranker_name": "weighted", "ranker_params": {"weights": "{\"title\":0.2, \"content\":0.8}" }}
}' https://YOUR_CLUSTER_ENDPOINT/v1/collections/multi_vector_demo/query

# example output:
# {
#     "code": 0,
#     "request_id": "f8261b81-fef6-42f8-bba2-70e05bda2301",
#     "message": "Success",
#     "output": [
#         {
#             "id": "1",
#             "fields": {
#                 "author": null
#             },
#             "score": 0.8156271
#         },
#         {
#             "id": "4",
#             "fields": {
#                 "author": null
#             },
#             "score": 0.8156271
#         },
#         {
#             "id": "3",
#             "fields": {
#                 "author": null,
#                 "anykey": "anyvalue"
#             },
#             "score": 0.5880098
#         },
#         {
#             "id": "2",
#             "fields": {
#                 "author": "zhangsan"
#             },
#             "score": 0.2
#         }
#     ]
# }

Perform a search using one vector from a multi-vector field

curl -XPOST \
  -H 'dashvector-auth-token: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "vectors": {"title": {"vector": [0.1, 0.2, 0.3, 0.4], "param":{ "radius": 0.1, "is_linear": true, "ef": 1000 }}},
    "topk": 20,
    "include_vector": true
}' https://YOUR_CLUSTER_ENDPOINT/v1/collections/multi_vector_demo/query

# example output:
#{
#    "code": 0,
#    "request_id": "b9bc3d5a-8edf-4d5b-916d-0ced6ae570cb",
#    "message": "Success",
#    "output": [
#        {
#            "id": "4",
#            "vectors": {
#                "title": [
#                    0.10000000149011612,
#                    0.20000000298023224,
#                    0.30000001192092896,
#                    0.4000000059604645
#                ]
#            },
#            "fields": {
#                "author": "zhangsan"
#            },
#            "score": 0.0
#        },
#        {
#            "id": "2",
#            "vectors": {
#                "title": [
#                    0.10000000149011612,
#                    0.20000000298023224,
#                    0.30000001192092896,
#                    0.4000000059604645
#                ]
#            },
#            "fields": {
#                "author": "zhangsan"
#            },
#            "score": 0.0
#        }
#    ]
#}

Request parameters

Note

Specify either the vector parameter or the id parameter, but not both. If you specify neither, the operation performs filtering only.
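
The rule above can be enforced on the client before a request is sent. The following is a minimal sketch under stated assumptions: build_query_body is a hypothetical helper, it does no JSON escaping, and it uses 10 as the default topk to match the default listed in the following table.

```shell
# build_query_body VECTOR_JSON ID FILTER [TOPK]
# Hypothetical helper. Pass "" to omit the vector, ID, or filter.
# No JSON escaping is done, so ID and FILTER must not contain double quotes.
build_query_body() {
  vector=$1; id=$2; filter=$3; topk=${4:-10}
  if [ -n "$vector" ] && [ -n "$id" ]; then
    echo "specify either vector or id, not both" >&2
    return 1
  fi
  body="\"topk\": $topk"
  if [ -n "$vector" ]; then body="\"vector\": $vector, $body"; fi
  if [ -n "$id" ]; then body="\"id\": \"$id\", $body"; fi
  if [ -n "$filter" ]; then body="$body, \"filter\": \"$filter\""; fi
  printf '{%s}\n' "$body"
}

build_query_body "[0.1, 0.2, 0.3, 0.4]" "" "" 10   # search by vector
build_query_body "" "1" "" 1                       # search by primary key
build_query_body "" "" "age > 18" 1                # filtering only
```

The printed bodies can be passed to curl with the -d option, as in the examples in this topic.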

Parameter	Location	Type	Required	Description
{Endpoint}	path	str	Yes	The endpoint of the cluster. You can find the endpoint on the Cluster Details page in the console.
{CollectionName}	path	str	Yes	The name of the collection.
dashvector-auth-token	header	str	Yes	The API key.
vector	body	array	No	The vector data.
sparse_vector	body	dict	No	The sparse vector.
id	body	str	No	The primary key. Use this parameter to perform a similarity search based on the vector that corresponds to the primary key.
topk	body	int	No	The number of top similar results to return. Default: 10.
filter	body	str	No	The filter condition. The condition must conform to the SQL WHERE clause specification. For more information, see Filtered search.
include_vector	body	bool	No	Specifies whether to return vector data. Default: false.
output_fields	body	array	No	A list of field names to return. By default, all fields are returned.
partition	body	str	No	The name of the partition.
vectors	body	dict	No	The parameters for a multi-vector search. The type is `Map<String, VectorQuery>`. For more information, see Multi-vector search.
sparse_vectors	body	dict	No	The parameters for a multi-sparse-vector search. The type is `Map<String, VectorQuery>`. For more information, see Multi-vector search.
rerank	body	dict	No	The parameters for hybrid sorting. For more information, see Multi-vector search.
vector_param	body	dict	No	The advanced search parameters. For more information, see Advanced parameters for vector search.
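
None of the examples in this topic use output_fields. The following sketch builds a filter-only request body that returns only the named fields. The helper name fields_body is hypothetical, and the field names name and age are taken from the example outputs in this topic.

```shell
# fields_body FILTER FIELD...
# Hypothetical helper: builds a request body that limits the returned fields.
# No JSON escaping is done, so arguments must not contain double quotes.
fields_body() {
  filter=$1; shift
  list=""
  for f in "$@"; do
    list="${list:+$list, }\"$f\""
  done
  printf '{"filter": "%s", "topk": 10, "include_vector": false, "output_fields": [%s]}\n' \
    "$filter" "$list"
}

fields_body "age > 18" name age
```

The printed body can be passed to curl with the -d option in the same way as the other request bodies in this topic.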

Response parameters

Field	Type	Description	Example
code	int	The status code. For more information, see Status codes.	0
message	str	The response message.	Success
request_id	str	The unique ID of the request.	19215409-ea66-4db9-8764-26ce2eb5bb99
output	array	The similarity search results. A list of Doc objects.
usage	map	For a successful document query on a collection in a serverless (pay-as-you-go) instance, this parameter returns the number of read units that are consumed.	`{ Usage: { read_units: 8 } }`
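
A script that calls this API typically checks the code field before it reads output. The following is a minimal sketch that treats 0 as success, as in the example outputs in this topic. The helper name check_response is hypothetical, and the sed expression is plain text matching rather than a JSON parser, so it assumes that no document field is itself named code.

```shell
# check_response: reads a response body on stdin, extracts the "code" value,
# prints the body when the code is 0, and fails otherwise.
# Hypothetical helper; a JSON tool such as jq is more robust for real use.
check_response() {
  response=$(cat)
  code=$(printf '%s\n' "$response" |
    sed -n 's/.*"code"[[:space:]]*:[[:space:]]*\(-\{0,1\}[0-9][0-9]*\).*/\1/p' |
    head -n 1)
  if [ "$code" = "0" ]; then
    printf '%s\n' "$response"
  else
    echo "query failed: code=${code:-unknown}" >&2
    return 1
  fi
}
```

For example, pipe the output of any curl command in this topic into check_response to stop a script when the query does not succeed.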