Benchmark vector index


Prerequisites

Find the ANN_GIST1M dataset and download gist.tar.gz (approximately 2.6 GB).

After decompressing the file, use the gist_base.fvecs file in the directory.

  • Python 3 and the following libraries (json is part of the Python standard library; sklearn is installed through the scikit-learn package):

h5py
json
numpy
sklearn
alibabacloud_ha3engine_vector
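If you start from gist_base.fvecs rather than an hdf5 file, it can help to inspect a few vectors before formatting them. The following is a minimal sketch, not part of the provided scripts. It assumes the standard fvecs layout: each record is a little-endian 4-byte integer dimension followed by that many 4-byte floats. The function name iter_fvecs is hypothetical.

```python
import struct


def iter_fvecs(path):
    """Yield each vector in an .fvecs file as a list of floats.

    Assumed layout per record: int32 dimension, then `dim` float32 values,
    all little-endian. Vectors are streamed one at a time so the whole
    file never has to fit in memory.
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                return  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated dimension header")
            (dim,) = struct.unpack("<i", header)
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated vector payload")
            yield list(struct.unpack(f"<{dim}f", payload))
```

For example, `next(iter_fvecs("gist_base.fvecs"))` returns the first vector, whose length should be 960 for this dataset.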

Generate data

  • Use the prepare_data.py script to format the vector data. The script supports the hdf5, fvecs, bvecs, and ivecs formats. This topic uses the hdf5 format as an example.

python3 prepare_data.py -i ./gist-960-euclidean.hdf5 
  • The script creates a data/ subdirectory in the current directory.

    • The output file is gist-960-euclidean.hdf5.data.

    • Verify that the number of rows in the generated data is correct.

wc -l data/gist-960-euclidean.hdf5.data
1000000 data/gist-960-euclidean.hdf5.data
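The same check can be scripted so that it fails loudly before you push incomplete data. This is a minimal sketch that assumes the generated file holds one record per line, as the wc -l check above implies. The function names are hypothetical.

```python
def count_rows(path, chunk_size=1 << 20):
    """Count newline characters in a file, which is what `wc -l` reports."""
    rows = 0
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large files do not exhaust memory.
        while chunk := f.read(chunk_size):
            rows += chunk.count(b"\n")
    return rows


def check_row_count(path, expected):
    """Raise ValueError if the file does not contain `expected` rows."""
    actual = count_rows(path)
    if actual != expected:
        raise ValueError(f"{path}: expected {expected} rows, found {actual}")
    return actual
```

For the GIST1M example, the call would be `check_row_count("data/gist-960-euclidean.hdf5.data", 1_000_000)`.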

Purchase an OpenSearch Vector Search Edition instance

For more information about how to purchase an instance, see Purchase an OpenSearch Vector Search Edition instance.

Create a table

For more information, see the table creation topic that matches your data source.

Push data

  • Use the push_data.py script to push data to your instance.

  • Parameters

    • -i: The path to the file that you want to push.

    • -t: The table name.

    • -u: Your username.

    • -p: Your password.

    • -e: The instance ID.

python3 push_data.py -i data/gist-960-euclidean.hdf5.data  -t gist -u ${user_name} -p ${password} -e ${instance_id}

Generate queries

  • Use the prepare_query.py script to randomly generate queries from the original dataset.

python3 prepare_query.py -i gist-960-euclidean.hdf5 -c 10000 -t gist
  • The script creates a query.data file in the data/ directory.
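The internals of prepare_query.py are not shown here. As an illustration of what "randomly generate queries from the original dataset" can mean, the following sketch picks distinct row indices from the dataset. The function name and the seed parameter are assumptions, and the actual script may select and format queries differently.

```python
import random


def sample_query_indices(total_rows, count, seed=None):
    """Return `count` distinct row indices in [0, total_rows), sorted.

    Passing a fixed seed makes the selection repeatable, which helps when
    comparing benchmark runs against the same query set.
    """
    if count > total_rows:
        raise ValueError("cannot sample more queries than there are rows")
    rng = random.Random(seed)
    return sorted(rng.sample(range(total_rows), count))
```

With the values used above, `sample_query_indices(1_000_000, 10_000, seed=42)` selects 10,000 of the 1,000,000 base vectors.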

Stress test with wrk

git clone https://github.com/wg/wrk.git
  • Use the search.lua script for testing.

    • Copy the script to the scripts directory.

cp search.lua wrk/scripts/
  • Calculate your request signature and update the headers["Authorization"] value in the script's request function.

-- For each request, wrk takes the next query from query_table in order and wraps around at query_count.
request = function ()
  local query = query_table[count]
  count = (count + 1)%query_count
  local headers = {}
  headers["Authorization"] = "Basic xxxx" -- Signature information
  headers["Content-Type"] = "application/json"
  return wrk.format("POST", nil, headers, query)
end
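The placeholder "Basic xxxx" follows the HTTP Basic authentication scheme, in which the value after "Basic " is the Base64 encoding of username:password. The following sketch computes that value. It assumes that the instance accepts the same username and password that push_data.py uses; the function name is hypothetical.

```python
import base64


def basic_auth_header(user_name, password):
    """Build an HTTP Basic Authorization header value.

    The scheme joins the credentials with a colon and Base64-encodes
    the resulting bytes.
    """
    credentials = f"{user_name}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"
```

Paste the returned string into the headers["Authorization"] assignment in search.lua.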
  • Start the stress test.

    • -c: The number of concurrent connections to maintain.

    • -t: The number of threads to use.

    • -d: The duration of the test.

    • -s: The path to the Lua script.

    • --latency: Prints detailed latency statistics.

./wrk -c24 -d100s -t8 -s scripts/search.lua http://ha-cn-xxxxxx.ha.aliyuncs.com/vector-service/query --latency

View metrics

Script downloads

Performance benchmark data

The following table shows benchmark results for OpenSearch Vector Search Edition with the ANN_GIST1M 960-dimensional dataset on an 8th-generation instance.

Common settings

  • Product: OpenSearch, tested with 1,000 queries from the GIST dataset. The ef parameter was tuned using the provided ground truth.

  • Test set: ANN_GIST1M 960-dimensional (http://corpus-texmex.irisa.fr/)

  • Instance type: 16-core 64 GB ecs.g8i.4xlarge (8th-generation)

  • Test tool: wrk (https://github.com/wg/wrk)

  • Index parameters: m: 100, ef_construction: 500

HNSW

  • Product version: OpenSearch Vector Search Edition 2024.11, vector_service_1.4.0_test_202411081507

  • wrk parameters: threads: 10, concurrent connections: 30

  • top10 recall@95 (ef=60): QPS 4486.06, latency (avg) 6.7 ms, CPU utilization 95.8%

  • top10 recall@99 (ef=170): QPS 2868.77, latency (avg) 10.44 ms, CPU utilization 94%

  • top10 recall@99.5 (ef=300; threads: 10, concurrent connections: 20): QPS 2050.66, latency (avg) 9.73 ms, CPU utilization 95%

QGRAPH (HNSW + quantization)

  • Product version: OpenSearch Vector Search Edition 2024.11, vector_service_1.3.0_202410081048

  • wrk parameters: threads: 10, concurrent connections: 40

  • top10 recall@94 (ef=40): QPS 4957.48, latency (avg) 7.6 ms, CPU utilization 93%

  • top10 recall@95 (ef=80): QPS 3700.74, latency (avg) 8.19 ms, CPU utilization 92%

  • top10 recall@99: A recall of 99 is not achieved; the maximum observed recall is 95.8.
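The recall figures above compare the top 10 results returned by the index with the ground-truth nearest neighbors. This topic does not state the exact formula that was used, so the following sketch shows the common definition of recall@k, averaged over all queries and expressed as a percentage. The function name and input layout (one list of IDs per query) are assumptions.

```python
def recall_at_k(retrieved, ground_truth, k=10):
    """Mean recall@k over all queries, as a percentage.

    retrieved[i] and ground_truth[i] are ID lists for query i, ordered
    from nearest to farthest. Only the first k entries of each are used.
    """
    if len(retrieved) != len(ground_truth):
        raise ValueError("retrieved and ground_truth must have equal length")
    if not retrieved:
        raise ValueError("at least one query is required")
    total = 0.0
    for got, truth in zip(retrieved, ground_truth):
        truth_k = set(truth[:k])
        if not truth_k:
            raise ValueError("ground truth for a query is empty")
        # Fraction of the true top-k neighbors that the index returned.
        total += len(truth_k.intersection(got[:k])) / len(truth_k)
    return 100.0 * total / len(retrieved)
```

Raising ef and recomputing this value against the ground truth is one way to find the smallest ef that reaches a target recall.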