Stress testing on vector indexes (OpenSearch)

Prerequisites

We recommend that you run the test in a Virtual Private Cloud (VPC) environment.
- For more information, see Virtual Private Cloud (VPC).
- To test over the public internet, you must add your IP address to the public access whitelist. For details, see Configure a public access whitelist.
Download the GIST dataset in one of the following formats:
- hdf5: Go to https://github.com/erikbern/ann-benchmarks and use the scripts in the repository to download the gist-960-euclidean.hdf5 dataset file.
- fvecs: Go to http://corpus-texmex.irisa.fr/, find the ANN_GIST1M dataset, and download gist.tar.gz (approximately 2.6 GB). After decompressing the file, use the gist_base.fvecs file in the directory.
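For reference, .fvecs files from the TEXMEX corpus store each vector as a 4-byte little-endian integer (the dimension d) followed by d 4-byte floats. The following minimal sketch reads the first few vectors so that you can confirm the dimension (960 for GIST) before you format the data. The function name read_fvecs is illustrative and is not part of the provided scripts.

```python
import struct


def read_fvecs(path, limit=None):
    """Read vectors from an .fvecs file.

    Each record is an int32 dimension d followed by d float32 values,
    all little-endian. `limit` caps the number of vectors returned.
    """
    vectors = []
    with open(path, "rb") as f:
        while limit is None or len(vectors) < limit:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated dimension header")
            (dim,) = struct.unpack("<i", header)
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated vector payload")
            vectors.append(list(struct.unpack(f"<{dim}f", payload)))
    return vectors


# Example (assumes gist_base.fvecs is in the current directory):
# first = read_fvecs("gist_base.fvecs", limit=1)
# print(len(first[0]))
```

Reading with a limit avoids loading the whole file into memory when you only want to inspect it.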

Install Python 3 and the following libraries:
- h5py
- json (included in the Python standard library)
- numpy
- sklearn (installed as the scikit-learn package)
- alibabacloud_ha3engine_vector
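Before you run the scripts, you may want to confirm that these modules can be imported. The following minimal sketch lists the modules that Python cannot find. The helper name missing_modules is illustrative.

```python
import importlib.util

# Module names as listed in the prerequisites above.
REQUIRED = ["h5py", "json", "numpy", "sklearn", "alibabacloud_ha3engine_vector"]


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example:
# print(missing_modules())  # an empty list means everything is installed
```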

Generate data

Use the prepare_data.py script to format the vector data. The script supports the hdf5, fvecs, bvecs, and ivecs formats. This topic uses the hdf5 format as an example.

python3 prepare_data.py -i ./gist-960-euclidean.hdf5

The script creates a data/ subdirectory in the current directory.
- The output file is gist-960-euclidean.hdf5.data.
- Verify that the number of rows in the generated data is correct.

wc -l data/gist-960-euclidean.hdf5.data
1000000 data/gist-960-euclidean.hdf5.data

Purchase an OpenSearch Vector Search Edition instance

For more information about how to purchase an instance, see Purchase an OpenSearch Vector Search Edition instance.

Create a table

For more information, see the table creation topic that matches your data source.

Push data

Use the push_data.py script to push data to your instance.
Parameters:
- -i: The path to the data file to push.
- -t: The table name.
- -u: Your username.
- -p: Your password.
- -e: The instance ID.

python3 push_data.py -i data/gist-960-euclidean.hdf5.data -t gist -u ${user_name} -p ${password} -e ${instance_id}
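If you run the push step from automation, a small wrapper can read the credentials from environment variables so that they are not hard-coded in scripts. This is a minimal sketch that uses only the flags documented above. The environment variable names (OPENSEARCH_USER, OPENSEARCH_PASSWORD, OPENSEARCH_INSTANCE_ID) and the function names are hypothetical.

```python
import os
import subprocess


def build_push_command(data_file, table, user, password, instance_id):
    """Build the argument list for push_data.py from the documented flags."""
    return [
        "python3", "push_data.py",
        "-i", data_file,
        "-t", table,
        "-u", user,
        "-p", password,
        "-e", instance_id,
    ]


def push_from_env(data_file, table):
    """Run push_data.py with credentials taken from (hypothetical) env variables."""
    cmd = build_push_command(
        data_file,
        table,
        os.environ["OPENSEARCH_USER"],
        os.environ["OPENSEARCH_PASSWORD"],
        os.environ["OPENSEARCH_INSTANCE_ID"],
    )
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


# Example:
# push_from_env("data/gist-960-euclidean.hdf5.data", "gist")
```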

Generate queries

Use the prepare_query.py script to randomly generate queries from the original dataset.

python3 prepare_query.py -i gist-960-euclidean.hdf5 -c 10000 -t gist

The script creates a query.data file in the data/ directory.
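The sampling step can be pictured as follows. This is a minimal sketch, not the actual prepare_query.py: it picks vectors at random without replacement and serializes each one as a JSON string. The field names tableName, vector, and topK are assumptions about the query body, so check the real contents of data/query.data before relying on them.

```python
import json
import random


def sample_queries(vectors, count, table, top_k=10, seed=None):
    """Pick `count` vectors at random and render each as a JSON query string.

    The JSON field names are assumptions for illustration only.
    """
    rng = random.Random(seed)
    chosen = rng.sample(vectors, min(count, len(vectors)))
    return [
        json.dumps({"tableName": table, "vector": list(vec), "topK": top_k})
        for vec in chosen
    ]


# Example with toy data:
# queries = sample_queries([[0.1, 0.2], [0.3, 0.4]], count=1, table="gist", seed=0)
```

Passing a fixed seed makes the query set reproducible across runs, which helps when you compare results between index configurations.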

Stress test with wrk

wrk is an open-source HTTP benchmarking tool (https://github.com/wg/wrk).
Clone the repository from GitHub, then build it by running make in the wrk directory.

git clone https://github.com/wg/wrk.git

Use the search.lua script for testing.
- Copy the script to the scripts directory.

cp search.lua wrk/scripts/

Calculate your request signature and update the headers["Authorization"] value in the script's request function.

-- For each request, wrk takes the next query in sequence (wrapping around
-- at query_count) and constructs a POST request from it.
request = function ()
  local query = query_table[count]
  count = (count + 1) % query_count
  local headers = {}
  headers["Authorization"] = "Basic xxxx" -- Signature information
  headers["Content-Type"] = "application/json"
  return wrk.format("POST", nil, headers, query)
end
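The "Basic xxxx" placeholder follows the standard HTTP Basic scheme, in which the value is the Base64 encoding of username:password. The following minimal sketch computes such a value. It assumes that your instance accepts the same username and password that you pass to push_data.py, so confirm this against your instance's API documentation. The function name is illustrative.

```python
import base64


def basic_auth_header(username, password):
    """Return an HTTP Basic Authorization header value for the given credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


# Example:
# print(basic_auth_header("user", "pass"))  # Basic dXNlcjpwYXNz
```

Paste the resulting string into headers["Authorization"] in search.lua.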

Start the stress test from the wrk directory. The command uses the following options:
- -c: The number of concurrent connections to maintain.
- -t: The number of threads to use.
- -d: The duration of the test.
- -s: The path to the Lua script.
- --latency: Prints detailed latency statistics.

./wrk -c24 -d100s -t8 -s scripts/search.lua http://ha-cn-xxxxxx.ha.aliyuncs.com/vector-service/query --latency
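To compare runs, it can help to extract the throughput and average latency from the wrk report. The following minimal sketch parses the "Requests/sec:" line and the "Latency" row of the "Thread Stats" section from wrk's standard text output. The function name and the returned keys are illustrative.

```python
import re

# wrk prints durations with one of these unit suffixes.
_UNIT_TO_MS = {"us": 0.001, "ms": 1.0, "s": 1000.0, "m": 60000.0, "h": 3600000.0}


def parse_wrk_output(text):
    """Extract QPS and average latency (in ms) from wrk's text report."""
    qps_match = re.search(r"Requests/sec:\s*([\d.]+)", text)
    # The first numeric column of the "Latency" row is the average.
    lat_match = re.search(
        r"^\s*Latency\s+([\d.]+)(us|ms|s|m|h)\b", text, re.MULTILINE
    )
    if not qps_match or not lat_match:
        raise ValueError("unrecognized wrk output")
    return {
        "qps": float(qps_match.group(1)),
        "latency_avg_ms": float(lat_match.group(1)) * _UNIT_TO_MS[lat_match.group(2)],
    }


# Example (assumes the wrk output was saved to wrk.log):
# with open("wrk.log") as f:
#     print(parse_wrk_output(f.read()))
```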

View metrics

View metrics such as recall and latency.

For more information, see Authorize RAM users to view instance monitoring metrics.
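If you want to check recall yourself, top-k recall is commonly defined as the fraction of the true k nearest neighbors that appear in the k returned results. The following is a minimal sketch under that definition. How the monitoring metrics compute recall is not described here, and the function names are illustrative.

```python
def recall_at_k(returned_ids, true_ids, k=10):
    """Fraction of the true top-k neighbors present in the returned top-k."""
    top_returned = set(returned_ids[:k])
    top_true = set(true_ids[:k])
    return len(top_returned & top_true) / k


def mean_recall(all_returned, all_true, k=10):
    """Average recall_at_k over a batch of queries."""
    scores = [recall_at_k(r, t, k) for r, t in zip(all_returned, all_true)]
    return sum(scores) / len(scores)


# Example with toy IDs:
# print(recall_at_k([1, 2, 3, 4], [1, 2, 5, 6], k=4))  # 0.5
```

The ground-truth neighbor IDs would come from the dataset's ground-truth data or from an exact brute-force search.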

Script downloads

Performance benchmark data

The following benchmark results were measured for OpenSearch Vector Search Edition with the ANN_GIST1M 960-dimensional dataset on an 8th-generation instance.

Common test setup:
- Product: OpenSearch Vector Search Edition 2024.11
- Test set: ANN_GIST1M, 960 dimensions (http://corpus-texmex.irisa.fr/). Tested with 1,000 queries from the GIST dataset. The ef parameter was tuned using the provided ground truth.
- Instance type: ecs.g8i.4xlarge (8th-generation), 16 cores, 64 GB
- Test tool: wrk (https://github.com/wg/wrk)
- Index parameters: m: 100, ef_construction: 500

Vector algorithm	Product version	top10 recall	ef	wrk threads	wrk concurrent connections	QPS	Latency (avg)	CPU utilization
HNSW	vector_service_1.4.0_test_202411081507	95	60	10	30	4486.06	6.7 ms	95.8%
HNSW	vector_service_1.4.0_test_202411081507	99	170	10	30	2868.77	10.44 ms	94%
HNSW	vector_service_1.4.0_test_202411081507	99.5	300	10	20	2050.66	9.73 ms	95%
QGRAPH (HNSW + quantization)	vector_service_1.3.0_202410081048	94	40	10	40	4957.48	7.6 ms	93%
QGRAPH (HNSW + quantization)	vector_service_1.3.0_202410081048	95	80	10	40	3700.74	8.19 ms	92%

With QGRAPH, a recall of 99 is not achieved. The maximum observed recall is 95.8.
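As a sanity check on figures like these, note that wrk keeps a fixed number of connections busy, so by Little's law the average number of requests in flight is roughly QPS multiplied by average latency. The following minimal sketch applies this to the HNSW result at ef=60 (4486.06 QPS, 6.7 ms average latency, 30 concurrent connections). The function name is illustrative.

```python
def implied_concurrency(qps, latency_avg_ms):
    """Little's law: average requests in flight = throughput * average latency."""
    return qps * latency_avg_ms / 1000.0


# HNSW, ef=60: the result is close to the 30 configured connections.
print(round(implied_concurrency(4486.06, 6.7), 1))  # 30.1
```

When the implied concurrency differs noticeably from the configured number of connections, the run is worth a closer look before you compare it with other results.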