Prerequisites
- We recommend that you run the test in a Virtual Private Cloud (VPC) environment. For more information, see Virtual Private Cloud (VPC).
- To test over the public internet, you must add your IP address to the public access whitelist. For details, see Configure a public access whitelist.
- Download the gist-960-euclidean dataset in one of the following ways:
  - Go to https://github.com/erikbern/ann-benchmarks and use the scripts in the repository to download the gist-960-euclidean.hdf5 dataset file.
  - Find the ANN_GIST1M dataset (http://corpus-texmex.irisa.fr/) and download gist.tar.gz (approximately 2.6 GB). After decompressing the file, use the gist_base.fvecs file in the directory.
- Python 3 and the following required libraries:
  - h5py
  - json (part of the Python standard library)
  - numpy
  - sklearn (installed as the scikit-learn package)
  - alibabacloud_ha3engine_vector
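Before running the scripts, it can help to confirm that the libraries listed above are importable. The following is a minimal sketch, not part of the provided scripts; the helper name `find_missing_modules` is hypothetical, and the module names are taken from the list above.

```python
import importlib.util

# Import names taken from the prerequisites list above.
REQUIRED_MODULES = ["h5py", "json", "numpy", "sklearn", "alibabacloud_ha3engine_vector"]


def find_missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Run the check with the same Python 3 interpreter that you plan to use for the data preparation scripts.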
Generate data
- Use the prepare_data.py script to format the vector data. The script supports the hdf5, fvecs, bvecs, and ivecs formats. This topic uses the hdf5 format as an example.
python3 prepare_data.py -i ./gist-960-euclidean.hdf5
- The script creates a data/ subdirectory in the current directory. The output file is gist-960-euclidean.hdf5.data.
- Verify that the number of rows in the generated data is correct.
wc -l data/gist-960-euclidean.hdf5.data
1000000 data/gist-960-euclidean.hdf5.data
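If you use the fvecs file instead of the hdf5 file, you may want to inspect a few vectors before formatting them. The sketch below assumes the commonly documented fvecs layout, in which each record is a little-endian 32-bit integer dimension followed by that many 32-bit floats. It is an illustration only and does not reproduce the logic of prepare_data.py; the function name `read_fvecs` is hypothetical.

```python
import struct


def read_fvecs(path, limit=None):
    """Read vectors from an .fvecs file.

    Assumed layout: each record is a little-endian int32 dimension d
    followed by d float32 values.
    """
    vectors = []
    with open(path, "rb") as f:
        while limit is None or len(vectors) < limit:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated record header")
            (dim,) = struct.unpack("<i", header)
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated vector payload")
            vectors.append(list(struct.unpack(f"<{dim}f", payload)))
    return vectors
```

For example, `read_fvecs("gist_base.fvecs", limit=5)` reads only the first five vectors, which is enough to confirm that each one has 960 dimensions without loading the whole file.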
Purchase an OpenSearch Vector Search Edition instance
For more information about how to purchase an instance, see Purchase an OpenSearch Vector Search Edition instance.
Create a table
For more information, see one of the following topics based on your data source:
Push data
- Use the push_data.py script to push data to your instance.
- Parameters:
  - -i: The path to the file that you want to push.
  - -t: The table name.
  - -u: Your username.
  - -p: Your password.
  - -e: The instance ID.
python3 push_data.py -i data/gist-960-euclidean.hdf5.data -t gist -u ${user_name} -p ${password} -e ${instance_id}
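If you prefer not to type credentials on the command line, a small wrapper can read them from environment variables and assemble the same command. This is a sketch under assumptions: the variable names `OS_USER`, `OS_PASSWORD`, and `OS_INSTANCE_ID` and the helper `build_push_command` are hypothetical, and only the flags documented above are used. The credentials are still passed to push_data.py as arguments.

```python
import os
import subprocess


def build_push_command(data_path, table, env=None):
    """Build the push_data.py command line from environment variables."""
    env = os.environ if env is None else env
    # Hypothetical variable names; choose whatever fits your environment.
    required = ("OS_USER", "OS_PASSWORD", "OS_INSTANCE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return [
        "python3", "push_data.py",
        "-i", data_path,
        "-t", table,
        "-u", env["OS_USER"],
        "-p", env["OS_PASSWORD"],
        "-e", env["OS_INSTANCE_ID"],
    ]


if __name__ == "__main__":
    cmd = build_push_command("data/gist-960-euclidean.hdf5.data", "gist")
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(cmd, check=True)
```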
Generate queries
- Use the prepare_query.py script to randomly generate queries from the original dataset.
python3 prepare_query.py -i gist-960-euclidean.hdf5 -c 10000 -t gist
- The script creates a query.data file in the data/ directory.
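The idea of drawing random queries from the original dataset can be sketched as follows. This is an illustration of random sampling without replacement only; the actual selection logic and output format of prepare_query.py are not shown in this topic, and the function name `sample_query_indices` and the fixed seed are assumptions.

```python
import random


def sample_query_indices(total_vectors, count, seed=0):
    """Pick `count` distinct vector indices out of `total_vectors`."""
    if count > total_vectors:
        raise ValueError("count cannot exceed the number of vectors")
    # A dedicated Random instance keeps the sample reproducible for a given seed.
    rng = random.Random(seed)
    return rng.sample(range(total_vectors), count)


if __name__ == "__main__":
    indices = sample_query_indices(1_000_000, 10_000)
    print(len(indices), "query indices selected")
```

Using a fixed seed makes repeated test runs comparable, because every run queries the same vectors.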
Stress test with wrk
- wrk is an open-source HTTP stress testing tool: https://github.com/wg/wrk.
- Download wrk from GitHub. Then build it by running make in the wrk directory.
git clone https://github.com/wg/wrk.git
- Use the search.lua script for testing.
  - Copy the script to the scripts directory.
cp search.lua wrk/scripts/
  - Calculate your request signature and update the headers["Authorization"] value in the script's request function.
-- During the test, wrk randomly selects a query and constructs a request.
request = function ()
    local query = query_table[count]
    count = (count + 1) % query_count
    local headers = {}
    headers["Authorization"] = "Basic xxxx" -- Signature information
    headers["Content-Type"] = "application/json"
    return wrk.format("POST", nil, headers, query)
end
- Start the stress test.
  - -c: The number of concurrent connections to maintain.
  - -t: The number of threads to use.
  - -d: The duration of the test.
  - -s: The path to the Lua script.
  - --latency: Prints detailed latency statistics.
./wrk -c24 -d100s -t8 -s scripts/search.lua http://ha-cn-xxxxxx.ha.aliyuncs.com/vector-service/query --latency
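The Authorization value in search.lua has the form "Basic xxxx". As a sketch, the following assumes that the instance accepts the standard HTTP Basic scheme, in which the token is the Base64 encoding of `user_name:password`. This topic does not spell out the signature format, so treat that as an assumption and confirm it against your instance's API documentation. The function name `basic_auth_header` is hypothetical.

```python
import base64


def basic_auth_header(user_name, password):
    """Build an HTTP Basic Authorization header value (assumed scheme)."""
    raw = f"{user_name}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return "Basic " + token


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    print(basic_auth_header("user", "pass"))  # prints "Basic dXNlcjpwYXNz"
```

Paste the resulting string into the headers["Authorization"] assignment in the script's request function.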
View metrics
- View metrics such as recall and latency. For more information, see Authorize RAM users to view instance monitoring metrics.
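If you also want to check recall offline against a ground truth, the usual definition of top-k recall is the fraction of the true k nearest neighbors that appear in the returned top k. The sketch below illustrates that definition only; it is not how the console computes its metrics, and the function names are hypothetical.

```python
def recall_at_k(retrieved_ids, ground_truth_ids, k=10):
    """Fraction of the true top-k neighbors found in the retrieved top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(ground_truth_ids[:k])
    found = set(retrieved_ids[:k]) & truth
    return len(found) / k


def mean_recall_at_k(all_retrieved, all_ground_truth, k=10):
    """Average recall@k over a list of queries."""
    if len(all_retrieved) != len(all_ground_truth) or not all_retrieved:
        raise ValueError("need the same non-zero number of results and truths")
    scores = [recall_at_k(r, g, k) for r, g in zip(all_retrieved, all_ground_truth)]
    return sum(scores) / len(scores)
```

For example, a mean recall@10 of 0.95 means that, on average, 9.5 of the 10 true nearest neighbors are returned for each query.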
Script downloads
Performance benchmark data
The following table shows benchmark results for OpenSearch Vector Search Edition with the ANN_GIST1M 960-dimensional dataset on an 8th-generation instance.
Product: OpenSearch (tested with 1,000 queries from the GIST dataset; the ef parameter was tuned using the provided ground truth)
Test set: ANN_GIST1M 960-dimensional (http://corpus-texmex.irisa.fr/)

| Product version | OpenSearch Vector Search Edition 2024.11 vector_service_1.4.0_test_202411081507 | OpenSearch Vector Search Edition 2024.11 vector_service_1.3.0_202410081048 |
| --- | --- | --- |
| Instance type | 16-core 64 GB ecs.g8i.4xlarge (8th-generation) | 16-core 64 GB ecs.g8i.4xlarge (8th-generation) |
| Test tool | wrk (https://github.com/wg/wrk); threads: 10, concurrent connections: 30 | wrk (https://github.com/wg/wrk); threads: 10, concurrent connections: 40 |
| Index parameters | m: 100, ef_construction: 500 | m: 100, ef_construction: 500 |
| Vector algorithm | HNSW | QGRAPH (HNSW + quantization) |
| top10 recall@95: query parameters | ef=60 | ef=40 (recall 94); ef=80 (recall 95) |
| top10 recall@95: results | QPS: 4486.06; latency (avg): 6.7 ms; CPU utilization: 95.8% | Recall 94 (ef=40): QPS: 4957.48; latency (avg): 7.6 ms; CPU utilization: 93%. Recall 95 (ef=80): QPS: 3700.74; latency (avg): 8.19 ms; CPU utilization: 92% |
| top10 recall@99: query parameters | ef=170 | A recall of 99 is not achieved; the maximum observed recall is 95.8. |
| top10 recall@99: results | QPS: 2868.77; latency (avg): 10.44 ms; CPU utilization: 94% | |
| top10 recall@99.5: query parameters | ef=300; threads: 10, concurrent connections: 20 | |
| top10 recall@99.5: results | QPS: 2050.66; latency (avg): 9.73 ms; CPU utilization: 95% | |
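As a sanity check on numbers like these, throughput and average latency in a closed-loop test are related by Little's law: QPS is approximately the number of concurrent connections divided by the average latency. The sketch below applies that relation to the HNSW results reported above. It assumes that each wrk connection keeps one request in flight at a time. The function name `expected_qps` is hypothetical.

```python
def expected_qps(connections, avg_latency_ms):
    """Estimate throughput from concurrency and average latency (Little's law)."""
    if avg_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return connections / (avg_latency_ms / 1000.0)


# (concurrent connections, average latency in ms, reported QPS), HNSW results.
HNSW_ROWS = [
    (30, 6.7, 4486.06),    # top10 recall@95, ef=60
    (30, 10.44, 2868.77),  # top10 recall@99, ef=170
    (20, 9.73, 2050.66),   # top10 recall@99.5, ef=300
]

if __name__ == "__main__":
    for connections, latency_ms, reported in HNSW_ROWS:
        estimate = expected_qps(connections, latency_ms)
        print(f"connections={connections} latency={latency_ms} ms "
              f"estimated={estimate:.2f} reported={reported}")
```

A large gap between the estimated and reported QPS in your own runs usually means that the connection count or latency figure being compared does not belong to the same test run.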