Vector analysis performance testing

This topic describes how the vector analysis feature of AnalyticDB for PostgreSQL was performance tested and reports the results.

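Test environment

The AnalyticDB for PostgreSQL instance and the client ECS instance should be in the same VPC so that network fluctuations do not distort the measurements.

AnalyticDB for PostgreSQL server specifications

Engine version: v6.6.1.0
Node specifications (High-performance Edition): 16C64G, 8C32G, 4C16G
Number of compute nodes: 4
Storage per compute node: 1000 GB
Storage type of compute nodes: ESSD cloud disk PL1

Client ECS specifications

CPU: 16 cores
Memory: 32 GB
Disk: 2 TB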

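Preparations

Prepare the test environment

  1. Install Python 3.8 or later on the client.

  2. Download the ann-benchmark test tool adapted for AnalyticDB for PostgreSQL to the client. For the download link, see ann-benchmark.

  3. Run the following command to install the dependencies of the test tool.

    pip install -r requirements.txt

  4. Install Docker 20 or later. For details, see the official Docker installation guide.

  5. Run the following command to build the test image.

    python install.py --proc 4 --algorithm adbpg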
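The version requirements in the steps above can be checked before the image is built. The following is a minimal sketch rather than part of the test tool: the helper names `python_ok`, `docker_major`, and `check_environment` are illustrative assumptions, and the script only reads version information.

```python
import re
import shutil
import subprocess
import sys


def python_ok(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.8 or later."""
    return tuple(version_info[:2]) >= (3, 8)


def docker_major(version_output):
    """Extract the major version from `docker --version` output, or None."""
    match = re.search(r"Docker version (\d+)\.", version_output)
    return int(match.group(1)) if match else None


def check_environment():
    """Print a short report; nothing is installed or changed."""
    print("Python >= 3.8:", python_ok())
    docker = shutil.which("docker")
    if docker is None:
        print("Docker: not found on PATH")
        return
    result = subprocess.run([docker, "--version"], capture_output=True, text=True)
    print("Docker major version:", docker_major(result.stdout))


if __name__ == "__main__":
    check_environment()
```

A reported Docker major version below 20, or a `False` for the Python check, suggests revisiting steps 1 and 4 before running install.py.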

Prepare the test datasets

Download the datasets that you need and place them in the data directory of the ann-benchmarks project.

Dataset    Dimensions  Samples      Metric             dataset parameter             Download link
GIST       960         1,000,000    L2 distance        gist-960-euclidean            GIST
SIFT-10M   128         10,000,000   L2 distance        sift-128-euclidean            SIFT-10M
SIFT-100M  128         100,000,000  L2 distance        sift100m-128-euclidean        SIFT-100M
Deep       96          10,000,000   Cosine similarity  deep-image-96-angular         Deep
Cohere     768         1,000,000    L2 distance        cohere-768-euclidean          Cohere
Dbpedia    1536        1,000,000    Cosine similarity  dbpedia-openai-1000k-angular  Dbpedia
Glove      200         1,180,000    Cosine similarity  glove-200-angular             Glove

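Test procedure

Step 1: Configure the connection information of the test tool

Edit the ann_benchmarks/algorithms/adbpg/module.py file of the test tool and fill in the settings for your environment:

# Internal endpoint of the AnalyticDB for PostgreSQL instance.
self._host = 'gp-bp10ofhzg2z****-master.gpdb.rds.aliyuncs.com'

# Port of the AnalyticDB for PostgreSQL instance.
self._port = 5432

# Database name on the AnalyticDB for PostgreSQL instance.
self._dbname = '<database_name>'

# Database account of the AnalyticDB for PostgreSQL instance.
self._user = '<user_name>'

# Password of the database account.
self._password = '<your_password>'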
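Before starting a long benchmark run, it can be useful to try the same five settings with a regular PostgreSQL client. The sketch below only assembles a libpq-style keyword/value connection string from them. The function name `build_dsn` and the sample values are illustrative assumptions and do not come from the test tool.

```python
def build_dsn(host, port, dbname, user, password):
    """Assemble a libpq keyword/value connection string (hypothetical helper)."""
    settings = {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": password,
    }

    def quote(value):
        # libpq allows single-quoted values; backslash and quote are escaped.
        text = str(value).replace("\\", "\\\\").replace("'", "\\'")
        return f"'{text}'"

    return " ".join(f"{key}={quote(value)}" for key, value in settings.items())


# Placeholder values for illustration only.
print(build_dsn("example-master.gpdb.rds.aliyuncs.com", 5432, "testdb", "tester", "secret"))
```

The resulting string can be passed to a client that accepts libpq connection strings to confirm that the endpoint, account, and password are valid from the ECS instance.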

Step 2: Configure the test parameters

Edit the ann_benchmarks/algorithms/adbpg/config.yml file of the test tool to match the dataset under test.

float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 12}]
        query_args: [[ 
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 1},
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 5},
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 10},
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 15}, 
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 20},
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 25},
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 30}, 
        {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 50}]]

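arg_groups: parameters used to create the index. For how to create a vector index, see Create a vector index.

Parameter         Description
M                 The M parameter of the HNSW index. A larger M makes the build slower and the index more accurate.
efConstruction    The HNSW index parameter that controls search quality (the size of the candidate set used while the index is built).
parallel_build    The parallelism of the index build. In general, set it to the number of CPU cores of a compute node.
external_storage  How the index is cached. 1: cache the index with mmap. 0: cache the index with shared_buffer.
pq_enable         Whether to enable PQ. 1: enable PQ. 0: disable PQ.
pq_segments       The number of segments that PQ splits each vector into. In general, use the vector dimension divided by 8 (dim/8).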
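The dim/8 rule of thumb for pq_segments can be written out as a small helper. This is a sketch under one stated assumption: non-integer results are rounded down, which matches the value 25 used for the 200-dimension Glove dataset in the reference configurations. The function name is illustrative.

```python
def suggest_pq_segments(dim):
    """Rule of thumb from the parameter table: pq_segments is about dim / 8."""
    if dim <= 0:
        raise ValueError("dim must be a positive integer")
    # Assumption: round down, and never return fewer than one segment.
    return max(1, dim // 8)


for name, dim in [("GIST", 960), ("SIFT", 128), ("Deep", 96),
                  ("Glove", 200), ("Cohere", 768), ("Dbpedia", 1536)]:
    print(f"{name}: dim={dim}, pq_segments={suggest_pq_segments(dim)}")
```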

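query_args: parameters used for retrieval.

Parameter        Description
ef_search        Controls the number of candidate nearest neighbors kept during an HNSW search.
max_scan_points  The maximum number of samples that the index scans for one query.
pq_amp           The retrieval amplification factor when PQ is enabled. It has no effect when PQ is disabled.
parallel         The query concurrency. It takes effect only in Batch mode.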
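Each query_args list repeats the same search settings across a range of `parallel` values, so it can be generated instead of typed by hand. The sketch below does this with plain string formatting. The helper names are illustrative assumptions, and the default concurrency levels are the ones that appear in the sample config.yml.

```python
def make_query_args(ef_search, max_scan_points, pq_amp,
                    parallels=(1, 5, 10, 15, 20, 25, 30, 50)):
    """Build one query_args entry per concurrency level."""
    return [
        {"ef_search": ef_search, "max_scan_points": max_scan_points,
         "pq_amp": pq_amp, "parallel": parallel}
        for parallel in parallels
    ]


def to_flow_yaml(entries):
    """Render entries in the flow style used by config.yml."""
    rendered = ", ".join(
        "{" + ", ".join(f"{key}: {value}" for key, value in entry.items()) + "}"
        for entry in entries
    )
    return f"query_args: [[ {rendered} ]]"


# Example: the search settings used for the Deep dataset.
print(to_flow_yaml(make_query_args(400, 1200, 10)))
```

The printed line can be pasted under a run group in config.yml. Changing max_scan_points in one place keeps all concurrency levels consistent while tuning recall.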

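During testing, fine-tune these parameters until the recall reaches 95%. For the datasets listed above, AnalyticDB for PostgreSQL provides the following reference configurations. Pick the one that matches your dataset.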

# for sift 128 10m
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 700, pq_amp: 10, parallel: 50}]]



# for gist 960
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 120}]
        query_args: [[ {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 120}]
        query_args: [[ {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 120}]
        query_args: [[ {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 2000, pq_amp: 10, parallel: 50}]]


# for deep 96
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 12}]
        query_args: [[ {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 12}]
        query_args: [[ {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 12}]
        query_args: [[ {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1200, pq_amp: 10, parallel: 50}]]


# for sift 128 100M
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 16}]
        query_args: [[ {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 1800, pq_amp: 10, parallel: 50}]]


# for glove 200
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 25}]
        query_args: [[ {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 25}]
        query_args: [[ {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 25}]
        query_args: [[ {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 30000, pq_amp: 10, parallel: 50}]]


# for cohere 768
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 96}]
        query_args: [[ {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 96}]
        query_args: [[ {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 96}]
        query_args: [[ {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 600, pq_amp: 10, parallel: 50}]]


# for dbpedia 1536
float:
  any:
  - base_args: ['@metric']
    constructor: ADBPG
    disabled: false
    docker_tag: ann-benchmarks-adbpg
    module: ann_benchmarks.algorithms.adbpg
    name: adbpg
    run_groups:
      pq_mmap:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 192}]
        query_args: [[ {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 50}]]
      pq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 192}]
        query_args: [[ {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 50}]]
      wopq_sb:
        arg_groups: [{M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 192}]
        query_args: [[ {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 1}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 5}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 10}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 15}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 20}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 25}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 30}, {ef_search: 400, max_scan_points: 425, pq_amp: 10, parallel: 50}]]

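Step 3: Test the retrieval recall

After the parameters are configured, run the following command to test the recall:

nohup python run.py --algorithm adbpg --dataset <dataset> --runs 1 --timeout 990000 > annbenchmark_deep.log 2>&1 &

Note

dataset: replace <dataset> with the dataset parameter of the dataset under test.

After the test finishes, run the following command to view the recall:

python plot.py --dataset <dataset> --recompute

Sample output:

0:    ADBPG(m=64, ef_construction=600, ef_search=400, max_scan_point=500, pq_amp=10)        recall: 0.963       qps: 126.200
1:   ADBPG(m=64, ef_construction=600, ef_search=400, max_scan_point=1000, pq_amp=10)        recall: 0.992       qps: 122.665

Check whether the recall meets the target. If it does not, adjust the parameters and run the test again.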
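The recall check can be automated by parsing the plot.py output. The sketch below assumes the output lines have the shape shown in the sample output and uses the 95% target stated earlier. The names `parse_results` and `meets_target` are illustrative and not part of the test tool.

```python
import re

# Matches lines such as:
#   0:  ADBPG(m=64, ..., pq_amp=10)   recall: 0.963   qps: 126.200
RESULT_LINE = re.compile(
    r"ADBPG\((?P<params>[^)]*)\)\s+recall:\s*(?P<recall>[\d.]+)\s+qps:\s*(?P<qps>[\d.]+)"
)


def parse_results(text):
    """Return one dict per result line with params, recall, and qps."""
    return [
        {"params": m["params"], "recall": float(m["recall"]), "qps": float(m["qps"])}
        for m in RESULT_LINE.finditer(text)
    ]


def meets_target(rows, target=0.95):
    """Keep only the configurations whose recall reaches the target."""
    return [row for row in rows if row["recall"] >= target]


SAMPLE = """\
0:    ADBPG(m=64, ef_construction=600, ef_search=400, max_scan_point=500, pq_amp=10)        recall: 0.963       qps: 126.200
1:   ADBPG(m=64, ef_construction=600, ef_search=400, max_scan_point=1000, pq_amp=10)        recall: 0.992       qps: 122.665
"""

for row in meets_target(parse_results(SAMPLE)):
    print(row["recall"], row["qps"], row["params"])
```

Among the configurations that pass, the one with the highest QPS is usually the most useful starting point for the performance test.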

Step 4: Test the retrieval performance

After the recall is tuned, run the performance test. The method is the same as for the recall test, except that Batch mode must be enabled to measure performance under concurrency:

nohup python run.py --algorithm adbpg --dataset <dataset> --runs 1 --timeout 990000 --batch > annbenchmark_deep.log 2>&1 &

After the test finishes, open the output file annbenchmark_deep.log to view the QPS, mean RT, and P99 RT at each concurrency level:

2023-12-20 17:31:39,297 - INFO - query using 25 parallel
worker 0 cost 9.50 s, qps 315.92, mean rt 0.00317, p99 rt 0.00951
2023-12-20 17:31:49,097 - INFO - QPS: 7653.155
2023-12-20 17:31:49,113 - INFO - query using 30 parallel
worker 0 cost 13.87 s, qps 216.36, mean rt 0.00462, p99 rt 0.04298
2023-12-20 17:32:03,260 - INFO - QPS: 6361.819
2023-12-20 17:32:03,281 - INFO - query using 50 parallel
worker 0 cost 20.78 s, qps 144.36, mean rt 0.00693, p99 rt 0.02735
2023-12-20 17:32:24,385 - INFO - QPS: 7107.920
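The log can be condensed into one row per concurrency level with a short script. This is a sketch that assumes the log keeps the three-line pattern shown above. It also assumes the RT values are in seconds, which is inferred from the per-worker QPS being close to 1 / mean rt, and converts them to milliseconds. The function name is illustrative.

```python
import re

PARALLEL = re.compile(r"query using (\d+) parallel")
WORKER = re.compile(r"worker \d+ cost [\d.]+ s, qps [\d.]+, mean rt ([\d.]+), p99 rt ([\d.]+)")
TOTAL_QPS = re.compile(r"INFO - QPS: ([\d.]+)")


def summarize(log_text):
    """Return a list of dicts: parallel, mean_rt_ms, p99_rt_ms, qps."""
    results, current = [], None
    for line in log_text.splitlines():
        if m := PARALLEL.search(line):
            current = {"parallel": int(m.group(1))}
        elif current is not None and (m := WORKER.search(line)):
            # Assumption: RT is logged in seconds. Keeps the last worker line seen.
            current["mean_rt_ms"] = round(float(m.group(1)) * 1000, 2)
            current["p99_rt_ms"] = round(float(m.group(2)) * 1000, 2)
        elif current is not None and (m := TOTAL_QPS.search(line)):
            current["qps"] = float(m.group(1))
            results.append(current)
            current = None
    return results


SAMPLE_LOG = """\
2023-12-20 17:31:39,297 - INFO - query using 25 parallel
worker 0 cost 9.50 s, qps 315.92, mean rt 0.00317, p99 rt 0.00951
2023-12-20 17:31:49,097 - INFO - QPS: 7653.155
2023-12-20 17:31:49,113 - INFO - query using 30 parallel
worker 0 cost 13.87 s, qps 216.36, mean rt 0.00462, p99 rt 0.04298
2023-12-20 17:32:03,260 - INFO - QPS: 6361.819
"""

for row in summarize(SAMPLE_LOG):
    print(row)
```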

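Test results

The following sections report the performance of each dataset under different vector database configurations. In every case the recall was tuned to 95% and queries retrieve the Top 10 results. The index build modes are described below.

Index build mode: PQ + mmap
  Description: Uses mmap to cache and persist the vector index, and uses PQ quantization to compress vector encodings and accelerate vector computation.
  Applicable scenario: More than 500,000 rows, with no need to update or delete data.

Index build mode: PQ + shared_buffer
  Description: Uses the native PostgreSQL shared_buffer mechanism to cache the vector index, and uses PQ quantization to compress vector encodings and accelerate vector computation.
  Applicable scenario: More than 500,000 rows, with a need to update and delete data.

Index build mode: noPQ + shared_buffer
  Description: Uses the native PostgreSQL shared_buffer mechanism to cache the vector index, without PQ quantization.
  Applicable scenario: Fewer than 500,000 rows.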
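The selection rule in the mode descriptions can be expressed as a small function that also returns the matching external_storage and pq_enable values. This is a sketch with illustrative names. The source does not say how exactly 500,000 rows should be treated, so handling it as a large dataset is an assumption.

```python
def choose_index_mode(row_count, needs_update_or_delete):
    """Map data size and write pattern to an index build mode."""
    threshold = 500_000  # "50w" in the mode descriptions
    if row_count < threshold:
        return {"mode": "noPQ + shared_buffer", "external_storage": 0, "pq_enable": 0}
    # Assumption: exactly 500,000 rows is handled like a larger dataset.
    if needs_update_or_delete:
        return {"mode": "PQ + shared_buffer", "external_storage": 0, "pq_enable": 1}
    return {"mode": "PQ + mmap", "external_storage": 1, "pq_enable": 1}


print(choose_index_mode(10_000_000, needs_update_or_delete=False))
print(choose_index_mode(1_000_000, needs_update_or_delete=True))
print(choose_index_mode(200_000, needs_update_or_delete=True))
```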

Instance specifications: 16C64G * 4 segments

Dataset: GIST, L2 (960 dimensions * 1,000,000 vectors)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 120

Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

Index build time (s): 261

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            242    3             4
5            1068   4             6
10           1673   5             7
15           2158   6             7
20           2492   7             13
25           2405   9             19
30           2423   11            24
50           2453   19            43
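As a rough cross-check of a result table, a closed-loop test with a fixed number of concurrent clients should approximately satisfy Little's law: mean RT is about concurrency divided by QPS. The sketch below applies that relation to the GIST PQ + mmap figures for 16C64G. The helper name is an illustrative assumption. Small gaps are expected because the reported RT values are whole milliseconds and the exact timing method of the tool is not described here.

```python
def expected_mean_rt_ms(concurrency, qps):
    """Little's law for a closed loop: mean RT ~= concurrency / throughput."""
    return concurrency / qps * 1000.0


# (concurrency, QPS, reported mean RT in ms) from the GIST PQ + mmap results.
rows = [(1, 242, 3), (5, 1068, 4), (10, 1673, 5), (15, 2158, 6),
        (20, 2492, 7), (25, 2405, 9), (30, 2423, 11), (50, 2453, 19)]

for concurrency, qps, reported in rows:
    estimate = expected_mean_rt_ms(concurrency, qps)
    print(f"concurrency={concurrency:>2}  estimated={estimate:5.1f} ms  reported={reported} ms")
```

Once QPS stops growing (here at about 20 concurrent queries), additional concurrency mainly adds latency, which is the pattern the estimate reproduces.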

Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 120

Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

Index build time (s): 505

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            161    5             7
5            771    5             8
10           1111   8             10
15           1380   10            12
20           1395   13            23
25           1422   16            31
30           1412   20            39
50           1450   34            70

Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 2000

Test results:

Index build time (s): 1102

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            130    7             10
5            425    11            14
10           678    14            16
15           901    16            19
20           963    20            31
25           979    24            38
30           978    30            48
50           969    51            98

Dataset: Deep1B, cosine (96 dimensions * 10,000,000 vectors)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

Index build time (s): 1515

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            451    2             2
5            2279   2             3
10           4059   2             5
15           6274   2             5
20           8198   2             5
25           8347   3             6
30           8411   3             8
50           8178   6             22

Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

Index build time (s): 2824

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            361    2             3
5            1439   3             4
10           2261   4             5
15           2958   5             6
20           3164   6             11
25           3168   7             15
30           3170   9             19
50           3156   15            36

Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 1200

Test results:

Index build time (s): 3141

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            423    2             2
5            1240   3             4
10           2453   4             4
15           3144   4             6
20           3322   6             11
25           3332   7             15
30           3350   8             19
50           3347   14            35

Dataset: SIFT, L2 (128 dimensions * 100,000,000 vectors)

Index build mode: PQ + mmap

Index build parameters: M: 16, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 16

Search parameters: ef_search: 400, max_scan_points: 2100, pq_amp: 10

Test results:

Index build time (s): 8805

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            441    2             6
5            2222   2             6
10           3528   2             7
15           4679   3             8
20           5358   3             9
25           5426   4             10
30           5527   5             15
50           5904   8             30

Index build mode: PQ + shared_buffer

Index build parameters: M: 16, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 16

Search parameters: ef_search: 400, max_scan_points: 2100, pq_amp: 10

Test results:

Index build time (s): 14763

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            282    3             4
5            1180   4             5
10           1769   5             7
15           2298   6             8
20           2438   8             14
25           2510   9             18
30           2472   12            22
50           2467   20            42

Dataset: Glove, cosine (200 dimensions * 1,180,000 vectors)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

Index build time (s): 378

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            174    5             8
5            905    5             7
10           1431   6             9
15           1891   7             9
20           1921   10            21
25           1880   13            26
30           1862   15            32
50           1998   24            50

Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

Index build time (s): 715

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            83     11            14
5            351    14            20
10           495    20            25
15           635    23            29
20           628    31            50
25           613    40            67
30           605    49            83
50           576    86            151

Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

Index build time (s): 891

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            52     18            24
5            233    21            30
10           335    29            45
15           430    34            48
20           437    45            77
25           426    58            101
30           416    71            124
50           405    123           215

Dataset: Cohere, L2 (768 dimensions * 1,000,000 vectors)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

Index build time (s): 285

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            317    2             3
5            1450   2             4
10           2177   4             5
15           2812   4             6
20           3275   5             9
25           3485   6             14
30           3657   7             18
50           3619   13            35

Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

Index build time (s): 518

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            223    4             4
5            996    4             6
10           1548   6             7
15           2006   7             8
20           2141   8             15
25           2081   11            22
30           2160   13            27
50           2039   24            51

Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

Index build time (s): 701

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            357    2             3
5            641    7             9
10           1019   9             10
15           1330   10            13
20           1431   13            20
25           1437   16            26
30           1449   20            33
50           1445   34            66

Dataset: Dbpedia, cosine (1536 dimensions * 1,000,000 vectors)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 1, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

Index build time (s): 1120

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            122    7             8
5            548    8             14
10           727    12            16
15           940    15            16
20           1011   18            30
25           1022   23            38
30           1019   28            50
50           1022   47            93

Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

Index build time (s): 1642

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            110    8             9
5            512    9             16
10           666    14            18
15           830    17            18
20           911    21            32
25           919    26            41
30           922    31            52
50           915    53            97

Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 16, external_storage: 0, pq_enable: 0, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

Index build time (s): 1130

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            308    2             6
5            575    7             9
10           939    10            11
15           1197   11            13
20           1295   14            22
25           1323   17            28
30           1328   21            36
50           1315   37            68

Instance specifications: 8C32G * 4 segments

Dataset: GIST, L2 (960 dimensions * 1,000,000 vectors)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 120

Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

Index build time (s): 681

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            252    3             5
5            868    5             7
10           1312   7             11
15           1305   11            22
20           1343   14            30
25           1323   18            37
30           1352   21            47
50           1305   37            70

Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 120

Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

Index build time (s): 1349

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            123    7             10
5            423    11            13
10           582    16            27
15           611    24            44
20           603    32            59
25           605    40            76
30           596    49            99
50           594    83            151

Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 2000

Test results:

Index build time (s): 2158

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            92     10            11
5            365    13            15
10           532    18            28
15           546    26            44
20           558    35            63
25           554    44            78
30           556    53            94
50           543    91            170

Dataset: Deep1B, cosine (96 dimensions * 10,000,000 vectors)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

Index build time (s): 2781

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            559    1             2
5            2340   2             3
10           3973   2             4
15           4087   3             8
20           3705   5             18
25           3966   6             18
30           4083   7             20
50           4163   11            39

Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

Index build time (s): 5658

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            333    2             3
5            1196   4             5
10           1685   5             11
15           1740   8             18
20           1745   11            25
25           1754   14            30
30           1755   17            38
50           1733   28            64

Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 1200

Test results:

Index build time (s): 6456

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            354    2             3
5            1292   3             4
10           1758   5             11
15           1779   8             18
20           1767   11            26
25           1778   13            33
30           1768   16            40
50           1770   28            64

Dataset: SIFT, L2 (128 dimensions * 10,000,000 vectors)

Index build mode: PQ + mmap

Index build parameters: M: 16, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 16

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

Index build time (s): 1613

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            523    1             2
5            2568   1             3
10           4500   2             4
15           5174   2             6
20           5045   3             11
25           4846   5             16
30           4873   6             23
50           4445   11            32

Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

Index build time (s): 4023

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            345    2             3
5            1169   4             5
10           1601   6             12
15           1632   9             21
20           1630   12            27
25           1604   15            33
30           1579   18            45
50           1536   32            53

Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

Index build time (s): 5164

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            371    2             3
5            1238   3             5
10           1605   6             12
15           1615   9             20
20           1531   12            31
25           1585   15            36
30           1531   19            45
50           1517   32            71

Dataset: Glove, cosine (200 dimensions * 1,180,000 vectors)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

Index build time (s): 884

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            150    6             7
5            638    7             9
10           894    11            20
15           898    16            31
20           890    22            45
25           894    27            54
30           877    34            66
50           874    57            110

Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

Index build time (s): 1537

Concurrency  QPS    Mean RT (ms)  P99 RT (ms)
1            75     13            16
5            250    19            25
10           321    30            51
15           310    48            83
20           301    66            115
25           289    86            153
30           274    109           192
50           252    197           346

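Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1804 | 1 | 41 | 23 | 34 |
| 1804 | 5 | 164 | 30 | 47 |
| 1804 | 10 | 216 | 46 | 82 |
| 1804 | 15 | 211 | 70 | 123 |
| 1804 | 20 | 209 | 95 | 168 |
| 1804 | 25 | 209 | 119 | 207 |
| 1804 | 30 | 207 | 144 | 254 |
| 1804 | 50 | 200 | 248 | 434 |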
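As a rough consistency check on the figures above (not part of the original test procedure), Little's law suggests that in a closed-loop load test the number of in-flight requests is approximately QPS multiplied by mean response time. The sketch below applies this to the Glove noPQ + shared_buffer rows directly above. The function and variable names are illustrative, and because the reported RT values are rounded to whole milliseconds, only approximate agreement should be expected.

```python
# Rows copied from the Glove noPQ + shared_buffer table above:
# (query concurrency, QPS, mean RT in ms).
rows = [
    (1, 41, 23),
    (5, 164, 30),
    (10, 216, 46),
    (15, 211, 70),
    (20, 209, 95),
    (25, 209, 119),
    (30, 207, 144),
    (50, 200, 248),
]


def estimated_concurrency(qps, mean_rt_ms):
    """Little's law: in-flight requests ~= throughput * mean latency."""
    return qps * mean_rt_ms / 1000.0


# Pair each configured concurrency with the value implied by QPS and mean RT.
estimates = [(c, estimated_concurrency(q, rt)) for c, q, rt in rows]

for configured, estimate in estimates:
    print(f"configured={configured:>2}  estimated={estimate:5.1f}")
```

If the estimate were far from the configured concurrency, that would hint at client-side bottlenecks or measurement issues. For these rows the two values stay within about 10% of each other.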

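Dataset: Cohere L2 (768 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 688 | 1 | 323 | 2 | 3 |
| 688 | 5 | 1153 | 3 | 5 |
| 688 | 10 | 1767 | 5 | 9 |
| 688 | 15 | 1801 | 7 | 16 |
| 688 | 20 | 1921 | 9 | 22 |
| 688 | 25 | 1886 | 12 | 33 |
| 688 | 30 | 1820 | 16 | 38 |
| 688 | 50 | 1900 | 25 | 72 |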

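Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1224 | 1 | 198 | 5 | 7 |
| 1224 | 5 | 561 | 8 | 10 |
| 1224 | 10 | 838 | 11 | 19 |
| 1224 | 15 | 819 | 17 | 33 |
| 1224 | 20 | 835 | 23 | 42 |
| 1224 | 25 | 839 | 29 | 52 |
| 1224 | 30 | 835 | 35 | 68 |
| 1224 | 50 | 834 | 59 | 107 |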

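Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1411 | 1 | 130 | 7 | 8 |
| 1411 | 5 | 532 | 8 | 10 |
| 1411 | 10 | 777 | 12 | 20 |
| 1411 | 15 | 795 | 18 | 32 |
| 1411 | 20 | 798 | 24 | 46 |
| 1411 | 25 | 810 | 30 | 57 |
| 1411 | 30 | 812 | 36 | 67 |
| 1411 | 50 | 815 | 61 | 108 |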

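Dataset: Dbpedia cosine (1536 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 1, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2214 | 1 | 125 | 7 | 9 |
| 2214 | 5 | 374 | 12 | 15 |
| 2214 | 10 | 506 | 18 | 30 |
| 2214 | 15 | 504 | 28 | 51 |
| 2214 | 20 | 509 | 38 | 69 |
| 2214 | 25 | 500 | 49 | 90 |
| 2214 | 30 | 507 | 58 | 104 |
| 2214 | 50 | 509 | 97 | 169 |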

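Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 7651 | 1 | 87 | 10 | 12 |
| 7651 | 5 | 314 | 14 | 18 |
| 7651 | 10 | 443 | 21 | 33 |
| 7651 | 15 | 448 | 32 | 53 |
| 7651 | 20 | 445 | 44 | 76 |
| 7651 | 25 | 444 | 55 | 99 |
| 7651 | 30 | 447 | 66 | 119 |
| 7651 | 50 | 450 | 110 | 192 |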

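Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 8, external_storage: 0, pq_enable: 0, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2206 | 1 | 128 | 7 | 8 |
| 2206 | 5 | 484 | 9 | 11 |
| 2206 | 10 | 729 | 12 | 19 |
| 2206 | 15 | 748 | 19 | 34 |
| 2206 | 20 | 782 | 24 | 44 |
| 2206 | 25 | 783 | 31 | 57 |
| 2206 | 30 | 794 | 37 | 65 |
| 2206 | 50 | 787 | 62 | 122 |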
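To make the three Dbpedia tables above easier to compare, the following sketch summarizes each index build mode by its build time, its peak QPS, and the lowest concurrency at which that peak was observed. It is only an illustrative way of reading the published numbers. The dictionary layout and the helper name `peak` are assumptions made for this example and are not part of the ann-benchmark tool.

```python
# QPS per query concurrency, copied from the three Dbpedia tables above.
concurrency = [1, 5, 10, 15, 20, 25, 30, 50]
results = {
    "PQ + mmap": {
        "build_s": 2214,
        "qps": [125, 374, 506, 504, 509, 500, 507, 509],
    },
    "PQ + shared_buffer": {
        "build_s": 7651,
        "qps": [87, 314, 443, 448, 445, 444, 447, 450],
    },
    "noPQ + shared_buffer": {
        "build_s": 2206,
        "qps": [128, 484, 729, 748, 782, 783, 794, 787],
    },
}


def peak(qps_values):
    """Return (peak QPS, lowest concurrency at which it was reached)."""
    best = max(qps_values)
    return best, concurrency[qps_values.index(best)]


# mode -> (index build time in seconds, peak QPS, concurrency at peak)
summary = {
    mode: (data["build_s"],) + peak(data["qps"])
    for mode, data in results.items()
}

for mode, (build_s, peak_qps, at_concurrency) in summary.items():
    print(f"{mode:<22} build={build_s}s  peak QPS={peak_qps} at concurrency {at_concurrency}")
```

For this dataset the QPS curve flattens from roughly 10 concurrent queries onward in all three modes, so the peak is better read as a plateau than as a sharp optimum.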

Instance specification: 4C16G * 4 segments

Dataset: GIST L2 (960 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 120

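Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2014 | 1 | 237 | 3 | 5 |
| 2014 | 5 | 630 | 7 | 13 |
| 2014 | 10 | 645 | 15 | 27 |
| 2014 | 15 | 650 | 22 | 44 |
| 2014 | 20 | 642 | 30 | 57 |
| 2014 | 25 | 634 | 39 | 70 |
| 2014 | 30 | 622 | 47 | 98 |
| 2014 | 50 | 621 | 79 | 147 |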

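Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 120

Search parameters: ef_search: 400, max_scan_points: 2000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2934 | 1 | 119 | 7 | 11 |
| 2934 | 5 | 275 | 17 | 29 |
| 2934 | 10 | 292 | 33 | 60 |
| 2934 | 15 | 282 | 52 | 93 |
| 2934 | 20 | 286 | 69 | 121 |
| 2934 | 25 | 284 | 87 | 154 |
| 2934 | 30 | 280 | 106 | 188 |
| 2934 | 50 | 276 | 180 | 328 |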

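Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 2000

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 4468 | 1 | 87 | 11 | 13 |
| 4468 | 5 | 244 | 20 | 32 |
| 4468 | 10 | 263 | 37 | 64 |
| 4468 | 15 | 261 | 57 | 100 |
| 4468 | 20 | 269 | 73 | 130 |
| 4468 | 25 | 268 | 92 | 163 |
| 4468 | 30 | 267 | 111 | 196 |
| 4468 | 50 | 266 | 188 | 327 |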

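Dataset: Deep1B cosine (96 * 10M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 6696 | 1 | 520 | 1 | 2 |
| 6696 | 5 | 1679 | 2 | 5 |
| 6696 | 10 | 1832 | 5 | 12 |
| 6696 | 15 | 1908 | 7 | 18 |
| 6696 | 20 | 1899 | 10 | 28 |
| 6696 | 25 | 1920 | 12 | 33 |
| 6696 | 30 | 1910 | 15 | 39 |
| 6696 | 50 | 1904 | 26 | 66 |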

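Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 12

Search parameters: ef_search: 400, max_scan_points: 1200, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 11677 | 1 | 302 | 3 | 4 |
| 11677 | 5 | 814 | 6 | 11 |
| 11677 | 10 | 858 | 11 | 23 |
| 11677 | 15 | 870 | 17 | 34 |
| 11677 | 20 | 870 | 22 | 48 |
| 11677 | 25 | 864 | 28 | 57 |
| 11677 | 30 | 866 | 34 | 70 |
| 11677 | 50 | 848 | 58 | 119 |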

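Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 1200

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 14657 | 1 | 301 | 3 | 4 |
| 14657 | 5 | 821 | 6 | 11 |
| 14657 | 10 | 885 | 11 | 22 |
| 14657 | 15 | 890 | 16 | 34 |
| 14657 | 20 | 898 | 22 | 45 |
| 14657 | 25 | 896 | 27 | 56 |
| 14657 | 30 | 891 | 33 | 67 |
| 14657 | 50 | 885 | 56 | 112 |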

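Dataset: SIFT L2 (128 * 10M)

Index build mode: PQ + mmap

Index build parameters: M: 16, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 16

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 3004 | 1 | 577 | 1 | 2 |
| 3004 | 5 | 1846 | 2 | 5 |
| 3004 | 10 | 2085 | 4 | 12 |
| 3004 | 15 | 2523 | 5 | 14 |
| 3004 | 20 | 2599 | 7 | 18 |
| 3004 | 25 | 2573 | 9 | 24 |
| 3004 | 30 | 2349 | 12 | 35 |
| 3004 | 50 | 2437 | 20 | 53 |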

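Index build mode: PQ + shared_buffer

Index build parameters: M: 16, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 16

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 6409 | 1 | 356 | 2 | 3 |
| 6409 | 5 | 966 | 5 | 10 |
| 6409 | 10 | 1025 | 9 | 19 |
| 6409 | 15 | 974 | 15 | 32 |
| 6409 | 20 | 1041 | 19 | 39 |
| 6409 | 25 | 1032 | 24 | 49 |
| 6409 | 30 | 981 | 30 | 69 |
| 6409 | 50 | 987 | 50 | 104 |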

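Index build mode: noPQ + shared_buffer

Index build parameters: M: 16, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0

Search parameters: ef_search: 400, max_scan_points: 1000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 9058 | 1 | 365 | 2 | 4 |
| 9058 | 5 | 960 | 5 | 10 |
| 9058 | 10 | 938 | 10 | 21 |
| 9058 | 15 | 961 | 15 | 32 |
| 9058 | 20 | 1034 | 19 | 40 |
| 9058 | 25 | 1005 | 24 | 52 |
| 9058 | 30 | 1013 | 29 | 59 |
| 9058 | 50 | 986 | 52 | 101 |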

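Dataset: Glove Cosine (200 * 1.18M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1810 | 1 | 127 | 7 | 11 |
| 1810 | 5 | 340 | 14 | 26 |
| 1810 | 10 | 341 | 29 | 54 |
| 1810 | 15 | 343 | 43 | 83 |
| 1810 | 20 | 340 | 58 | 109 |
| 1810 | 25 | 341 | 73 | 141 |
| 1810 | 30 | 342 | 87 | 165 |
| 1810 | 50 | 343 | 145 | 265 |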

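Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 3234 | 1 | 72 | 13 | 19 |
| 3234 | 5 | 167 | 29 | 52 |
| 3234 | 10 | 158 | 62 | 115 |
| 3234 | 15 | 172 | 82 | 167 |
| 3234 | 20 | 147 | 135 | 237 |
| 3234 | 25 | 143 | 174 | 306 |
| 3234 | 30 | 140 | 213 | 374 |
| 3234 | 50 | 128 | 390 | 663 |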

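Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0, pq_segments: 25

Search parameters: ef_search: 400, max_scan_points: 30000, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 3571 | 1 | 40 | 24 | 35 |
| 3571 | 5 | 104 | 47 | 87 |
| 3571 | 10 | 98 | 101 | 186 |
| 3571 | 15 | 95 | 156 | 279 |
| 3571 | 20 | 96 | 210 | 368 |
| 3571 | 25 | 96 | 260 | 458 |
| 3571 | 30 | 96 | 310 | 544 |
| 3571 | 50 | 96 | 542 | 933 |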

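Dataset: Cohere L2 (768 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 1603 | 1 | 208 | 4 | 7 |
| 1603 | 5 | 504 | 9 | 17 |
| 1603 | 10 | 543 | 17 | 32 |
| 1603 | 15 | 543 | 27 | 50 |
| 1603 | 20 | 522 | 37 | 76 |
| 1603 | 25 | 540 | 45 | 83 |
| 1603 | 30 | 542 | 54 | 97 |
| 1603 | 50 | 535 | 92 | 166 |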

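Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 3679 | 1 | 199 | 4 | 6 |
| 3679 | 5 | 501 | 9 | 18 |
| 3679 | 10 | 578 | 16 | 31 |
| 3679 | 15 | 555 | 26 | 52 |
| 3679 | 20 | 548 | 35 | 73 |
| 3679 | 25 | 555 | 44 | 85 |
| 3679 | 30 | 559 | 53 | 97 |
| 3679 | 50 | 555 | 89 | 163 |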

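Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0, pq_segments: 96

Search parameters: ef_search: 400, max_scan_points: 600, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 2768 | 1 | 131 | 7 | 9 |
| 2768 | 5 | 369 | 13 | 21 |
| 2768 | 10 | 406 | 24 | 43 |
| 2768 | 15 | 410 | 36 | 65 |
| 2768 | 20 | 413 | 48 | 84 |
| 2768 | 25 | 408 | 60 | 107 |
| 2768 | 30 | 408 | 73 | 136 |
| 2768 | 50 | 409 | 121 | 213 |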

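Dataset: Dbpedia cosine (1536 * 1M)

Index build mode: PQ + mmap

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 1, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 4700 | 1 | 111 | 8 | 14 |
| 4700 | 5 | 249 | 19 | 31 |
| 4700 | 10 | 230 | 42 | 77 |
| 4700 | 15 | 240 | 61 | 104 |
| 4700 | 20 | 236 | 83 | 150 |
| 4700 | 25 | 241 | 102 | 173 |
| 4700 | 30 | 238 | 124 | 214 |
| 4700 | 50 | 239 | 208 | 376 |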

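Index build mode: PQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 1, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 15958 | 1 | 101 | 9 | 14 |
| 15958 | 5 | 209 | 22 | 36 |
| 15958 | 10 | 217 | 45 | 77 |
| 15958 | 15 | 210 | 70 | 127 |
| 15958 | 20 | 211 | 93 | 168 |
| 15958 | 25 | 216 | 114 | 191 |
| 15958 | 30 | 215 | 138 | 244 |
| 15958 | 50 | 214 | 232 | 411 |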

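Index build mode: noPQ + shared_buffer

Index build parameters: M: 64, efConstruction: 600, parallel_build: 4, external_storage: 0, pq_enable: 0, pq_segments: 192

Search parameters: ef_search: 400, max_scan_points: 425, pq_amp: 10

Test results:

| Index build time (s) | Query concurrency | QPS | Mean RT (ms) | P99 RT (ms) |
| --- | --- | --- | --- | --- |
| 4541 | 1 | 230 | 3 | 8 |
| 4541 | 5 | 383 | 12 | 19 |
| 4541 | 10 | 393 | 24 | 43 |
| 4541 | 15 | 389 | 37 | 66 |
| 4541 | 20 | 386 | 51 | 89 |
| 4541 | 25 | 381 | 64 | 115 |
| 4541 | 30 | 385 | 77 | 138 |
| 4541 | 50 | 389 | 127 | 229 |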
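One way to read the Dbpedia results across configurations is to compare peak QPS in the 4C16G tables above with peak QPS in the earlier Dbpedia tables of this document, the ones built with parallel_build: 8. The sketch below computes that ratio per index build mode. The peak values are the maxima of the QPS columns in those tables. The dictionary names are illustrative, and the ratio is only a rough indicator because the two runs also differ in parallel_build.

```python
# Peak QPS taken from the QPS columns of the Dbpedia tables in this document.
peak_qps_parallel_build_8 = {
    "PQ + mmap": 509,
    "PQ + shared_buffer": 450,
    "noPQ + shared_buffer": 794,
}
peak_qps_4c16g = {
    "PQ + mmap": 249,
    "PQ + shared_buffer": 217,
    "noPQ + shared_buffer": 393,
}

# Ratio of peak throughput between the two configurations, per build mode.
ratios = {
    mode: peak_qps_parallel_build_8[mode] / peak_qps_4c16g[mode]
    for mode in peak_qps_4c16g
}

for mode, ratio in ratios.items():
    print(f"{mode:<22} peak QPS ratio = {ratio:.2f}x")
```

For this dataset all three modes come out at roughly 2x, which suggests that peak throughput on Dbpedia scales close to linearly between the two configurations.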
