RabitQ index

RDS PostgreSQL supports RabitQ indexes through the pgvector extension. RabitQ compresses vectors by up to 32x while increasing search throughput and maintaining high recall.

How RabitQ works

RabitQ is a vector quantization method built on three core principles:

  • High-compression quantization: RabitQ projects vectors onto a unit sphere and encodes each D-dimensional vector as a D-bit string, one bit per dimension, using hypercube vertices. Because native pgvector stores each dimension as a 32-bit float, this achieves up to 32x compression.

  • Fast distance computation: RabitQ reduces vector distance calculations to bitwise and popcount operations on binary strings, significantly faster than float-based arithmetic.

  • Unbiased estimator: RabitQ's distance estimator is unbiased and has a proven theoretical error bound. During search, candidates are first filtered by estimated distance, and then a small subset is reranked using exact distances to keep recall high.
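
The bit-level distance step can be illustrated directly in SQL. The following is a minimal sketch, not the extension's internal implementation: the two 8-bit codes are hypothetical one-bit-per-dimension encodings, and the storage arithmetic assumes a 1536-dimensional vector purely as an example. It relies only on PostgreSQL's built-in bit-string XOR operator (#) and bit_count(), which is available in PostgreSQL 14 and later.

```sql
-- Hypothetical 8-dimensional codes, one bit per dimension.
-- XOR marks the dimensions where the two codes differ; bit_count() is the popcount.
SELECT bit_count(B'10110010' # B'10011010') AS differing_bits;  -- returns 2

-- Per-vector payload for an assumed 1536-dimensional vector:
-- 4 bytes per dimension as float32 versus 1 bit per dimension as a binary code.
SELECT 1536 * 4                AS float32_bytes,  -- 6144
       1536 / 8                AS code_bytes,     -- 192
       (1536 * 4) / (1536 / 8) AS ratio;          -- 32
```

A real RabitQ estimator also keeps a few correction factors per vector and rescales the bit count into a distance estimate, so the ratio achieved in practice is somewhat below 32x.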

Prerequisites

Your RDS PostgreSQL instance must meet these requirements:

  • RDS PostgreSQL 14 or later.

  • Minor kernel version 20260330 or later.

  • pgvector 0.8.0.2 or later installed.

Get started

Alibaba Cloud extends the native ivfflat access method by integrating RabitQ quantization. IVF clustering runs first, and then RabitQ quantizes the vectors in each cluster. The extended access method remains fully compatible with existing ivfflat indexes.

Step 1: Check and upgrade the pgvector extension

Connect to the target database and check the pgvector version. Upgrade if the version is earlier than 0.8.0.2.

postgres=# \dx vector
                           List of installed extensions
  Name  | Version | Schema |                     Description
--------+---------+--------+------------------------------------------------------
 vector | 0.8.0.2 | public | vector data type and ivfflat and hnsw access methods
(1 row)

postgres=# ALTER EXTENSION vector UPDATE TO "0.8.0.2";
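
As an alternative to \dx, the installed and available versions can be read from the standard pg_available_extensions catalog view. This is a minimal sketch that assumes the extension is named vector, as in the output above. The CREATE EXTENSION line applies only if the extension is not yet installed in the current database.

```sql
-- installed_version is NULL when the extension is not installed in this database.
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name = 'vector';

-- Install the extension if it is missing (a no-op when it already exists).
CREATE EXTENSION IF NOT EXISTS vector;
```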

Step 2: Create an ivf-RabitQ index

Run the following SQL command to create an ivf-RabitQ index:

CREATE INDEX ON items USING ivfflat (embedding rabitq_vector_cosine_ops) WITH (lists=1000);

items is the target table, embedding is the vector column, and rabitq_vector_cosine_ops is the RabitQ operator class for cosine distance. lists sets the number of IVF centroids (clusters). Choose a value based on your dataset size.
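
One possible starting point for lists is the guidance that community pgvector gives for ivfflat: rows / 1000 for tables of up to 1 million rows, and sqrt(rows) for larger tables. The query below is a sketch under the assumption that this guidance carries over to ivf-RabitQ. Treat the result as an initial value to benchmark, not as a recommendation from this document.

```sql
-- Suggest a starting value for lists from the current row count of items.
SELECT count(*) AS row_count,
       CASE
           WHEN count(*) <= 1000000
               THEN greatest(count(*) / 1000, 1)   -- rows / 1000, at least 1
           ELSE ceil(sqrt(count(*)))::bigint       -- sqrt(rows) for larger tables
       END AS suggested_lists
FROM items;
```

For a table of 1 million rows this yields 1000, which matches the lists value used in the benchmarks in this document.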

Step 3: Tune search parameters

ivf-RabitQ adds the following search parameters to the standard ivfflat parameters. Adjust them at session level with the SET command.

| Parameter | Default | Value range | Description |
|---|---|---|---|
| ivf_rabitq.epsilon | 1.9 | [0.1, 4.0] | Error margin coefficient for the distance estimator. Higher values increase recall. |
| ivf_rabitq.topk | 10 | [1, 32768] | Number of vectors to rerank using exact distance. Set this to the number of results you want to retrieve. |
| ivf_rabitq.max_rerank_scan_tuples | 5000 | [1, INT_MAX] | Maximum number of candidate vectors to scan for reranking. |
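
A session-level tuning pass might look like the following sketch. The parameter names and defaults come from this document. The id column and the 3-dimensional query vector are hypothetical placeholders, so substitute your own key column and a query vector that matches the dimension of embedding. The <=> operator is pgvector's cosine distance operator.

```sql
-- Session-level settings; the values shown are the documented defaults.
SET ivf_rabitq.epsilon = 1.9;
SET ivf_rabitq.topk = 10;                     -- keep equal to the LIMIT below
SET ivf_rabitq.max_rerank_scan_tuples = 5000;

-- Retrieve the 10 nearest neighbors by cosine distance.
-- 'id' and the query vector are placeholders for illustration only.
SELECT id, embedding <=> '[0.1, 0.2, 0.3]' AS distance
FROM items
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 10;
```

Because SET applies only to the current session, different settings can be compared side by side in separate sessions without affecting other connections.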

Performance benchmarks

Test environment

| Item | Description |
|---|---|
| RDS PostgreSQL instance | Major version: RDS PostgreSQL 17; Minor kernel version: 20260330; Instance type: pg.x4.2xlarge.1 |
| Test tool | ann-benchmarks |
| Test dataset | dbpedia-openai-1000k-angular |
| Test parameters | ivfflat.lists = 1000; ivfflat.nprobes increased in steps: 1, 2, 4, 8, 16, 32, 50, 100 |

Index build comparison

Build time and index size comparison using identical dataset and parameters:

| Index type | Build time | Index size |
|---|---|---|
| ivfflat (native) | 95.32s | 7820 MB |
| ivf-RabitQ | 78.72s | 248 MB |

ivf-RabitQ reduces index size from 7820 MB to 248 MB, a compression ratio of about 31.5x, and shortens build time from 95.32s to 78.72s (about 17% less).
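
The ratios can be checked from the figures in the build comparison with plain SQL arithmetic. This is only a worked calculation on the reported numbers, not an additional measurement.

```sql
SELECT round(7820.0 / 248, 2)                  AS size_ratio,            -- 31.53
       round(100 * (95.32 - 78.72) / 95.32, 1) AS build_time_saved_pct;  -- 17.4
```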

Query performance comparison

Compared with native ivfflat, ivf-RabitQ delivers up to 2.94x the QPS, with recall never more than 0.6 percentage points below native ivfflat:

| Parameter | ivfflat QPS | ivf-RabitQ QPS | Speedup ratio | ivfflat recall | ivf-RabitQ recall |
|---|---|---|---|---|---|
| nprobes = 1 | 566.36 | 1033.56 | 1.83x | 66.87% | 69.38% |
| nprobes = 2 | 350.77 | 758.88 | 2.16x | 79.40% | 80.09% |
| nprobes = 4 | 203.78 | 501.56 | 2.47x | 87.84% | 87.96% |
| nprobes = 8 | 110.98 | 298.43 | 2.69x | 92.69% | 92.80% |
| nprobes = 16 | 56.99 | 162.75 | 2.86x | 95.71% | 95.28% |
| nprobes = 32 | 28.56 | 53.98 | 1.89x | 97.37% | 96.84% |
| nprobes = 50 | 18.32 | 53.98 | 2.94x | 98.05% | 97.71% |
| nprobes = 100 | 9.26 | 26.74 | 2.89x | 98.93% | 98.54% |

QPS-recall curve comparing the following index types:

  • pgvector: Native HNSW index (community pgvector).

  • pgvector_ivfflat: Native ivfflat index (community pgvector).

  • pgvector_ivfrabitq: ivf-RabitQ index.

  • pgvector_hnsw_rabitq: HNSW-RabitQ index with reranking.

  • pgvector_hnsw_rabitq_without_refine: HNSW-RabitQ index without reranking, using quantized distances directly for higher throughput.

[Figure: QPS-recall curves for the index types listed above]