PGVector (vector retrieval)

PGVector is a PolarDB for PostgreSQL extension for storing and querying high-dimensional vector embeddings. It supports Hierarchical Navigable Small World (HNSW) and Inverted File Flat (IVFFlat) indexing for approximate nearest neighbor (ANN) search, with vectors up to 16,000 dimensions. Use PGVector to build semantic search, recommendation systems, and other AI applications directly on your PolarDB cluster.

Prerequisites

Before you begin, ensure that your cluster runs one of the following engine versions:

  • PostgreSQL 16, revision version 2.0.16.3.1.1 or later

  • PostgreSQL 15, revision version 2.0.15.12.4.0 or later

  • PostgreSQL 14, revision version 2.0.14.7.9.0 or later

  • PostgreSQL 11, revision version 2.0.11.9.35.0 or later

To check your cluster's revision version, run SHOW polardb_version; or check the PolarDB console. If the version does not meet the requirements, update the revision version.
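Revision versions have several numeric fields, so they need to be compared field by field rather than as plain strings. The following is a minimal sketch of that comparison. It assumes you have already extracted a dotted revision string in the same form as the list above; the helper names are hypothetical, and the exact output format of SHOW polardb_version may differ.

```python
# Minimum revision versions copied from the prerequisites list,
# keyed by PostgreSQL major version.
MIN_REVISION = {
    16: "2.0.16.3.1.1",
    15: "2.0.15.12.4.0",
    14: "2.0.14.7.9.0",
    11: "2.0.11.9.35.0",
}


def parse_revision(text):
    """Turn '2.0.14.7.9.0' into (2, 0, 14, 7, 9, 0) for numeric comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(pg_major, revision):
    """Return True if the revision meets the listed minimum for that major version."""
    minimum = MIN_REVISION.get(pg_major)
    if minimum is None:
        return False  # major version is not in the prerequisites list
    return parse_revision(revision) >= parse_revision(minimum)


print(meets_minimum(14, "2.0.14.7.9.0"))  # True: exactly the minimum
print(meets_minimum(15, "2.0.15.9.0.0"))  # False: 9 < 12 in the fourth field
```

Note that a string comparison would get the second case wrong, because "9" sorts after "12" alphabetically.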

Different engine versions support different extension versions. For the full compatibility matrix, see Extensions.

Limitations

  • Cross-node parallel execution supports sequential scans with an ORDER BY clause. It does not support index scans.

  • PGVector supports vectors with up to 16,000 dimensions.

How it works

PGVector provides two indexing algorithms for ANN search:

HNSW builds a multi-layer graph structure. Queries traverse the graph from the top layer down, progressively narrowing the search. HNSW delivers higher recall and faster query performance but requires more memory and takes longer to build.

IVFFlat is a simplified version of the Inverted File System with Asymmetric Distance Computation (IVFADC) algorithm. It uses k-means clustering to partition vectors into groups (inverted lists), each represented by a centroid. Indexing and search proceed in three steps:

  1. At build time, assign vectors to clusters using k-means. Each cluster has a centroid.

  2. At query time, find the centroids nearest to the query vector. The probes parameter sets how many centroids are selected.

  3. Search all vectors in those clusters and return the top-k nearest results.

IVFFlat builds indexes faster and uses less memory than HNSW but delivers lower recall. It has low storage usage and is well-suited for datasets where query latency up to 100 milliseconds is acceptable.
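The three IVFFlat steps can be illustrated with a small pure-Python sketch. This is a toy model for intuition only, not the extension's implementation: the function names are hypothetical, the k-means start is a naive deterministic one (the first vectors become the initial centroids), and distances are Euclidean.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(centroids, v):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda i: l2(centroids[i], v))


def assign(vectors, centroids, lists):
    """Put every vector into the inverted list of its nearest centroid."""
    clusters = [[] for _ in range(lists)]
    for v in vectors:
        clusters[nearest(centroids, v)].append(v)
    return clusters


def build_ivfflat(vectors, lists, iterations=10):
    """Step 1: cluster the vectors with a few rounds of k-means."""
    centroids = [list(v) for v in vectors[:lists]]  # naive initial centroids
    for _ in range(iterations):
        clusters = assign(vectors, centroids, lists)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(vectors, centroids, lists)


def search(centroids, clusters, query, probes, k):
    # Step 2: select the `probes` centroids nearest to the query.
    order = sorted(range(len(centroids)), key=lambda i: l2(centroids[i], query))
    candidates = [v for i in order[:probes] for v in clusters[i]]
    # Step 3: exact search inside the selected lists only.
    return sorted(candidates, key=lambda v: l2(v, query))[:k]


vectors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, clusters = build_ivfflat(vectors, lists=2)
print(search(centroids, clusters, [9, 10], probes=1, k=2))  # [[10, 10], [10, 11]]
```

With probes=1, only one of the two lists is scanned. A query that falls near a cluster boundary can miss neighbors in the list that was skipped, which is where the recall loss comes from.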

Get started

The following steps walk through a complete example: enabling PGVector, creating a vector table and index, and querying similar vectors.

  1. Enable the PGVector extension.

    CREATE EXTENSION vector;

    If your cluster runs PostgreSQL 17 and this statement returns ERROR: must be superuser, contact support for troubleshooting.
  2. Create a table with a vector column. The number in vector(n) sets the number of dimensions.

    CREATE TABLE t (val vector(3));
  3. Insert vector data. Load your data before creating indexes to ensure accurate clustering.

    INSERT INTO t (val) VALUES ('[0,0,0]'), ('[1,2,3]'), ('[1,1,1]'), (NULL);
  4. Create a vector index.

    CREATE INDEX ON t USING ivfflat (val vector_ip_ops) WITH (lists = 1);
  5. Query the nearest vectors. The <#> operator returns the negative inner product, so the ORDER BY clause places the vector with the largest inner product with [3,3,3] first.

    SELECT * FROM t ORDER BY val <#> '[3,3,3]';

    Expected output:

     val
    ---------
     [1,2,3]
     [1,1,1]
     [0,0,0]
    (3 rows)

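    vector_ip_ops calculates distances using inner product. WITH (lists = 1) places all vectors in a single cluster, which is appropriate for small datasets. For production datasets, set lists based on data volume (see Tune IVFFlat parameters). The expected output assumes that the planner uses the index: NULL vectors are not indexed, so the NULL row is omitted. With a sequential scan, the query also returns the NULL row, sorted last.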
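To see why the rows come back in this order, the ranking can be reproduced by hand. The sketch below recomputes the negative inner product for the three non-NULL rows in plain Python; it only mirrors the arithmetic and does not query the database.

```python
def negative_inner_product(a, b):
    """Same quantity that the <#> operator returns."""
    return -sum(x * y for x, y in zip(a, b))


rows = [[0, 0, 0], [1, 2, 3], [1, 1, 1]]
query = [3, 3, 3]

# ORDER BY val <#> '[3,3,3]' sorts ascending by negative inner product.
ranked = sorted(rows, key=lambda v: negative_inner_product(v, query))
for v in ranked:
    print(v, negative_inner_product(v, query))
# [1, 2, 3] -18
# [1, 1, 1] -9
# [0, 0, 0] 0
```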

Choose a search strategy

Sequential scan vs index scan

Use a sequential scan (no index) when:

  • Your dataset is small and you do not plan to scale it.

  • You need 100% recall. Indexes trade recall for performance.

  • Query volume is low and index scan performance gains are not needed.

For all other cases, create an index for better query performance.

HNSW vs IVFFlat

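Factor            | HNSW                                       | IVFFlat
------------------+--------------------------------------------+-------------------------------------------------
Recall            | Higher                                     | Moderate
Query performance | Better                                     | Fast
Index build time  | Slower                                     | Faster
Memory usage      | Higher                                     | Lower
Storage usage     | Higher                                     | Lower
Best for          | Production workloads requiring high recall | Large datasets with moderate recall requirements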

Use HNSW when recall and query performance are the priority. Use IVFFlat when index build time and memory footprint matter more.

Distance operators

All vector search queries use a distance operator in the ORDER BY clause:

Operator | Distance metric         | Operator class
---------+-------------------------+-------------------
<->      | Euclidean (L2) distance | vector_l2_ops
<=>      | Cosine distance         | vector_cosine_ops
<#>      | Negative inner product  | vector_ip_ops

<#> returns the negative inner product because PostgreSQL index scans only support ascending order. To get the actual inner product value, multiply by -1:

SELECT (embedding <#> '[3,1,2]') * -1 AS inner_product FROM items;
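For reference, the three metrics can be written out in a few lines of Python. This is a sketch of the underlying formulas only; the function names are illustrative, and the cosine version assumes that neither vector is all zeros.

```python
import math


def l2_distance(a, b):
    """Metric behind <-> : straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """Metric behind <=> : 1 minus cosine similarity (non-zero vectors assumed)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


def negative_inner_product(a, b):
    """Metric behind <#> : inner product with the sign flipped."""
    return -sum(x * y for x, y in zip(a, b))


a, b = [1, 2, 3], [3, 1, 2]
print(round(l2_distance(a, b), 4))
print(round(cosine_distance(a, b), 4))
print(negative_inner_product(a, b) * -1)  # 11, the actual inner product
```

For vectors normalized to unit length, all three metrics produce the same ranking, so the choice mainly matters when vector magnitudes differ.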

Each index requires an operator class that matches the distance metric. Specify the operator class when creating the index:

-- Euclidean distance
CREATE INDEX ON vecs USING hnsw(embedding vector_l2_ops);

-- Cosine distance
CREATE INDEX ON vecs USING hnsw(embedding vector_cosine_ops);

-- Inner product
CREATE INDEX ON vecs USING hnsw(embedding vector_ip_ops);

Tune index parameters

Tune HNSW parameters

HNSW has two build-time parameters and one query-time parameter:

Parameter       | Default | Range  | Effect
----------------+---------+--------+-------
m               | 16      | 2–100  | Max connections per layer. Higher values improve recall but increase index build time and memory usage. Start with 12–48 for most workloads.
ef_construction | 64      | 4–1000 | Candidate list size during index build. Higher values improve recall but slow down index building. Must be at least twice m.
hnsw.ef_search  | 40      | 1–1000 | Candidate list size during queries. Set this to at least the number of results you want (LIMIT value). Higher values improve recall at the cost of query speed.

-- Create an HNSW index with custom parameters
CREATE TABLE vecs (id int PRIMARY KEY, embedding vector(1536));
CREATE INDEX ON vecs USING hnsw(embedding vector_l2_ops) WITH (m=16, ef_construction=64);

-- Increase ef_search at query time to improve recall
SET hnsw.ef_search = 100;
SELECT * FROM vecs ORDER BY embedding <-> '[...]' LIMIT 10;
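The constraints in the parameter table can be checked before you run the DDL. The helper below is a hypothetical pre-flight check, not part of the extension; it encodes only the rules stated above (the range of m, ef_construction at least twice m, and hnsw.ef_search at least the LIMIT value).

```python
def check_hnsw_settings(m, ef_construction, ef_search, limit):
    """Return a list of problems found; an empty list means the settings look consistent."""
    problems = []
    if not 2 <= m <= 100:
        problems.append("m must be between 2 and 100")
    if ef_construction < 2 * m:
        problems.append("ef_construction must be at least 2 * m")
    if ef_search < limit:
        problems.append("hnsw.ef_search should be at least the LIMIT value")
    return problems


print(check_hnsw_settings(m=16, ef_construction=64, ef_search=100, limit=10))  # []
print(check_hnsw_settings(m=48, ef_construction=64, ef_search=5, limit=10))
```

The second call reports two problems: ef_construction is below 2 * 48 = 96, and ef_search is smaller than the requested 10 results.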

To tune HNSW indexes:

  1. Start with the default values (m=16, ef_construction=64).

  2. If recall is below target, increase ef_construction first.

  3. Then adjust m upward. Values in the range 12–48 cover most use cases.

  4. At query time, increase hnsw.ef_search to improve recall without rebuilding the index.
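Tuning against a recall target requires a way to measure recall. One common approach, sketched here, is to run the same query twice, once through the index and once as an exact sequential scan, and compare the returned primary keys. The ID lists in the example are made-up placeholders, and the function name is illustrative.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k results that the index query also returned."""
    exact = set(exact_ids)
    if not exact:
        return 1.0  # nothing to find, so nothing was missed
    return len(exact & set(approx_ids)) / len(exact)


# Hypothetical ids returned for one query with LIMIT 5.
exact_ids = [1, 2, 3, 4, 5]    # from a sequential scan (ground truth)
approx_ids = [1, 2, 3, 5, 9]   # from the HNSW index scan

print(recall_at_k(approx_ids, exact_ids))  # 0.8
```

Averaging this value over a representative sample of queries gives a more stable number than a single query does.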

Tune IVFFlat parameters

IVFFlat has one build-time parameter (lists) and one query-time parameter (ivfflat.probes):

Parameter      | Effect
---------------+-------
lists          | Number of inverted lists (clusters). More lists improve query performance but reduce recall if probes is not increased proportionally.
ivfflat.probes | Number of lists searched at query time. Higher values improve recall but slow down queries.

Set lists based on your table size:

  • Up to 1 million rows: lists = rows / 1000

  • Over 1 million rows: lists = sqrt(rows)

Set ivfflat.probes at query time:

  • Start with probes = sqrt(lists) as a baseline.

  • Increase probes to improve recall. Setting probes = lists searches every list, which gives exact results at roughly the cost of a full sequential scan.
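The sizing rules above reduce to a short calculation. The following sketch applies them as written; the function name is hypothetical, and the results are starting points to refine against measured recall and latency.

```python
import math


def suggest_ivfflat_params(rows):
    """Return (lists, probes) starting values for a table with `rows` rows."""
    if rows <= 1_000_000:
        lists = max(1, rows // 1000)       # up to 1 million rows: rows / 1000
    else:
        lists = round(math.sqrt(rows))     # over 1 million rows: sqrt(rows)
    probes = max(1, round(math.sqrt(lists)))  # baseline: sqrt(lists)
    return lists, probes


print(suggest_ivfflat_params(500_000))    # (500, 22)
print(suggest_ivfflat_params(4_000_000))  # (2000, 45)
```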

-- Create an IVFFlat index for a 500,000-row table (lists = 500)
CREATE INDEX ON vecs USING ivfflat(embedding vector_l2_ops) WITH (lists = 500);

-- Set probes at query time
SET ivfflat.probes = 22; -- sqrt(500) ≈ 22.4, rounded to 22
SELECT * FROM vecs ORDER BY embedding <-> '[...]' LIMIT 10;

Important: Build the index after loading your data. IVFFlat uses the existing data to determine cluster centroids. An index built on an empty or sparse table produces inaccurate clusters, which significantly lowers recall.

For more details on index parameters, see the pgvector README.

What's next