pgvector user guide

ApsaraDB RDS for PostgreSQL supports the pgvector extension, which provides a new data type for storing vectors and enables efficient similarity searches on high-dimensional vectors.

Background

ApsaraDB RDS for PostgreSQL supports the pgvector extension to store vector data and perform vector similarity searches, providing a data foundation for AI-powered applications.

The pgvector extension provides the following key features:

  • Provides the vector data type to store and query vector data.

  • Supports exact and approximate nearest neighbor (ANN) searches. You can measure similarity by using Euclidean distance (L2), cosine distance, or inner product. To accelerate queries, you can create an HNSW index or an IVFFlat index. The extension also supports element-wise multiplication of vectors, the L1 distance function, and sum aggregation.

  • Supports vectors with up to 16,000 dimensions and allows you to create indexes for vectors with up to 2,000 dimensions.

Key concepts and how it works

Embedding

An embedding is the process of mapping high-dimensional data to a low-dimensional representation. In machine learning and natural language processing (NLP), embeddings are often used to represent discrete symbols or objects as points in a continuous vector space.

This process reflects the semantic and syntactic relationships between words in the vector space.

Note

For more information, see the official documentation for common embedding tools and libraries.

How it works

  1. Embedding converts information such as text, images, and audio into vector data by representing their features across multiple dimensions.

  2. The pgvector extension provides the vector data type to store vector data in ApsaraDB RDS for PostgreSQL.

  3. pgvector can perform exact and approximate nearest neighbor searches on the stored vector data.

For example, to store three objects (apple, banana, cat) in a database and calculate their similarity by using pgvector, follow these steps:

  1. Use an embedding model to convert the objects into vectors. For a two-dimensional embedding, the results might look like this:

    Apple: embedding[1,1]
    Banana: embedding[1.2,0.8]
    Cat: embedding[6,0.4]
  2. Store the resulting vectors in the database. For details about how to store vector data, see the Examples section.

    In a two-dimensional plane, the distribution of the objects is as follows:

    [Figure: Apple, Banana, and Cat plotted as points on a two-dimensional plane]

Because apples and bananas are both fruits, their vectors are closer together in the 2D coordinate system. The cat, being a different type of object, is farther away.
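
The geometric intuition above can be checked with a few lines of standalone Python. This is only an illustrative sketch that runs outside the database: it uses the three example vectors from this section and the standard-library function math.dist, and the helper name euclidean_distance is made up for this example.

```python
import math

# Two-dimensional example embeddings from this section.
embeddings = {
    "Apple": (1.0, 1.0),
    "Banana": (1.2, 0.8),
    "Cat": (6.0, 0.4),
}

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two points."""
    return math.dist(a, b)

# Print the distance for every pair of objects.
names = list(embeddings)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        d = euclidean_distance(embeddings[first], embeddings[second])
        print(f"{first} - {second}: {d:.3f}")
```

Apple and Banana end up roughly 0.28 apart, while Cat is about 5 units away from either fruit, which matches the layout described above.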

You can further refine the attributes of an object, such as color, origin, and taste for a fruit. Each attribute adds a dimension. More dimensions allow for more detailed classification, which can lead to more precise search results.

Use cases

  • Store vector data.

  • Perform vector similarity searches.

Prerequisites

Your ApsaraDB RDS for PostgreSQL instance must meet the following requirements:

  • The instance runs PostgreSQL 14 or a later version.

  • The minor engine version of the instance is 20230430 or later. For instances that run PostgreSQL 17, the minor engine version must be 20241030 or later.

    Note

    To upgrade the major engine version or update the minor engine version, see Upgrade the major engine version or Update the minor engine version.

  • You have a privileged account for your ApsaraDB RDS for PostgreSQL instance. For more information, see Create an account.

Extension management

RDS console

  • Install the extension

    1. Log on to the ApsaraDB RDS console and go to the Instances page. In the top navigation bar, select the region where your instance is located, and then click the instance ID.

    2. In the left-side navigation pane, click Plug-ins.

    3. On the Extension Marketplace tab, find the vector extension and click Install.

      You can also search for the vector extension on the Extension Management page, and click Install in the Actions column.

    4. In the dialog box that appears, select the target database and privileged account, and then click Install.

      The extension is successfully installed when the instance status changes from Maintaining Instance to Running.

  • Update or uninstall the extension

    • On the Extension Management page, click the Installed Extensions tab. Find the target extension and click Upgrade Version in the Actions column to upgrade the extension to the latest version.

      Note

      If the Upgrade Version button is not displayed in the Actions column, the extension is already on the latest version.

    • On the Extension Management page, click the Installed Extensions tab. Find the target extension and click Uninstall in the Actions column.

SQL commands

Important

Only a privileged account can run the following commands. For more information about how to create a privileged account, see Create an account.

  • Create the extension

    CREATE EXTENSION IF NOT EXISTS vector;
  • Drop the extension

    DROP EXTENSION vector;
  • Update the extension

    ALTER EXTENSION vector UPDATE [ TO new_version ];
    Note

    new_version specifies the version of pgvector. For information about the latest version and its features, see the official pgvector documentation.

Examples

The following example shows how to use the pgvector extension. For more advanced usage, see the official pgvector documentation.

  1. Use an account with table creation permissions to create a table named items to store embeddings.

    CREATE TABLE items (
      id bigserial PRIMARY KEY, 
      item text, 
      embedding vector(2)
    );
    Note

    In this example, a 2-dimensional vector is used. pgvector supports vectors with up to 16,000 dimensions.

  2. Insert vector data into the table.

    INSERT INTO
      items (item, embedding)
    VALUES
      ('Apple', '[1, 1]'),
      ('Banana', '[1.2, 0.8]'),
      ('Cat', '[6, 0.4]');
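
When the vectors come from application code rather than hand-written SQL, they have to be turned into the text form used in the INSERT statement above, such as '[1.2, 0.8]'. The following Python sketch shows one possible way to do that. The helper name to_vector_literal and the expected_dim argument are assumptions made for this example, not part of pgvector. The 16,000 limit is the one stated in this guide.

```python
import math

MAX_DIMENSIONS = 16000  # upper limit for the vector type stated in this guide

def to_vector_literal(values, expected_dim=None):
    """Format numbers as a vector text literal such as '[1.2,0.8]' (hypothetical helper)."""
    values = [float(v) for v in values]
    if not values:
        raise ValueError("a vector needs at least one dimension")
    if len(values) > MAX_DIMENSIONS:
        raise ValueError(f"{len(values)} dimensions exceeds the limit of {MAX_DIMENSIONS}")
    if expected_dim is not None and len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(values)}")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("vector elements must be finite numbers")
    return "[" + ",".join(repr(v) for v in values) + "]"

# The rows from the example, prepared for a table with an embedding vector(2) column.
rows = [("Apple", [1, 1]), ("Banana", [1.2, 0.8]), ("Cat", [6, 0.4])]
params = [(item, to_vector_literal(vec, expected_dim=2)) for item, vec in rows]
print(params)
```

Each (item, literal) pair can then be passed as bound parameters of an INSERT statement by whichever PostgreSQL driver your application uses. Checking the dimension count in the application gives a clearer error message than a failed insert.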
  3. Use the cosine distance operator <=> to calculate the similarity of banana to apple and cat.

    SELECT
      item,
      embedding <=> '[1.2, 0.8]' AS cosine_distance
    FROM
      items
    ORDER BY
      cosine_distance;
    Note
    • In the example above, the <=> operator is used to calculate the cosine distance. The smaller the distance, the higher the similarity.

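    • You can also use the Euclidean distance operator <-> or the inner product operator <#> to calculate similarity. The <#> operator returns the negative inner product, so a smaller value still indicates higher similarity.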

    Sample result:

     item   |  cosine_distance
    --------+----------------------
     Banana |                    0
     Apple  | 0.019419362524530137
     Cat    | 0.13289443670962842

    In the preceding result:

    • The result for Banana is 0, which indicates a perfect match (zero distance).

    • The result for Apple is 0.019, which indicates that Apple is very similar to Banana.

    • The result for Cat is 0.133, which indicates that Cat is not very similar to Banana.

    Note

    In a real-world application, you can set a similarity threshold to filter out results with low similarity.

  4. To improve the performance of similarity searches, create an index on your vector data. The following examples show how to create an index for the embedding column.

    HNSW index

    CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

    Parameters:

    • m: The maximum number of connections for each node on each layer of the HNSW graph. A larger value creates a denser graph, which typically improves the recall rate but increases indexing and query time.

    • ef_construction: The size of the dynamic candidate list during index construction. This parameter defines how many candidate nodes are kept for selecting the optimal connections. A larger value can improve the recall rate but increases indexing time.

    IVFFlat index

    CREATE INDEX ON items USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

    Parameters:

    • items: The table that contains the column to be indexed.

    • embedding: The vector column to be indexed.

    • vector_cosine_ops: The operator class specified for the vector index.

      • For cosine distance, use vector_cosine_ops.

      • For Euclidean distance, use vector_l2_ops.

      • For inner product, use vector_ip_ops.

    • lists = 100: The number of lists that the dataset is partitioned into. A larger value divides the dataset into more, smaller lists, which makes index queries faster. However, the query recall rate might decrease as the value of lists increases.

    Note
    • Recall rate is a metric in information retrieval and classification tasks. It is the ratio of correctly retrieved or classified samples to the total number of relevant samples. Recall rate measures the ability of a system to find all relevant samples.

    • Building an index requires a large amount of memory. If the value of the lists parameter exceeds 2000, the following error occurs: ERROR: memory required is xxx MB, maintenance_work_mem is xxx MB. To build the index, set maintenance_work_mem to a larger value. However, an excessively large value puts the instance at high risk of out-of-memory (OOM) errors. For more information, see Set instance parameters.

    • You must adjust the lists parameter to balance query speed and recall rate to meet the requirements of your application.
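
The recall rate described in the note above can be computed by comparing the result of an index-based query with the result of an exact query for the same input. The following Python sketch is a minimal illustration. The function name recall_at_k and the row IDs are made up for this example.

```python
def recall_at_k(approximate_ids, exact_ids):
    """Fraction of the true nearest neighbors that the approximate search returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approximate_ids)) / len(exact)

# Made-up row IDs: the exact top 5 and the top 5 returned through an index.
exact_top5 = [11, 42, 7, 93, 58]
approximate_top5 = [11, 42, 7, 15, 58]

print(recall_at_k(approximate_top5, exact_top5))  # 4 of the 5 true neighbors were found
```

Averaging this value over a representative set of queries gives one number to track while you tune the index parameters.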

    You can use one of the following methods to set the ivfflat.probes parameter. This parameter specifies the number of lists to search within the index. By increasing the ivfflat.probes value, you search more lists, which can improve the recall rate of your query results.

    • Session level

      SET ivfflat.probes = 10;
    • Transaction level

      BEGIN;
      SET LOCAL ivfflat.probes = 10;
      SELECT ...;
      COMMIT;

    A larger value for ivfflat.probes results in a higher query recall rate but slower query speed. Depending on your application needs and dataset characteristics, adjust the lists and ivfflat.probes values to achieve the best balance between query performance and recall rate.

    Note

    If the value of ivfflat.probes is equal to the value of lists that is specified during index creation, the query ignores the vector index and performs a full table scan.
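
The trade-off between lists and ivfflat.probes can be illustrated with a toy model in standalone Python. This sketch is not pgvector's implementation. It only mimics the idea described above: partition the vectors into lists around centroids, then search only the lists whose centroids are closest to the query. All function names, the random data, and the parameter values are assumptions made for this example.

```python
import math
import random

def nearest(query, points, k):
    """Indices of the k points closest to query (exact, brute force)."""
    return sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))[:k]

def assign(vectors, centroids):
    """Put each vector index into the list of its closest centroid."""
    lists = [[] for _ in centroids]
    for idx, vec in enumerate(vectors):
        lists[nearest(vec, centroids, 1)[0]].append(idx)
    return lists

def build_lists(vectors, n_lists, seed=0, iterations=5):
    """Partition vectors into n_lists lists using a few k-means style rounds."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, n_lists)
    for _ in range(iterations):
        for c, members in enumerate(assign(vectors, centroids)):
            if members:  # keep the old centroid if a list is empty
                columns = zip(*(vectors[i] for i in members))
                centroids[c] = tuple(sum(col) / len(members) for col in columns)
    return centroids, assign(vectors, centroids)

def search(query, vectors, centroids, lists, probes, k):
    """Scan only the `probes` lists whose centroids are closest to the query."""
    candidates = [i for c in nearest(query, centroids, probes) for i in lists[c]]
    return sorted(candidates, key=lambda i: math.dist(query, vectors[i]))[:k]

# Random two-dimensional test data (illustrative only).
data_rng = random.Random(42)
vectors = [(data_rng.random(), data_rng.random()) for _ in range(1000)]
query, k, n_lists = (0.5, 0.5), 10, 10

centroids, lists = build_lists(vectors, n_lists)
exact = set(nearest(query, vectors, k))

recalls = {}
for probes in (1, 3, n_lists):
    found = set(search(query, vectors, centroids, lists, probes, k))
    recalls[probes] = len(found & exact) / k
    print(f"probes={probes:2d} recall={recalls[probes]:.2f}")
```

In this model, recall never decreases as probes grows, because each additional probe only adds candidates. When probes equals the number of lists, every vector is scanned and the result is the same as a brute-force search, which is consistent with the note above that such a query performs a full table scan.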

Performance data

When you create an index for vector data, you must balance query speed and recall rate based on your data volume and application scenario. For performance test results, see the related performance test topics.

Best practices