RDS for MySQL provides deeply integrated, enterprise-grade vector data processing. It natively supports the storage and computation of vector data with up to 16,383 dimensions, integrates mainstream vector functions, and uses a highly optimized Hierarchical Navigable Small World (HNSW) algorithm to provide efficient nearest neighbor search capabilities. You can create an index on a full-dimension vector column.
Features
RDS for MySQL natively supports vector data processing, including vector storage, similarity calculation, and high-performance index creation. It provides an out-of-the-box vector solution for scenarios such as large-scale semantic search, intelligent recommendations, and multimodal analysis. Use standard SQL interfaces to seamlessly integrate high-precision vector matching with complex business logic. This integration lets you quickly build and deploy innovative AI applications on a low-cost, highly compatible architecture.
- Efficient storage, access, and computation of high-dimensional vectors: Introduces the VECTOR data type, which stores floating-point vector data with up to 16,383 dimensions. Standard SQL interfaces let you write, update, and manage vector data directly, including in batches. The following table describes the supported vector processing functions. Functions listed in the same row share one description.

| Function name | Description |
| --- | --- |
| VECTOR_DIM | Returns the number of dimensions in a vector. |
| VEC_FROMTEXT, TO_VECTOR, STRING_TO_VECTOR | Converts a string to a vector. |
| VEC_TOTEXT, FROM_VECTOR, VECTOR_TO_STRING | Converts a vector to a string. |
| VEC_DISTANCE, VEC_DISTANCE_EUCLIDEAN, VEC_DISTANCE_COSINE | Calculates the distance between two vectors. If one of the operands is an indexed column, the function automatically detects the distance type of the index. |

- High-performance vector index: The vector index uses a deeply optimized HNSW algorithm. Techniques such as SIMD hardware acceleration, Bloom filter search pruning, and LIMIT condition pushdown improve retrieval efficiency for large-scale vector data. The feature also supports hybrid storage and joint queries of vector data and scalar data.
- Open-source ecosystem and out-of-the-box usability: The feature is fully compatible with the MySQL protocol and supports JDBC/ORM tools and mainstream development frameworks. It integrates with Alibaba Cloud services such as DTS and DMS, providing full-lifecycle capabilities that include data synchronization, management, backup, and recovery. You can upgrade existing instances with a single click without creating a new cluster.
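As an illustration of the functions in the table above, the following sketch applies them to literal values. It assumes that vector support is already enabled on the instance. The argument lists are assumptions inferred from the descriptions: one vector for VECTOR_DIM and VEC_TOTEXT, two vectors for VEC_DISTANCE_EUCLIDEAN and VEC_DISTANCE_COSINE. The explicit-metric functions are used here because no indexed column is involved, so there is no index from which VEC_DISTANCE could detect a distance type.

```sql
-- Number of dimensions of a vector built from its text form
SELECT VECTOR_DIM(VEC_FROMTEXT('[0.1, 0.2, 0.3]')) AS dims;

-- Round trip: text -> vector -> text
SELECT VEC_TOTEXT(VEC_FROMTEXT('[0.1, 0.2, 0.3]')) AS as_text;

-- Explicit distance metrics between two literal vectors
-- (assumed two-argument form, mirroring VEC_DISTANCE)
SELECT
    VEC_DISTANCE_EUCLIDEAN(VEC_FROMTEXT('[1, 0]'), VEC_FROMTEXT('[0, 1]')) AS euclidean_distance,
    VEC_DISTANCE_COSINE(VEC_FROMTEXT('[1, 0]'), VEC_FROMTEXT('[0, 1]')) AS cosine_distance;
```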
Applicability
- Database version: MySQL 8.0 (minor engine version 20251031 or later). If your instance does not meet the version requirements, you can upgrade the minor engine version or upgrade the major engine version.
- This feature has the following limitations:
  - You can create a vector index only on a table that uses the InnoDB engine.
  - The primary key of the table cannot exceed 256 bytes in length.
  - You cannot use the inplace syntax to create, modify, or delete a vector index.
  - A vector index cannot be set to INVISIBLE.
  - You cannot use the Recycle Bin feature on tables that contain a vector index.
  - Data modification and queries on a vector index support only the Read Committed (RC) isolation level.
  - Due to the randomness of the HNSW algorithm, such as random level assignment and heuristic algorithms, the graph structure of the vector index on the primary/standby instances is not guaranteed to be identical.
  - If a stored procedure or function in the source database uses the vector type, synchronization or migration to a destination database that does not support vectors will fail.
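Because vector index reads and writes support only the RC isolation level, it can help to confirm the level of the current session before working with a vector index. The following sketch uses standard MySQL 8.0 statements only. Whether your instance already defaults to RC depends on its parameter settings, which this example does not assume.

```sql
-- Check the isolation level of the current session
SELECT @@transaction_isolation;

-- Switch the current session to Read Committed if it reports another level
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
```

The SESSION scope affects only the current connection. Applications that use a connection pool would need to apply the setting on each connection or through the instance-level parameter.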
Parameter management
Parameters
| Parameter | Description |
| --- | --- |
| | The default vector distance type. |
| | The default M value for an HNSW index (the maximum number of outgoing edges for each node in the graph). |
| | The default ef_search value for an HNSW index query (the search scope). |
| | The maximum memory that the HNSW index cache can use, in bytes. |
Modify parameters
- Go to the Instances page. In the top navigation bar, select the region in which the RDS instance resides. Then, find the RDS instance and click the ID of the instance.
- In the left-side navigation pane, click Parameter Settings.
- On the Editable Parameters tab, search for the parameter that you want to modify and set its value.
- Click OK, and then click Submit Parameters. In the dialog box that appears, select when you want the changes to take effect.
All vector-related parameters are dynamic. Modifications take effect immediately without restarting the instance.
Enable and use vector storage
You can enable or disable the vector feature without restarting the instance.
Step 1: Enable vector support
- Go to the RDS console, select the target region, and then click the instance ID.
- On the Basic Information page, in the Running Status section, find Vector Storage and click Enable.
- When the status changes to Enabled, the feature takes effect immediately.
Step 2: Create a table and a vector index
-- Create a table with a 5-dimension vector column and an HNSW index
CREATE TABLE product_embeddings (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    product_name VARCHAR(255),
    embedding VECTOR(5) NOT NULL,
    -- Create a vector index, and specify M (graph connectivity) and the distance metric
    VECTOR INDEX idx_embedding(embedding) M=16 DISTANCE=COSINE
);
Step 3: Insert data
-- Use the VEC_FROMTEXT function to insert vector data
INSERT INTO product_embeddings (product_name, embedding) VALUES
    ('product_A', VEC_FROMTEXT('[0.1, 0.2, 0.3, 0.4, 0.5]')),
    ('product_B', VEC_FROMTEXT('[0.6, 0.7, 0.8, 0.9, 1.0]')),
    ('product_C', VEC_FROMTEXT('[0.11, 0.22, 0.33, 0.44, 0.55]'));
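To confirm that the rows were stored as expected, you can read the vectors back in text form. This is a minimal sketch that assumes VECTOR_DIM and VEC_TOTEXT each accept a single vector column, as their descriptions suggest. The column aliases are illustrative.

```sql
-- Inspect the stored vectors: dimension count and text representation
SELECT
    id,
    product_name,
    VECTOR_DIM(embedding) AS dims,
    VEC_TOTEXT(embedding) AS embedding_text
FROM
    product_embeddings
ORDER BY
    id;
```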
Step 4: Perform a vector similarity search
-- Find the two products that are most similar to the given vector '[0.1, 0.2, 0.3, 0.4, 0.51]'
-- VEC_DISTANCE detects the distance type (cosine) from the index on the embedding column
SELECT
    id,
    product_name,
    VEC_DISTANCE(embedding, VEC_FROMTEXT('[0.1, 0.2, 0.3, 0.4, 0.51]')) AS distance
FROM
    product_embeddings
ORDER BY
    distance ASC -- The smaller the cosine distance, the more similar the vectors are
LIMIT 2;
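The feature description also mentions joint queries over vector and scalar data. The following sketch combines the nearest neighbor search with an ordinary WHERE condition on product_name, and then compares the two explicit metrics on the sample rows. The two-argument form of VEC_DISTANCE_COSINE and VEC_DISTANCE_EUCLIDEAN is an assumption modeled on VEC_DISTANCE, and whether the optimizer uses the vector index for a filtered query is not covered by this topic.

```sql
-- Nearest neighbors restricted by a scalar condition
SELECT
    id,
    product_name,
    VEC_DISTANCE(embedding, VEC_FROMTEXT('[0.1, 0.2, 0.3, 0.4, 0.51]')) AS distance
FROM
    product_embeddings
WHERE
    product_name <> 'product_A' -- any ordinary scalar predicate
ORDER BY
    distance ASC
LIMIT 2;

-- Compare both metrics for every row (assumed two-argument form)
SELECT
    product_name,
    VEC_DISTANCE_COSINE(embedding, VEC_FROMTEXT('[0.1, 0.2, 0.3, 0.4, 0.51]')) AS cosine_distance,
    VEC_DISTANCE_EUCLIDEAN(embedding, VEC_FROMTEXT('[0.1, 0.2, 0.3, 0.4, 0.51]')) AS euclidean_distance
FROM
    product_embeddings;
```

In the sample data, the product_C vector is the product_A vector scaled by 1.1, so both point in the same direction. Cosine distance ignores magnitude and should therefore rate the two rows as almost equally close to the query vector, whereas Euclidean distance separates them. This is one reason to choose the index DISTANCE option according to how the embeddings were produced.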