ApsaraDB RDS for MySQL natively stores and queries floating-point vectors with up to 16,383 dimensions, using HNSW indexes for high-performance approximate nearest neighbor search. Query vector data with standard SQL — no separate vector database required.
Prerequisites
Before you begin, ensure that you have:
An ApsaraDB RDS for MySQL instance running MySQL 8.0
Minor engine version 20251031 or later. If your instance does not meet this requirement, upgrade the minor engine version or upgrade the major database version.
How it works
Enable Vector Storage on your RDS instance from the console.
Create a table with a VECTOR(n) column and an HNSW index specifying the distance type.
Insert vector data using VEC_FROMTEXT, or batch-load from existing tables.
Run similarity queries with VEC_DISTANCE and LIMIT; the HNSW index accelerates retrieval automatically.
The HNSW index uses single instruction multiple data (SIMD) hardware acceleration, Bloom filter search pruning, and LIMIT condition pushdown to speed up large-scale vector retrieval.
Vector Storage is fully compatible with the MySQL protocol and supports Java Database Connectivity (JDBC)/Object-Relational Mapping (ORM) tools and mainstream developer frameworks. It integrates with Data Transmission Service (DTS) for data synchronization and Data Management (DMS) for instance management, providing full lifecycle capabilities including data synchronization, management, backup, and recovery. Existing instances can be upgraded with one click — no new cluster needed.
Key concepts
VECTOR data type — Stores floating-point vectors of up to 16,383 dimensions. Compatible with standard SQL interfaces for read, write, and batch update.
HNSW index — A graph-based approximate nearest neighbor index. The two key parameters that control its behavior are:
M: the maximum number of connections per node in the graph. A higher M value improves recall at the cost of more memory and slower index builds.
ef_search: the search range during queries. A higher ef_search value improves recall at the cost of slower queries.
Distance types — Two distance metrics are supported:
EUCLIDEAN: straight-line (geometric) distance between vectors in multidimensional space. Smaller distance = more similar.
COSINE: distance derived from the cosine of the angle between vectors. Measures directional similarity, ignoring vector length. Smaller distance = more similar.
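The difference between the two metrics is easiest to see on vectors that point the same way but differ in length. The following is a minimal Python sketch, not the engine's implementation. It assumes cosine distance is defined as 1 minus cosine similarity, which is the common convention but is not confirmed by this page. The function names are illustrative.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity (assumed definition, see the note above)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


base = [0.1, 0.2, 0.3, 0.4, 0.5]
scaled = [2 * x for x in base]  # same direction, twice the length

# Euclidean distance sees the length difference; cosine distance does not.
print("euclidean:", euclidean_distance(base, scaled))
print("cosine:   ", cosine_distance(base, scaled))
```

Under this definition, the scaled vector is far from the original in Euclidean terms but has a cosine distance of approximately zero. COSINE therefore suits embeddings where only direction carries meaning.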
Enable Vector Storage
Enabling or disabling Vector Storage does not require an instance restart.
Go to the Instances page. In the top navigation bar, select the region where your instance resides, then click the instance ID.
On the Basic Information page, find Vector Storage in the Status section and click Enable.
Wait for the status to change to Enabled.
Create a table and insert vector data
Step 1: Create a table with a vector column and HNSW index
-- Create a table with a 5-dimensional vector column and an HNSW index
CREATE TABLE product_embeddings (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  product_name VARCHAR(255),
  embedding VECTOR(5) NOT NULL,
  -- Specify the M value and distance type when creating the index
  VECTOR INDEX idx_embedding(embedding) M=16 DISTANCE=COSINE
);
Step 2: Insert vector data
Use VEC_FROMTEXT to convert string representations into vectors.
-- Insert three product embeddings
INSERT INTO product_embeddings (product_name, embedding) VALUES
  ('product_A', VEC_FROMTEXT('[0.1, 0.2, 0.3, 0.4, 0.5]')),
  ('product_B', VEC_FROMTEXT('[0.6, 0.7, 0.8, 0.9, 1.0]')),
  ('product_C', VEC_FROMTEXT('[0.11, 0.22, 0.33, 0.44, 0.55]'));
Run vector similarity queries
Find the two products most similar to a given vector using cosine distance. A smaller cosine distance means greater similarity.
SELECT
  id,
  product_name,
  VEC_DISTANCE(embedding, VEC_FROMTEXT('[0.1, 0.2, 0.3, 0.4, 0.51]')) AS similarity_score
FROM product_embeddings
ORDER BY similarity_score ASC
LIMIT 2;
If one argument to VEC_DISTANCE is an indexed column, the index distance type is applied automatically. In this example, idx_embedding was created with DISTANCE=COSINE, so VEC_DISTANCE computes cosine distance.
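Because HNSW search is approximate, it can help to know what an exact search would return for the same data. The sketch below ranks the three sample rows by brute force in Python, mirroring the ORDER BY ... LIMIT 2 query. It assumes cosine distance means 1 minus cosine similarity, which this page does not confirm. The helper names are illustrative and not part of any RDS client library.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity (assumed definition)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_top_k(rows, query, k):
    """Return the k (name, distance) pairs closest to query, nearest first."""
    scored = [(name, cosine_distance(vec, query)) for name, vec in rows]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# The same rows inserted in Step 2.
rows = [
    ("product_A", [0.1, 0.2, 0.3, 0.4, 0.5]),
    ("product_B", [0.6, 0.7, 0.8, 0.9, 1.0]),
    ("product_C", [0.11, 0.22, 0.33, 0.44, 0.55]),
]
query = [0.1, 0.2, 0.3, 0.4, 0.51]

top = exact_top_k(rows, query, 2)
for name, dist in top:
    print(name, dist)
```

product_A and product_C point in the same direction, so they are effectively tied under this metric and both rank ahead of product_B. Their order relative to each other depends on floating-point rounding.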
Vector functions
| Function | Description |
|---|---|
| VECTOR_DIM | Returns the number of dimensions in a vector |
| VEC_FROMTEXT / TO_VECTOR / STRING_TO_VECTOR | Converts a string to a vector |
| VEC_TOTEXT / FROM_VECTOR / VECTOR_TO_STRING | Converts a vector to a string |
| VEC_DISTANCE | Calculates the distance between two vectors. If one argument is an indexed column, the index distance type is applied automatically. |
| VEC_DISTANCE_EUCLIDEAN | Calculates Euclidean distance between two vectors |
| VEC_DISTANCE_COSINE | Calculates cosine distance between two vectors |
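Application code usually holds embeddings as arrays of floats, so it needs to produce the bracketed string form that VEC_FROMTEXT accepts, such as '[0.1, 0.2, 0.3, 0.4, 0.5]'. The following Python sketch shows one way to do that with basic validation. The function name to_vector_text and its expected_dim parameter are hypothetical, not part of any RDS SDK.

```python
import json
import math


def to_vector_text(values, expected_dim):
    """Format a sequence of numbers as a '[v1, v2, ...]' string.

    expected_dim should match the n in the column's VECTOR(n) definition.
    """
    floats = [float(v) for v in values]
    if len(floats) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(floats)}"
        )
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("vector components must be finite numbers")
    # json.dumps renders a list of floats as "[0.1, 0.2, ...]".
    return json.dumps(floats)


text = to_vector_text([0.1, 0.2, 0.3, 0.4, 0.5], expected_dim=5)
print(text)  # [0.1, 0.2, 0.3, 0.4, 0.5]
```

Pass the resulting string to VEC_FROMTEXT as a bound query parameter through your driver's placeholder mechanism rather than concatenating it into the SQL text. Very small or very large values are rendered in exponent notation (for example 1e-07). This page does not state whether VEC_FROMTEXT accepts that form, so verify it before relying on it.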
Manage parameters
All vector-related parameters are dynamic — changes take effect immediately without an instance restart.
Parameter reference
| Parameter | Scope | Type | Default | Range | Description |
|---|---|---|---|---|---|
| vidx_default_distance | Session | String | EUCLIDEAN | EUCLIDEAN, COSINE | Default distance type for vector queries. EUCLIDEAN measures straight-line distance; COSINE measures directional similarity. |
| vidx_hnsw_default_m | Session | Integer | 6 | [3, 200] | Default M value for HNSW indexes (max connections per node). Higher values improve recall but use more memory. |
| vidx_hnsw_ef_search | Session | Integer | 20 | [1, 10000] | Search range for HNSW queries. Higher values improve recall at the cost of query speed. |
| vidx_hnsw_cache_size | Global | BigInt | 1048576 | [1048576, 18446744073709551615] | Maximum memory the HNSW index cache can use, in bytes. |
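When tuning vidx_hnsw_ef_search or M, recall can be measured instead of guessed. One common approach is to compare the IDs returned by the indexed query against the IDs from an exact search over the same data. The Python sketch below computes recall@k from two such ID lists. The function name and the sample IDs are hypothetical, and how you obtain the exact result set is outside the scope of this page.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k IDs that also appear in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact = set(exact_ids[:k])
    if not exact:
        raise ValueError("exact_ids must not be empty")
    found = set(approx_ids[:k]) & exact
    return len(found) / len(exact)


# Hypothetical row IDs for one query, nearest first.
exact_ids = [1, 3, 7, 9, 12]    # from an exact (brute-force) search
approx_ids = [1, 3, 9, 12, 20]  # from the HNSW-indexed query

print(recall_at_k(approx_ids, exact_ids, k=5))  # 0.8
```

Averaging this value over a representative set of queries at several ef_search settings shows where further increases stop improving recall and only add latency.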
Modify parameters
Go to the Instances page. Select the region and click the instance ID.
In the left navigation pane, click Parameter.
On the Modifiable Parameters tab, find the parameter and set a new value.
Click OK, then Apply Changes. In the dialog that appears, select when the changes take effect.
Limitations
Vector indexes can only be created on InnoDB tables.
The primary key of a table with a vector index cannot exceed 256 bytes.
The inplace syntax cannot be used to create, modify, or delete vector indexes.
Vector indexes cannot be set to INVISIBLE.
Tables with vector indexes do not support the Recycle Bin feature.
Data modification and queries on vector indexes support only the Read Committed (RC) isolation level.
Because the HNSW algorithm involves random levels and heuristic graph construction, the graph structures on primary and standby instances are not guaranteed to be identical.
If the source database uses the vector type in stored procedures or functions, synchronization or migration to a destination that does not support vectors will fail.
Use cases
Semantic search — Store text or image embeddings and retrieve the most semantically similar items using nearest neighbor queries.
AI-powered recommendation — Find products, articles, or content similar to a user's past interactions based on vector distance.
Multi-modal analysis — Combine vector similarity search with SQL filters to support mixed structured and unstructured data queries.