Partitions


A partition physically or logically divides a collection into subsets of vectors. When you target a specific partition in operations such as insert or query, DashVector scopes the operation to that partition only, reducing the search space and improving query performance.

Key concepts

| Concept | Description |
| --- | --- |
| Partition naming | Each partition has a unique name within its collection |
| Shared schema | All partitions in a collection share the same vector dimensions, vector data types, distance metrics, and field definitions |
| Default partition | Every collection includes a default partition that cannot be deleted. Operations without a specified partition target this default |
| Partition limit | A collection supports up to a fixed number of partitions. For the exact limit, see Limits |
| Management | Create and delete partitions through API operations |

When to use partitions

Partitions work best when your data has a clear categorical dimension that aligns with how you query it. Each query targets one partition, so the partitioning field should match the most common query filter.

Good candidates for partitioning:

| Use case | Partitioning field | Why it works |
| --- | --- | --- |
| E-commerce image search | Product category (shoes, skirts, pants) | Users search within a known category, so each query hits one partition |
| Video surveillance | Date (one partition per day) | Data has a fixed 30-day retention window; create partitions daily and delete them when they expire |
| Trademark detection | Trademark structure (text, graphic, numeric, letter) | Queries target a specific trademark type |
| Multilingual Q&A | Language (Chinese, English, French) | Queries match the user's language |
| Multi-tenant SaaS | Customer ID | Provides physical data isolation between tenants at low cost |

When partitions are not the right fit:

  • Small datasets: The overhead of managing partitions outweighs the performance gain.

  • No clear partitioning field: If queries frequently span multiple partitions, each query has to be repeated per partition and the results merged, so performance can be worse than keeping all data in a single partition.

Prerequisites

Before you begin, make sure that you have:

Manage partitions

All examples on this page use the following client and collection setup:

import dashvector
import os

# Get credentials from environment variables
client = dashvector.Client(
    api_key=os.environ.get('DASHVECTOR_API_KEY'),
    endpoint=os.environ.get('DASHVECTOR_ENDPOINT')
)

# Create a collection (skip if one already exists)
client.create(name='understand_partition', dimension=4)
collection = client.get('understand_partition')

| Placeholder | Description |
| --- | --- |
| DASHVECTOR_API_KEY | Your DashVector API key |
| DASHVECTOR_ENDPOINT | Your cluster endpoint URL |

Create a partition

collection.create_partition(name='shoes')
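When a setup script runs more than once, it can help to create only the partitions that are missing. The sketch below shows one way to do that. `ensure_partitions` is a hypothetical helper, and it assumes that `list_partitions()` returns a response whose `output` attribute is a list of partition names.

```python
def ensure_partitions(collection, names):
    """Hypothetical helper: create each named partition unless it already exists.

    Assumes list_partitions() returns a response whose `output` attribute is a
    list of partition names.
    """
    existing = set(collection.list_partitions().output or [])
    created = []
    for name in names:
        if name in existing:
            continue  # already present (or requested twice); nothing to do
        collection.create_partition(name=name)
        existing.add(name)
        created.append(name)
    return created
```

For example, `ensure_partitions(collection, ['shoes', 'skirts', 'pants'])` would return the names it actually created.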

Describe a partition

Retrieve metadata for a specific partition:

ret = collection.describe_partition('shoes')
print(ret)

List partitions

List all partitions in a collection:

partitions = collection.list_partitions()
print(partitions)

Insert documents into a partition

Pass the partition parameter to route documents to a specific partition. Without it, documents go to the default partition.

collection.insert(
    ('1', [0.1, 0.1, 0.1, 0.1]),
    partition='shoes'
)
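When incoming documents belong to several partitions, one option is to group them first and issue one insert per partition. The sketch below assumes that `collection.insert` also accepts a list of `(id, vector)` tuples and that every target partition already exists. `insert_by_partition` and the record layout are hypothetical.

```python
from collections import defaultdict


def insert_by_partition(collection, records):
    """Hypothetical helper: records are (doc_id, vector, partition_name) triples."""
    grouped = defaultdict(list)
    for doc_id, vector, partition in records:
        grouped[partition].append((doc_id, vector))

    for partition, docs in grouped.items():
        # One insert call per partition, each carrying a list of (id, vector) tuples.
        collection.insert(docs, partition=partition)

    # Report how many documents were routed to each partition.
    return {partition: len(docs) for partition, docs in grouped.items()}
```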

Query within a partition

Run a vector similarity search scoped to a single partition:

docs = collection.query(
    vector=[0.1, 0.1, 0.2, 0.1],
    partition='shoes'
)
print(docs)

Delete documents from a partition

Remove specific documents from a partition by ID:

collection.delete(ids=['1'], partition='shoes')

Get partition statistics

Check the document count and other metrics for a partition:

ret = collection.stats_partition('shoes')
print(ret)

Delete a partition

Delete a partition and all documents it contains. The default partition cannot be deleted.

collection.delete_partition('shoes')

Use cases

E-commerce image search

A cross-border e-commerce platform stores 20 million clothing product images. Products fall into predefined categories such as shoes, skirts, and pants. Each category maps to a partition. When a user searches by image, the system identifies the product category (either from user input or a classification model) and queries only the corresponding partition.

[Figure: E-commerce image search architecture]
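The routing step in this scenario could look like the following sketch. `search_in_category` and `known_categories` are hypothetical names. The category is assumed to come from user input or a classifier, and each category name is assumed to double as the partition name.

```python
def search_in_category(collection, vector, category, known_categories, topk=10):
    """Hypothetical helper: query only the partition that matches `category`."""
    if category not in known_categories:
        # Fail fast rather than sending the query to a partition that does not exist.
        raise ValueError(f"unknown product category: {category!r}")
    return collection.query(vector=vector, partition=category, topk=topk)
```

Validating the category first keeps a misclassified image from producing a confusing error or an empty result further down the line.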

Video surveillance with time-based partitions

A video surveillance system extracts frames from 1,000 cameras in an industrial park, identifies vehicle features, and imports the vectors into DashVector. Data is retained for 30 days. Create one partition per day and delete partitions as they expire.

[Figure: Video surveillance architecture]
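The daily rotation described here can be sketched with plain date arithmetic. The `day_YYYYMMDD` naming scheme and both function names are assumptions made for illustration; check the partition naming rules for your service before adopting them.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # retention window stated in the use case


def partition_name(day):
    """Hypothetical naming scheme: one partition per calendar day."""
    return f"day_{day:%Y%m%d}"


def expired_partitions(existing_names, today, retention_days=RETENTION_DAYS):
    """Return the day partitions in `existing_names` that fall outside the window."""
    keep = {partition_name(today - timedelta(days=i)) for i in range(retention_days)}
    return sorted(
        name for name in existing_names
        if name.startswith("day_") and name not in keep
    )
```

A scheduled job could then call `collection.create_partition(name=partition_name(date.today()))` once a day and `collection.delete_partition(name)` for each name returned by `expired_partitions`.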

Trademark infringement detection

A trademark agent maintains a database of 50 million trademarks divided into nine structural categories, including text, graphic, numeric, and letter. Each category corresponds to a partition. When checking for infringement, the agent queries only the partition for the relevant category, which narrows the search space significantly.

Multilingual knowledge base

An international e-commerce team maintains a knowledge base in Chinese, English, and French. Each language maps to a partition. The system embeds the user's question and queries the partition that matches the detected language, returning answers in the correct language.

Multi-tenant data isolation

An e-commerce SaaS provider offers image search capabilities to multiple small and micro businesses. Each customer maps to a partition within a single collection. This provides physical data isolation between tenants while keeping infrastructure costs low.
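One way to keep tenant queries scoped is to derive the partition name from the customer ID and bind it once. The sketch below is illustrative only: `tenant_partition`, `TenantScope`, and the hashed `t_` naming scheme are hypothetical, and the truncated hash is an assumption chosen to give short, uniform names.

```python
import hashlib


def tenant_partition(customer_id):
    """Hypothetical mapping from a customer ID to a stable partition name."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).hexdigest()[:16]
    return f"t_{digest}"


class TenantScope:
    """Hypothetical wrapper that pins every query to one tenant's partition."""

    def __init__(self, collection, customer_id):
        self._collection = collection
        self.partition = tenant_partition(customer_id)

    def query(self, vector, topk=10):
        # The partition is fixed at construction, so callers cannot omit it.
        return self._collection.query(
            vector=vector, partition=self.partition, topk=topk
        )
```

Because each tenant consumes one partition, the number of tenants a single collection can hold is bounded by the partition limit.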

What's next