FalconSeek is Alibaba Cloud's C++-based vector engine integrated into Elasticsearch (ES). It powers image search, semantic search, and recommendation systems at the scale of Taobao, Tmall, and Pailitao. By setting index_options.type to havenask_native, you can use FalconSeek's high-performance vector index in any Alibaba Cloud ES index while staying fully compatible with the open-source k-nearest neighbors (k-NN) search API.
This guide covers algorithm selection, index configuration, filtering strategies, and runtime tuning.
Use cases
- Semantic search: Match documents by meaning rather than exact keywords.
- Image search: Find visually similar images or products based on embedding vectors.
- Recommendation systems: Surface related items based on vector similarity of user or item embeddings.
Prerequisites
Before you begin, ensure that you have:
- An Alibaba Cloud Elasticsearch cluster with FalconSeek enabled
- Pre-generated vector embeddings for your documents (the embedding model determines the dims value)
- Query vectors generated by the same model as the document vectors
Choose an algorithm
Choosing the right knn_type is the most impactful configuration decision. The table below compares all five algorithms across the dimensions most relevant to production use.
| Algorithm | Recall | Speed | Memory | Recommended data scale |
|---|---|---|---|---|
| HNSW | High | Fast | Medium | 100K–10M documents |
| RabitQGraph | High | Fastest | Lowest | >10M documents |
| QGraph | High | Fast | Low | >5M documents |
| QC | Medium | Medium | Low | >1M documents |
| Linear | 100% | Slow | Low | <1,000 documents (auto-selected) |
Limitations:
- RabitQGraph: Only supports l2_norm similarity. Vector dimensions must be a positive integer multiple of 64.
- Linear: Query time grows linearly with data volume. Suitable only for small datasets.
- QC: Longer build time than graph-based algorithms.
Decision guide:
- Start here: Use HNSW. It delivers the best balance of recall, speed, and memory for datasets from 100K to 10M documents.
- Growing dataset, memory-sensitive: Migrate from HNSW to QGraph and set a quantizer to reduce memory usage.
- Growing dataset, latency-sensitive: Migrate from HNSW to RabitQGraph if your dimensions are a multiple of 64 and you only need l2_norm similarity.
- Very large datasets (>10M documents): Choose RabitQGraph for minimum latency, or QGraph for cost-effective storage.
- Small dataset (<1,000 documents): Linear (brute-force) search is used automatically. No algorithm selection needed.
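The decision guide above can be sketched as a small helper. This is an illustrative reading of the guide, not part of FalconSeek: the function name and the `prefer` argument are hypothetical, and the thresholds are taken from the comparison table.

```python
def suggest_knn_type(doc_count, dims, similarity, prefer="balanced"):
    """Suggest a knn_type by following the decision guide.

    prefer is a hypothetical knob: "balanced", "memory", or "latency".
    """
    # RabitQGraph requires l2_norm and a dimension that is a multiple of 64.
    rabitq_ok = similarity == "l2_norm" and dims > 0 and dims % 64 == 0

    if doc_count < 1_000:
        return "Linear"  # chosen automatically for very small datasets
    if doc_count > 10_000_000:
        # Very large datasets: RabitQGraph for latency, QGraph for storage cost.
        if prefer == "latency" and rabitq_ok:
            return "RabitQGraph"
        return "QGraph"
    if prefer == "memory":
        return "QGraph"
    if prefer == "latency" and rabitq_ok:
        return "RabitQGraph"
    return "HNSW"


print(suggest_knn_type(2_000_000, 768, "cosine"))
print(suggest_knn_type(20_000_000, 128, "l2_norm", prefer="latency"))
```

Treat the result as a starting point and confirm it with recall and latency measurements on your own data.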
Quick start
The following example creates an index named my_falcon_seek_index with a 128-dimension vector field using the HNSW algorithm, writes a document, and runs a k-NN search.
Step 1: Create an index
PUT /my_falcon_seek_index
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"product_vector": {
"type": "dense_vector",
"dims": 128,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW",
"m": 32,
"ef_construction": 400
}
},
"category": {
"type": "keyword"
}
}
}
}
Key `dense_vector` field properties:
| Property | Description |
|---|---|
| `dims` | Vector dimension. Must match the dimension of your embedding model. |
| `similarity` | Distance function for computing vector similarity. See Similarity functions. |
| `index_options` | Algorithm and build parameters for the vector index. |
Step 2: Write documents
The product_vector array must have exactly 128 elements, matching dims.
POST /my_falcon_seek_index/_doc/1
{
"product_vector": [0.12, -0.05, 0.08, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28,
-0.03, 0.15, 0.22, -0.11, 0.09, 0.33, -0.07, 0.14, 0.26, -0.21,
0.18, 0.29, -0.13, 0.06, 0.35, -0.08, 0.16, 0.23, -0.15, 0.12,
0.27, -0.22, 0.19, 0.32, -0.14, 0.07, 0.25, -0.18, 0.13, 0.30,
-0.09, 0.17, 0.24, -0.16, 0.10, 0.34, -0.10, 0.20, 0.31, -0.23,
0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08,
0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30,
-0.13, 0.07, 0.24, -0.22, 0.19, 0.32, -0.16, 0.10, 0.26, -0.18,
0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.15, 0.29, -0.11, 0.05,
0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25,
-0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12,
0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06,
0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22],
"category": "clothes"
}
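Because the vector length must match dims exactly, it can help to check vectors on the client before sending them. The following is a minimal sketch; `check_vector` is a hypothetical helper, not a FalconSeek or Elasticsearch API.

```python
import math


def check_vector(vector, dims):
    """Return True if the vector fits a dense_vector field with `dims` dimensions.

    Raises ValueError with a short explanation otherwise.
    """
    if len(vector) != dims:
        raise ValueError(f"expected {dims} elements, got {len(vector)}")
    for i, x in enumerate(vector):
        # Reject NaN, infinity, and non-numeric entries.
        if not isinstance(x, (int, float)) or not math.isfinite(x):
            raise ValueError(f"element {i} is not a finite number: {x!r}")
    return True


print(check_vector([0.1] * 128, 128))
```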
Step 3: Run a k-NN search
The following query returns the 5 most similar documents to the given query vector. num_candidates controls the size of the candidate set searched on each shard—a larger value increases recall at the cost of higher latency.
GET /my_falcon_seek_index/_search
{
"knn": {
"field": "product_vector",
"query_vector": [0.12, -0.05, 0.01, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28,
-0.03, 0.15, 0.22, -0.11, 0.09, 0.23, -0.07, 0.14, 0.26, -0.21,
0.18, 0.29, -0.13, 0.06, 0.35, -0.18, 0.16, 0.23, -0.15, 0.12,
0.27, -0.22, 0.19, 0.32, -0.14, 0.87, 0.25, -0.18, 0.13, 0.30,
-0.09, 0.17, 0.24, -0.16, 0.10, 0.64, -0.10, 0.20, 0.31, -0.23,
0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08,
0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30,
-0.13, 0.07, 0.24, -0.22, 0.19, 0.52, -0.16, 0.10, 0.26, -0.18,
0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.14, 0.29, -0.11, 0.05,
0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25,
-0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12,
0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06,
0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.12],
"k": 5,
"num_candidates": 100
}
}
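If you build the request body in application code, a small builder keeps k and num_candidates consistent. This is a sketch with a hypothetical function name. It enforces the rule from the knn query parameters table in this guide that num_candidates must be greater than k.

```python
import json


def build_knn_search(field, query_vector, k, num_candidates):
    """Build the JSON body of a knn search request."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if num_candidates <= k:
        raise ValueError("num_candidates must be greater than k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        }
    }


# A zero vector is used here only as a placeholder for a real embedding.
body = build_knn_search("product_vector", [0.0] * 128, k=5, num_candidates=100)
payload = json.dumps(body)  # send as the body of GET /my_falcon_seek_index/_search
```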
Step 4: Add a filter
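To narrow results to a specific category, add a filter clause to the knn query. The filter is applied as a pre-filter during the approximate k-NN search rather than after it, so the response can contain up to k results that all match the filter. The document written in Step 2 has the category clothes, so the example below returns hits only after you index documents whose category is shoes.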
GET /my_falcon_seek_index/_search
{
"knn": {
"field": "product_vector",
"query_vector": [
0.12, -0.05, 0.01, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28,
-0.03, 0.15, 0.22, -0.11, 0.09, 0.23, -0.07, 0.14, 0.26, -0.21,
0.18, 0.29, -0.13, 0.06, 0.35, -0.18, 0.16, 0.23, -0.15, 0.12,
0.27, -0.22, 0.19, 0.32, -0.14, 0.87, 0.25, -0.18, 0.13, 0.30,
-0.09, 0.17, 0.24, -0.16, 0.10, 0.64, -0.10, 0.20, 0.31, -0.23,
0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08,
0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30,
-0.13, 0.07, 0.24, -0.22, 0.19, 0.52, -0.16, 0.10, 0.26, -0.18,
0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.14, 0.29, -0.11, 0.05,
0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25,
-0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12,
0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06,
0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.12
],
"k": 5,
"num_candidates": 100,
"filter": {
"term": {
"category": "shoes"
}
}
}
}
Similarity functions
The similarity function determines how the engine measures closeness between two vectors. Choose based on how your embedding model represents similarity.
| Function | How it works | Best for |
|---|---|---|
| `l2_norm` | Euclidean distance; smaller values indicate greater similarity | General use: image recognition, facial recognition |
| `cosine` | Cosine similarity; values closer to 1 indicate greater similarity; not affected by vector length | Text semantic similarity |
| `dot_product` | Dot product; larger values indicate greater similarity | Recommendation systems where vector magnitude matters |
| `max_inner_product` | Same as `dot_product`, but does not require normalized vectors | Recommendation systems without vector normalization |
RabitQGraph only supports l2_norm. Configuring any other similarity function with RabitQGraph results in an error.
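To make the difference between these functions concrete, the sketch below computes the raw metrics for two vectors that point in the same direction but differ in length. It shows the underlying math only. How the engine converts these values into the `_score` of a hit is not covered here.

```python
import math


def l2_norm(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_product(a, b):
    """Dot product: larger means more similar; sensitive to vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: depends only on direction, not on length."""
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 1.0]  # orthogonal to a

print(l2_norm(a, b), cosine(a, b), dot_product(a, b))  # 1.0 1.0 2.0
print(cosine(a, c))  # 0.0
```

Cosine treats `a` and `b` as identical, while the Euclidean distance and dot product both react to the difference in length.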
Index configuration
All vector index settings go inside the index_options block of the field mapping.
PUT /<your_index_name>
{
"mappings": {
"properties": {
"<your_vector_field>": {
"type": "dense_vector",
"dims": 768,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW",
"m": 32,
"ef_construction": 400,
"thread_count": 8
}
}
}
}
}
General parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `type` | String | Yes | - | Must be "havenask_native" to activate FalconSeek. |
| `knn_type` | String | No | "HNSW" | Vector index algorithm. Options: HNSW, RabitQGraph, QGraph, QC, Linear. |
Build parameters
These parameters take effect during index construction and determine index structure and quality. They cannot be changed without reindexing.
HNSW and QGraph
| Parameter | Type | Default | Impact and recommendations |
|---|---|---|---|
| `m` | Integer | 16 | Maximum neighbors per node in the graph. Range: 4–128. A larger value increases recall and memory usage. Use 16 for low memory, 32 for balanced, 64–128 for high recall. |
| `ef_construction` | Integer | 200 | Width of the candidate search during graph construction. Range: 10–2000. A larger value improves index quality and recall but increases build time. Use 200 for fast builds, 400–500 for balanced, 800+ for high-quality builds. |
QGraph only
| Parameter | Type | Default | Options |
|---|---|---|---|
| `quantizer` | String | None | "int8" (recommended, 4x compression relative to fp32), "int4" (higher compression), "fp16" (high precision, 2x compression), "2bit" (extreme compression) |
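To get a rough feel for what each quantizer saves, the sketch below estimates raw vector storage. It assumes unquantized vectors are 32-bit floats and that each quantizer stores its nominal number of bits per dimension. Graph links and per-index metadata are ignored, so real memory usage is higher.

```python
# Nominal bits per dimension; None stands for unquantized fp32 (an assumption).
BITS_PER_DIM = {None: 32, "fp16": 16, "int8": 8, "int4": 4, "2bit": 2}


def raw_vector_bytes(doc_count, dims, quantizer=None):
    """Estimate bytes needed for the vector payload alone."""
    return doc_count * dims * BITS_PER_DIM[quantizer] // 8


for q in (None, "fp16", "int8", "int4", "2bit"):
    gib = raw_vector_bytes(10_000_000, 768, q) / 2**30
    print(f"{str(q):>5}: {gib:.2f} GiB")
```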
Advanced parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `thread_count` | Integer | 1 | Threads used during index construction. Set to 0 to use all available CPU cores, or specify a value from 1 to 32. |
| `tags` | Array | [] | Low-cardinality keyword fields to enable tags_filter pre-filtering. Example: ["category", "brand_id"]. |
| `linear_build_threshold` | Integer | 0 | If the document count falls below this value, the engine uses Linear (brute-force) search instead of building the configured index. Set to 1000 or 5000 for indexes that may start small. |
| `index_params` | Object | {} | Direct access to underlying engine parameters for advanced tuning. Parameters here override top-level parameters with the same name. |
index_params parameters override top-level parameters. For example, if you set "m": 32 at the top level and "proxima.hnsw.builder.max_neighbor_count": 48 inside index_params, the effective value is 48. Use index_params only when you need to tune parameters not exposed at the top level.
"index_options": {
"knn_type": "HNSW",
"m": 32,
"index_params": {
"proxima.hnsw.builder.max_neighbor_count": 48
}
}
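The override rule can be modeled as a dictionary merge in which index_params is applied last. The sketch below is a mental model only, not the engine's implementation. It includes only the `m` mapping because that is the only top-level-to-engine name pair stated in this guide.

```python
# Top-level name -> engine parameter name (only the pair documented above).
TOP_LEVEL_TO_ENGINE = {
    "m": "proxima.hnsw.builder.max_neighbor_count",
}


def effective_engine_params(index_options):
    """Resolve the engine parameters that result from an index_options block."""
    effective = {}
    for name, engine_name in TOP_LEVEL_TO_ENGINE.items():
        if name in index_options:
            effective[engine_name] = index_options[name]
    # index_params is applied last, so it wins on conflicts.
    effective.update(index_options.get("index_params", {}))
    return effective


options = {
    "knn_type": "HNSW",
    "m": 32,
    "index_params": {"proxima.hnsw.builder.max_neighbor_count": 48},
}
print(effective_engine_params(options))
# {'proxima.hnsw.builder.max_neighbor_count': 48}
```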
knn query parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `field` | String | Yes | The dense_vector field to search. |
| `query_vector` | Array | Yes | The query vector. Dimension must match the field's dims. |
| `k` | Integer | Yes | Number of nearest neighbors to return. |
| `num_candidates` | Integer | Recommended | Candidate set size per shard. Must be greater than k. A larger value increases recall but raises query latency. |
| `tags_filter` | String | No | High-performance pre-filter for HNSW and QGraph. See tags_filter pre-filtering. |
| `search_params` | Object | No | Runtime search parameter overrides. See Tuning search parameters at query time. |
tags_filter pre-filtering
tags_filter is a pre-filtering mechanism optimized for HNSW and QGraph. Unlike the standard filter clause, tags_filter excludes non-matching nodes early during graph traversal, before distance computations. This makes it significantly faster for fields with a limited number of distinct values, such as category or brand ID.
When to use `tags_filter`:
- The filter field has low cardinality (a small number of unique values).
- Query performance is critical and the filter field is known at index creation time.
When to use standard `filter`:
- The filter condition involves high-cardinality fields or complex expressions.
- The algorithm is not HNSW or QGraph (tags_filter is optimized for these two algorithms).
Enable tags_filter
Step 1: Declare the filter fields in index_options using the tags parameter.
PUT /my_vector_index_with_tags
{
"mappings": {
"properties": {
"product_vector": {
"type": "dense_vector",
"dims": 128,
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW",
"tags": ["category"]
}
},
"category": {
"type": "keyword"
}
}
}
}
Step 2: Use tags_filter in the knn query. The syntax is "field_name = value". Use | for OR and & for AND.
GET /my_vector_index_with_tags/_search
{
"knn": {
"field": "product_vector",
"query_vector": [...],
"k": 5,
"tags_filter": "category = shoes | category = socks"
}
}
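When the allowed values come from application code, the expression string can be assembled with a small helper. The function names below are hypothetical. This guide does not specify operator precedence between | and &, or how to quote values that contain spaces, so the sketch keeps OR and AND expressions separate and assumes simple values.

```python
def tags_filter_any(field, values):
    """Match documents whose field equals any of the given values (OR)."""
    return " | ".join(f"{field} = {value}" for value in values)


def tags_filter_all(conditions):
    """Match documents that satisfy every (field, value) condition (AND)."""
    return " & ".join(f"{field} = {value}" for field, value in conditions)


print(tags_filter_any("category", ["shoes", "socks"]))
# category = shoes | category = socks
print(tags_filter_all([("category", "shoes"), ("brand_id", "42")]))
# category = shoes & brand_id = 42
```

Every field used in the expression must be declared in the tags parameter of the index.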
Tuning search parameters at query time
Use search_params to adjust search parameters for a single query without rebuilding the index. This is useful when different workloads need different precision-latency trade-offs: for example, a lower ef for real-time queries and a higher ef for batch analytics.
HNSW and QGraph
GET vector_index/_search
{
"knn": {
"field": "vector",
"query_vector": [0.1, 0.2, 0.3],
"k": 10,
"num_candidates": 100,
"search_params": {
"proxima.hnsw.searcher.ef": "500",
"proxima.hnsw.searcher.max_scan_ratio": "0.2"
}
}
}
QC
GET vector_index/_search
{
"knn": {
"field": "vector",
"query_vector": [0.1, 0.2, 0.3],
"k": 10,
"num_candidates": 100,
"search_params": {
"proxima.qc.searcher.scan_ratio": "0.05"
}
}
}
RabitQGraph
GET vector_index/_search
{
"knn": {
"field": "vector",
"query_vector": [0.1, 0.2, 0.3],
"k": 10,
"num_candidates": 100,
"search_params": {
"param.rabitQGraph.searcher.ef": "400",
"param.rabitQGraph.searcher.max_scan_ratio": "0.1"
}
}
}
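The three examples above differ mainly in the parameter namespace. A client that serves several workloads could select the key from the algorithm and pass a different ef per workload, as in this sketch. The function name is hypothetical, and values are sent as strings to mirror the examples above.

```python
# Searcher "ef" key per algorithm, as used in the examples above.
# QGraph shares the HNSW searcher namespace; QC has no ef (it uses scan_ratio).
EF_KEY = {
    "HNSW": "proxima.hnsw.searcher.ef",
    "QGraph": "proxima.hnsw.searcher.ef",
    "RabitQGraph": "param.rabitQGraph.searcher.ef",
}


def knn_with_ef(knn_type, field, query_vector, k, num_candidates, ef):
    """Build a knn search body that overrides ef for this query only."""
    if knn_type not in EF_KEY:
        raise ValueError(f"no ef parameter known for knn_type {knn_type!r}")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
            "search_params": {EF_KEY[knn_type]: str(ef)},
        }
    }


realtime = knn_with_ef("HNSW", "vector", [0.1, 0.2, 0.3], 10, 100, ef=100)
batch = knn_with_ef("HNSW", "vector", [0.1, 0.2, 0.3], 10, 100, ef=800)
```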
Parameter reference
HNSW
HNSW parameters use the proxima.hnsw. namespace. Set them via index_params.
Builder parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `proxima.hnsw.builder.max_neighbor_count` | uint32 | 100 | Neighbors per node in the graph. A larger value improves accuracy but increases computation and storage. Maximum: 65535. Generally should not exceed the vector dimension. |
| `proxima.hnsw.builder.efconstruction` | uint32 | 500 | Graph construction precision. A larger value produces a more accurate graph but takes longer. |
| `proxima.hnsw.builder.thread_count` | uint32 | 0 | Construction threads. 0 uses all CPU cores. |
| `proxima.hnsw.builder.memory_quota` | uint64 | 0 | Maximum memory for construction in bytes. The build fails if this limit is exceeded. Disk-based construction is not supported. |
| `proxima.hnsw.builder.scaling_factor` | uint32 | 50 | Node ratio between graph layers. Range: [5, 1000]. Rarely needs adjustment. |
| `proxima.hnsw.builder.neighbor_prune_ratio` | float | 0.5 | Controls when edge pruning begins in the neighbor table. Rarely needs adjustment. |
| `proxima.hnsw.builder.upper_neighbor_ratio` | float | 0.5 | Upper-layer neighbor count relative to layer 0. Rarely needs adjustment. |
| `proxima.hnsw.builder.enable_adsampling` | bool | false | Accelerated distance sampling. Supports only Euclidean distance on fp32 data. Not recommended for dimensions below 256. |
| `proxima.hnsw.builder.slack_pruning_factor` | float | 1.0 | Controls pruning aggressiveness. Recommended range: [1.1, 1.2]. Use 1.1 for gist960 and sift128 datasets. |
Searcher parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `proxima.hnsw.searcher.ef` | uint32 | 500 | Candidate set size during search. A larger value increases recall but raises latency. |
| `proxima.hnsw.searcher.max_scan_ratio` | float | 0.1 | Maximum fraction of documents to scan. The search may stop before this ratio if the ef candidate set converges early. |
| `proxima.hnsw.searcher.neighbors_in_memory_enable` | bool | false | Keeps the neighbor table in memory for faster search. Increases memory usage. |
| `proxima.hnsw.searcher.check_crc_enable` | bool | false | Runs a cyclic redundancy check (CRC) on the index at load time. Increases load time when enabled. |
| `proxima.hnsw.searcher.visit_bloomfilter_enable` | bool | false | Uses a bloom filter to deduplicate visited graph nodes. Reduces memory but slightly degrades performance. |
| `proxima.hnsw.searcher.visit_bloomfilter_negative_prob` | float | 0.001 | Bloom filter false-positive rate. A smaller value is more accurate but uses more memory. |
| `proxima.hnsw.searcher.brute_force_threshold` | int | 1000 | If the total document count is below this value, the engine uses linear search instead of HNSW graph traversal. |
Configuration examples
PUT /hnsw_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 768,
"index": true,
"similarity": "cosine",
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW"
}
}
}
}
}

PUT /hnsw_performance
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 1024,
"index": true,
"similarity": "dot_product",
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW",
"m": 48,
"ef_construction": 500,
"thread_count": 8,
"linear_build_threshold": 1000,
"is_embedding_saved": true,
"embedding_load_strategy": "ANN_INDEX_FILE",
"index_load_strategy": "MEM"
}
}
}
}
}
RabitQGraph
RabitQGraph only supports l2_norm similarity. Vector dimensions must be a positive integer multiple of 64.
RabitQGraph parameters use the param.rabitQGraph. namespace.
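Both constraints can be checked on the client before you create the index, which avoids a failed request. The helper below is a hypothetical sketch of that check, not a FalconSeek API.

```python
def rabitqgraph_problems(dims, similarity):
    """Return a list of reasons why a field cannot use RabitQGraph (empty if none)."""
    problems = []
    if similarity != "l2_norm":
        problems.append(f"similarity must be l2_norm, got {similarity}")
    if dims <= 0 or dims % 64 != 0:
        problems.append(f"dims must be a positive multiple of 64, got {dims}")
    return problems


print(rabitqgraph_problems(192, "l2_norm"))  # []
print(rabitqgraph_problems(100, "cosine"))
```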
Builder parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `param.rabitQGraph.builder.neighbor_cnt` | uint32 | 128 | Neighbors per node. Affects graph connectivity and search precision. |
| `param.rabitQGraph.builder.ef_construction` | uint32 | 512 | Candidate node count during construction. |
| `param.rabitQGraph.builder.prune_ratio` | float | 0.5 | Neighbor pruning ratio for graph optimization. |
| `param.rabitQGraph.builder.cluster_count` | uint32 | 64 | Number of cluster centroids for vector quantization. |
| `param.rabitQGraph.builder.quantized_bit_count` | uint32 | 1 | Quantization bit depth. Allowed values: 1, 4, 5, 8, or 9. |
| `param.rabitQGraph.builder.slack_prune_factor` | float | 1.0 | Pruning policy factor. |
| `param.rabitQGraph.builder.repair_connectivity` | bool | true | Repairs graph connectivity after construction. |
| `param.rabitQGraph.builder.thread_count` | uint32 | 0 | Construction threads. 0 uses all CPU cores. |
| `param.rabitQGraph.builder.ckpt_count` | uint32 | 0 | Number of checkpoints for incremental builds. |
| `param.rabitQGraph.builder.ckpt_threshold` | uint32 | 2000000 | Document count threshold that triggers a checkpoint. |
Searcher parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `param.rabitQGraph.searcher.ef` | uint32 | 250 | Candidate set size during search. A larger value increases recall but raises latency. |
| `param.rabitQGraph.searcher.max_scan_ratio` | double | 0.05 | Maximum fraction of nodes to scan. |
| `param.rabitQGraph.searcher.check_crc_enable` | bool | false | Enables CRC check at load time. |
| `param.rabitQGraph.searcher.thread_count` | uint32 | 1 | Search threads. |
| `param.rabitQGraph.searcher.thread_safe_filter` | bool | false | Enables thread-safe filtering. |
Configuration examples
{
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 256,
"param.rabitQGraph.builder.ef_construction": 512,
"param.rabitQGraph.builder.quantized_bit_count": 8,
"param.rabitQGraph.builder.cluster_count": 128,
"param.rabitQGraph.builder.thread_count": 8,
"param.rabitQGraph.searcher.ef": 300,
"param.rabitQGraph.searcher.max_scan_ratio": 0.1
}
}

{
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 64,
"param.rabitQGraph.builder.ef_construction": 200,
"param.rabitQGraph.builder.quantized_bit_count": 1,
"param.rabitQGraph.builder.cluster_count": 32,
"param.rabitQGraph.searcher.ef": 150,
"param.rabitQGraph.searcher.max_scan_ratio": 0.03
}
}

{
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 128,
"param.rabitQGraph.builder.ef_construction": 400,
"param.rabitQGraph.builder.quantized_bit_count": 4,
"param.rabitQGraph.builder.cluster_count": 64,
"param.rabitQGraph.searcher.ef": 250
}
}
QGraph
QGraph (Quantized Graph) inherits all HNSW builder and searcher parameters and adds the following quantization parameters. Use quantizer at the top level for simple configuration, or use index_params for fine-grained control.
Additional builder parameters
| Parameter | Type | Default | Options |
|---|---|---|---|
| `proxima.qgraph.builder.quantizer_class` | String | - | Int8QuantizerConverter, Int4QuantizerConverter, HalfFloatConverter, DoubleBitConverter |
| `proxima.qgraph.builder.quantizer_params` | Object | - | Quantizer-specific configuration. |
Configuration examples
{
"index_params": {
"proxima.hnsw.builder.max_neighbor_count": 32,
"proxima.hnsw.builder.efconstruction": 400,
"proxima.hnsw.builder.thread_count": 4,
"proxima.qgraph.builder.quantizer_class": "Int8QuantizerConverter",
"proxima.qgraph.builder.quantizer_params": {},
"proxima.hnsw.searcher.ef": 300
}
}

{
"index_params": {
"proxima.hnsw.builder.max_neighbor_count": 48,
"proxima.hnsw.builder.efconstruction": 500,
"proxima.hnsw.builder.thread_count": 6,
"proxima.qgraph.builder.quantizer_class": "Int4QuantizerConverter",
"proxima.qgraph.builder.quantizer_params": {},
"proxima.hnsw.searcher.ef": 400,
"proxima.hnsw.searcher.max_scan_ratio": 0.1
}
}

{
"index_params": {
"proxima.hnsw.builder.max_neighbor_count": 64,
"proxima.hnsw.builder.efconstruction": 600,
"proxima.hnsw.builder.thread_count": 8,
"proxima.qgraph.builder.quantizer_class": "HalfFloatConverter",
"proxima.qgraph.builder.quantizer_params": {},
"proxima.hnsw.searcher.ef": 500
}
}
QC
QC (Quantization Clustering) uses a clustering-based index. Parameters use the proxima.qc. namespace. Build time is longer than graph-based algorithms, but runtime memory usage is lower.
Builder parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `proxima.qc.builder.train_sample_count` | uint32 | 0 | Training data size. 0 uses all documents. |
| `proxima.qc.builder.thread_count` | uint32 | 0 | Construction threads. 0 uses all CPU cores. |
| `proxima.qc.builder.centroid_count` | String | - | Number of cluster centroids. Supports hierarchical clustering using * as a separator, for example "100*100". |
| `proxima.qc.builder.cluster_class` | String | OptKmeansCluster | Clustering method. |
| `proxima.qc.builder.cluster_auto_tuning` | bool | false | Automatically tunes the centroid count. |
| `proxima.qc.builder.optimizer_class` | String | HcBuilder | Centroid optimizer for improving classification precision. |
| `proxima.qc.builder.optimizer_params` | IndexParams | - | Build and retrieval parameters for the optimizer. |
| `proxima.qc.builder.converter_class` | String | - | Automatically applies MIPS transformation when the similarity metric is inner product. |
| `proxima.qc.builder.converter_params` | IndexParams | - | Initialization parameters for converter_class. |
| `proxima.qc.builder.quantizer_class` | String | - | Quantizer type. Options: Int8QuantizerConverter, Int4QuantizerConverter, and others. |
| `proxima.qc.builder.quantizer_params` | IndexParams | - | Quantizer configuration. |
| `proxima.qc.builder.quantize_by_centroid` | bool | false | Quantizes vectors by centroid when using quantizer_class. |
| `proxima.qc.builder.store_original_features` | bool | false | Stores original (unquantized) vectors alongside the index. |
Searcher parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `proxima.qc.searcher.scan_ratio` | float | 0.01 | Used to calculate max_scan_num: total doc count × scan_ratio. |
| `proxima.qc.searcher.optimizer_params` | IndexParams | - | Runtime retrieval parameters for the optimizer used during the build. |
| `proxima.qc.searcher.brute_force_threshold` | int | 1000 | If the total document count is below this value, the engine uses linear search. |
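The interaction between scan_ratio and brute_force_threshold can be illustrated with a few lines of arithmetic. This sketch only restates the two table entries above. The function name is hypothetical, and how the engine rounds max_scan_num is an assumption.

```python
def qc_scan_plan(total_docs, scan_ratio=0.01, brute_force_threshold=1000):
    """Describe how many documents a QC search may scan for a given index size."""
    if total_docs < brute_force_threshold:
        # Small index: every document is scanned by linear search.
        return {"mode": "linear", "max_scan_num": total_docs}
    # max_scan_num = total doc count x scan_ratio (rounding is assumed here).
    return {"mode": "qc", "max_scan_num": round(total_docs * scan_ratio)}


print(qc_scan_plan(1_000_000))                   # {'mode': 'qc', 'max_scan_num': 10000}
print(qc_scan_plan(5_000_000, scan_ratio=0.05))  # {'mode': 'qc', 'max_scan_num': 250000}
print(qc_scan_plan(800))                         # {'mode': 'linear', 'max_scan_num': 800}
```

Raising scan_ratio from 0.01 to 0.05 lets the search examine five times as many documents, which trades latency for recall.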
Configuration examples
{
"index_params": {
"proxima.qc.builder.thread_count": 4,
"proxima.qc.builder.centroid_count": "1000",
"proxima.qc.builder.cluster_class": "OptKmeansCluster",
"proxima.qc.searcher.scan_ratio": 0.02
}
}

{
"index_params": {
"proxima.qc.builder.thread_count": 8,
"proxima.qc.builder.centroid_count": "100*100",
"proxima.qc.builder.optimizer_class": "HnswBuilder",
"proxima.qc.builder.quantizer_class": "Int8QuantizerConverter",
"proxima.qc.searcher.scan_ratio": 0.01
}
}

{
"index_params": {
"proxima.qc.builder.thread_count": 12,
"proxima.qc.builder.centroid_count": "2000",
"proxima.qc.builder.train_sample_count": 100000,
"proxima.qc.builder.store_original_features": true,
"proxima.qc.searcher.scan_ratio": 0.05
}
}
Linear
Linear performs a brute-force search across all vectors. It delivers 100% recall but query time grows linearly with document count. The engine selects Linear automatically when the document count is below brute_force_threshold (default 1,000) or linear_build_threshold.
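A brute-force search is easy to reproduce outside the engine, which makes it a useful reference for measuring the recall of the approximate algorithms on a sample of your data. The sketch below is plain Python using Euclidean distance, not FalconSeek's implementation.

```python
import heapq
import math


def linear_knn(query, vectors, k):
    """Exact k-NN: score every vector, then keep the k nearest.

    vectors maps a document ID to its vector. Returns (distance, doc_id)
    pairs ordered from nearest to farthest.
    """
    scored = ((math.dist(query, vec), doc_id) for doc_id, vec in vectors.items())
    return heapq.nsmallest(k, scored)


docs = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [5.0, 5.0]}
nearest = linear_knn([0.9, 0.0], docs, k=2)
print([doc_id for _, doc_id in nearest])  # ['b', 'a']
```

Every query computes one distance per document, which is why query time grows linearly with the document count.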
Parameters use the proxima.linear. namespace.
Builder and searcher parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `proxima.linear.builder.column_major_order` | String | "false" | Storage order for vectors. "false" = row-major, "true" = column-major. |
| `proxima.linear.searcher.read_block_size` | uint32 | 1048576 | Block size (in bytes) read into memory per search iteration. The recommended value is 1 MB (1048576). |
Configuration examples
{
"index_params": {
"proxima.linear.builder.column_major_order": "false",
"proxima.linear.searcher.read_block_size": 1048576
}
}

{
"index_params": {
"proxima.linear.builder.column_major_order": "true",
"proxima.linear.searcher.read_block_size": 2097152
}
}
Appendix: Complete index configuration examples
HNSW
PUT /hnsw_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 768,
"index": true,
"similarity": "cosine",
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW"
}
}
}
}
}

PUT /hnsw_performance
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 1024,
"index": true,
"similarity": "dot_product",
"index_options": {
"type": "havenask_native",
"knn_type": "HNSW",
"m": 48,
"ef_construction": 500,
"thread_count": 8,
"linear_build_threshold": 1000,
"is_embedding_saved": true,
"embedding_load_strategy": "ANN_INDEX_FILE",
"index_load_strategy": "MEM"
}
}
}
}
}
Linear
PUT /linear_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 384,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "Linear"
}
}
}
}
}
QC
PUT /qc_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 768,
"index": true,
"similarity": "cosine",
"index_options": {
"type": "havenask_native",
"knn_type": "QC"
}
}
}
}
}

PUT /qc_custom
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 1024,
"index": true,
"similarity": "dot_product",
"index_options": {
"type": "havenask_native",
"knn_type": "QC",
"thread_count": 8,
"linear_build_threshold": 5000,
"index_params": "{\"proxima.qc.builder.thread_count\": 8, \"proxima.qc.builder.centroid_count\": \"2000\"}"
}
}
}
}
}
QGraph
PUT /qgraph_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 768,
"index": true,
"similarity": "cosine",
"index_options": {
"type": "havenask_native",
"knn_type": "QGraph",
"quantizer": "int8"
}
}
}
}
}

PUT /qgraph_high_precision
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 1536,
"index": true,
"similarity": "cosine",
"index_options": {
"type": "havenask_native",
"knn_type": "QGraph",
"m": 40,
"ef_construction": 600,
"thread_count": 8,
"quantizer": "fp16",
"is_embedding_saved": true,
"index_load_strategy": "MEM"
}
}
}
}
}

PUT /qgraph_memory_optimized
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 1024,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "QGraph",
"m": 24,
"ef_construction": 300,
"thread_count": 6,
"quantizer": "int4",
"is_embedding_saved": false,
"index_load_strategy": "BUFFER"
}
}
}
}
}
RabitQGraph
PUT /rabitqgraph_basic
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 64,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "RabitQGraph"
}
}
}
}
}

PUT /rabitqgraph_performance
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 128,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "RabitQGraph",
"thread_count": 8,
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 256,
"param.rabitQGraph.builder.ef_construction": 512,
"param.rabitQGraph.builder.quantized_bit_count": 4,
"param.rabitQGraph.builder.cluster_count": 128,
"param.rabitQGraph.searcher.ef": 300
}
}
}
}
}
}

PUT /rabitqgraph_memory_optimized
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 192,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "RabitQGraph",
"thread_count": 4,
"linear_build_threshold": 1000,
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 64,
"param.rabitQGraph.builder.ef_construction": 200,
"param.rabitQGraph.builder.quantized_bit_count": 1,
"param.rabitQGraph.builder.cluster_count": 32,
"param.rabitQGraph.searcher.ef": 150,
"param.rabitQGraph.searcher.max_scan_ratio": 0.05
}
}
}
}
}
}

PUT /rabitqgraph_with_tags
{
"mappings": {
"properties": {
"vector": {
"type": "dense_vector",
"dims": 256,
"index": true,
"similarity": "l2_norm",
"index_options": {
"type": "havenask_native",
"knn_type": "RabitQGraph",
"tags": ["category", "region"],
"index_params": {
"param.rabitQGraph.builder.neighbor_cnt": 128,
"param.rabitQGraph.builder.ef_construction": 400,
"param.rabitQGraph.builder.quantized_bit_count": 8,
"param.rabitQGraph.searcher.ef": 250
}
}
},
"category": {
"type": "keyword"
},
"region": {
"type": "keyword"
}
}
}
}