FalconSeek vector index usage guide

Use cases

Semantic search: Match documents by meaning rather than exact keywords.
Image search: Find visually similar images or products based on embedding vectors.
Recommendation systems: Surface related items based on vector similarity of user or item embeddings.

Prerequisites

Before you begin, ensure that you have:

An Alibaba Cloud Elasticsearch cluster with FalconSeek enabled
Pre-generated vector embeddings for your documents (the embedding model determines the dims value)
Query vectors generated by the same model as the document vectors

Choose an algorithm

Choosing the right knn_type is the most impactful configuration decision. The table below compares all five algorithms across the dimensions most relevant to production use.

Algorithm	Recall	Speed	Memory	Recommended data scale
HNSW	High	Fast	Medium	100K–10M documents
RabitQGraph	High	Fastest	Lowest	>10M documents
QGraph	High	Fast	Low	>5M documents
QC	Medium	Medium	Low	>1M documents
Linear	100%	Slow	Low	<1,000 documents (auto-selected)

Limitations:

RabitQGraph: Only supports l2_norm similarity. Vector dimensions must be a positive integer multiple of 64.
Linear: Query time grows linearly with data volume. Suitable only for small datasets.
QC: Longer build time than graph-based algorithms.

Decision guide:

Start here: Use HNSW. It delivers the best balance of recall, speed, and memory for datasets from 100K to 10M documents.
Growing dataset, memory-sensitive: Migrate from HNSW to QGraph and set a quantizer to reduce memory usage.
Growing dataset, latency-sensitive: Migrate from HNSW to RabitQGraph if your dimensions are a multiple of 64 and you only need l2_norm similarity.
Very large datasets (>10M documents): Choose RabitQGraph for minimum latency, or QGraph for cost-effective storage.
Small dataset (<1,000 documents): Linear (brute-force) search is used automatically. No algorithm selection needed.
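
The decision guide above can be sketched as a small helper. This is an illustrative reading of the guide, not part of FalconSeek: the function name and the `priority` argument ("balanced", "memory", or "latency") are hypothetical, and the thresholds are taken from the table and bullets above.

```python
def choose_knn_type(doc_count, dims, similarity, priority="balanced"):
    """Suggest a knn_type by following the decision guide.

    `priority` is a hypothetical knob: "balanced", "memory", or "latency".
    """
    # RabitQGraph requires l2_norm and dims that are a positive multiple of 64.
    rabitq_ok = similarity == "l2_norm" and dims > 0 and dims % 64 == 0

    if doc_count < 1_000:
        return "Linear"  # brute-force search is used automatically at this size
    if doc_count > 10_000_000:
        # Very large datasets: RabitQGraph for latency, QGraph for storage cost.
        if priority != "memory" and rabitq_ok:
            return "RabitQGraph"
        return "QGraph"
    if priority == "latency" and rabitq_ok:
        return "RabitQGraph"
    if priority == "memory":
        return "QGraph"
    return "HNSW"  # default starting point


print(choose_knn_type(2_000_000, 768, "cosine"))
print(choose_knn_type(50_000_000, 128, "l2_norm", "latency"))
```

Treat the result as a starting point only; recall and latency should still be measured on your own data.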

Quick start

The following example creates an index named my_falcon_seek_index with a 128-dimension vector field using the HNSW algorithm, writes a document, and runs a k-NN search.

Step 1: Create an index

PUT /my_falcon_seek_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "product_vector": {
        "type": "dense_vector",
        "dims": 128,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW",
          "m": 32,
          "ef_construction": 400
        }
      },
      "category": {
        "type": "keyword"
      }
    }
  }
}

Key `dense_vector` field properties:

Property	Description
`dims`	Vector dimension. Must match the dimension of your embedding model.
`similarity`	Distance function for computing vector similarity. See Similarity functions.
`index_options`	Algorithm and build parameters for the vector index.

Step 2: Write documents

The product_vector array must have exactly 128 elements, matching dims.

POST /my_falcon_seek_index/_doc/1
{
  "product_vector": [0.12, -0.05, 0.08, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28,
    -0.03, 0.15, 0.22, -0.11, 0.09, 0.33, -0.07, 0.14, 0.26, -0.21,
    0.18, 0.29, -0.13, 0.06, 0.35, -0.08, 0.16, 0.23, -0.15, 0.12,
    0.27, -0.22, 0.19, 0.32, -0.14, 0.07, 0.25, -0.18, 0.13, 0.30,
    -0.09, 0.17, 0.24, -0.16, 0.10, 0.34, -0.10, 0.20, 0.31, -0.23,
    0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08,
    0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30,
    -0.13, 0.07, 0.24, -0.22, 0.19, 0.32, -0.16, 0.10, 0.26, -0.18,
    0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.15, 0.29, -0.11, 0.05,
    0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25,
    -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12,
    0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06,
    0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22],
  "category": "clothes"
}
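
Because the vector length must match dims, it can help to check the length on the client before sending the request. The following is a minimal sketch using only the Python standard library; `build_document` is a hypothetical helper, and the field names are the ones used in this quick start.

```python
import json


def build_document(vector, category, dims=128):
    """Return the JSON body for one document, rejecting wrong-length vectors."""
    if len(vector) != dims:
        raise ValueError(f"expected {dims} elements, got {len(vector)}")
    return json.dumps({"product_vector": vector, "category": category})


# Placeholder vector; a real one would come from your embedding model.
body = build_document([0.1] * 128, "clothes")
print(len(json.loads(body)["product_vector"]))
```

The returned string can be sent as the body of the POST request shown above with any HTTP client.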

Step 3: Run a k-NN search

The following query returns the 5 most similar documents to the given query vector. num_candidates controls the size of the candidate set searched on each shard—a larger value increases recall at the cost of higher latency.

GET /my_falcon_seek_index/_search
{
  "knn": {
    "field": "product_vector",
    "query_vector": [0.12, -0.05, 0.01, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28,
      -0.03, 0.15, 0.22, -0.11, 0.09, 0.23, -0.07, 0.14, 0.26, -0.21,
      0.18, 0.29, -0.13, 0.06, 0.35, -0.18, 0.16, 0.23, -0.15, 0.12,
      0.27, -0.22, 0.19, 0.32, -0.14, 0.87, 0.25, -0.18, 0.13, 0.30,
      -0.09, 0.17, 0.24, -0.16, 0.10, 0.64, -0.10, 0.20, 0.31, -0.23,
      0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08,
      0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30,
      -0.13, 0.07, 0.24, -0.22, 0.19, 0.52, -0.16, 0.10, 0.26, -0.18,
      0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.14, 0.29, -0.11, 0.05,
      0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25,
      -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12,
      0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06,
      0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.12],
    "k": 5,
    "num_candidates": 100
  }
}
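
When queries are generated by application code, the request body can be assembled programmatically. The sketch below builds the same knn body as a Python dict; `build_knn_search` is a hypothetical helper, and the check on num_candidates follows the rule stated in the knn query parameters table.

```python
def build_knn_search(field, query_vector, k, num_candidates=None):
    """Build the body of a knn search request as a Python dict."""
    if k < 1:
        raise ValueError("k must be at least 1")
    knn = {"field": field, "query_vector": list(query_vector), "k": k}
    if num_candidates is not None:
        # The parameter table states that num_candidates must be greater than k.
        if num_candidates <= k:
            raise ValueError("num_candidates must be greater than k")
        knn["num_candidates"] = num_candidates
    return {"knn": knn}


# Placeholder query vector; use the same embedding model as for the documents.
body = build_knn_search("product_vector", [0.0] * 128, k=5, num_candidates=100)
print(sorted(body["knn"]))
```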

Step 4: Add a filter

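To narrow results to a specific category, add a filter clause to the knn query. The filter is applied as a pre-filter during the approximate k-NN search, so the response contains up to k results, all of which match the filter. The query below matches only documents whose category is shoes, so it does not return the clothes document written in Step 2.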

GET /my_falcon_seek_index/_search
{
  "knn": {
    "field": "product_vector",
    "query_vector": [
      0.12, -0.05, 0.01, 0.24, -0.17, 0.31, 0.02, -0.19, 0.11, 0.28,
      -0.03, 0.15, 0.22, -0.11, 0.09, 0.23, -0.07, 0.14, 0.26, -0.21,
      0.18, 0.29, -0.13, 0.06, 0.35, -0.18, 0.16, 0.23, -0.15, 0.12,
      0.27, -0.22, 0.19, 0.32, -0.14, 0.87, 0.25, -0.18, 0.13, 0.30,
      -0.09, 0.17, 0.24, -0.16, 0.10, 0.64, -0.10, 0.20, 0.31, -0.23,
      0.15, 0.28, -0.12, 0.11, 0.26, -0.19, 0.14, 0.29, -0.17, 0.08,
      0.22, -0.20, 0.16, 0.27, -0.15, 0.09, 0.25, -0.21, 0.18, 0.30,
      -0.13, 0.07, 0.24, -0.22, 0.19, 0.52, -0.16, 0.10, 0.26, -0.18,
      0.12, 0.28, -0.14, 0.06, 0.23, -0.19, 0.14, 0.29, -0.11, 0.05,
      0.21, -0.17, 0.13, 0.27, -0.10, 0.04, 0.20, -0.15, 0.11, 0.25,
      -0.09, 0.03, 0.19, -0.13, 0.10, 0.24, -0.08, 0.02, 0.18, -0.12,
      0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.22, 0.05, -0.06,
      0.09, 0.23, -0.07, 0.01, 0.17, -0.11, 0.08, 0.12
    ],
    "k": 5,
    "num_candidates": 100,
    "filter": {
      "term": {
        "category": "shoes"
      }
    }
  }
}

Similarity functions

The similarity function determines how the engine measures closeness between two vectors. Choose based on how your embedding model represents similarity.

Function	How it works	Best for
`l2_norm`	Euclidean distance—smaller values indicate greater similarity	General use: image recognition, facial recognition
`cosine`	Cosine similarity—values closer to 1 indicate greater similarity; not affected by vector length	Text semantic similarity
`dot_product`	Dot product—larger values indicate greater similarity	Recommendation systems where vector magnitude matters
`max_inner_product`	Same as `dot_product`, but does not require normalized vectors	Recommendation systems without vector normalization

RabitQGraph only supports l2_norm. Configuring any other similarity function with RabitQGraph results in an error.
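
To make the differences concrete, the sketch below computes the three underlying quantities for small hand-picked vectors. It shows the plain mathematical definitions only; how the engine converts them into a relevance score is not covered here.

```python
import math


def l2_norm(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_product(a, b):
    """Dot product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: depends on direction only, not on vector length."""
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 1.0]  # orthogonal to a

print(cosine(a, b), l2_norm(a, b), dot_product(a, b))  # 1.0 1.0 2.0
print(cosine(a, c), dot_product(a, c))                 # 0.0 0.0
```

Vectors a and b point in the same direction, so cosine treats them as identical even though their Euclidean distance is 1.0. This is why the choice should follow how the embedding model encodes similarity.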

Index configuration

All vector index settings go inside the index_options block of the field mapping.

PUT /<your_index_name>
{
  "mappings": {
    "properties": {
      "<your_vector_field>": {
        "type": "dense_vector",
        "dims": 768,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW",
          "m": 32,
          "ef_construction": 400,
          "thread_count": 8
        }
      }
    }
  }
}

General parameters

Parameter	Type	Required	Default	Description
`type`	String	Yes	—	Must be `"havenask_native"` to activate FalconSeek.
`knn_type`	String	No	`"HNSW"`	Vector index algorithm. Options: `HNSW`, `RabitQGraph`, `QGraph`, `QC`, `Linear`.

Build parameters

These parameters take effect during index construction and determine index structure and quality. They cannot be changed without reindexing.

HNSW and QGraph

Parameter	Type	Default	Impact and recommendations
`m`	Integer	`16`	Maximum neighbors per node in the graph. Range: 4–128. A larger value increases recall and memory usage. Use `16` for low memory, `32` for balanced, `64`–`128` for high recall.
`ef_construction`	Integer	`200`	Width of the candidate search during graph construction. Range: 10–2000. A larger value improves index quality and recall but increases build time. Use `200` for fast builds, `400`–`500` for balanced, `800`+ for high-quality builds.
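
Because build parameters cannot be changed without reindexing, it is worth validating them before creating the index. A minimal sketch, assuming the ranges and defaults listed in the table above; `hnsw_index_options` is a hypothetical helper.

```python
def hnsw_index_options(m=16, ef_construction=200, knn_type="HNSW"):
    """Return an index_options dict after checking the documented ranges."""
    if knn_type not in ("HNSW", "QGraph"):
        raise ValueError("m and ef_construction apply to HNSW and QGraph only")
    if not 4 <= m <= 128:
        raise ValueError("m must be between 4 and 128")
    if not 10 <= ef_construction <= 2000:
        raise ValueError("ef_construction must be between 10 and 2000")
    return {
        "type": "havenask_native",
        "knn_type": knn_type,
        "m": m,
        "ef_construction": ef_construction,
    }


# The "balanced" setting suggested in the table.
print(hnsw_index_options(m=32, ef_construction=400))
```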

QGraph only

Parameter	Type	Default	Options
`quantizer`	String	None	`"int8"` (recommended, 4x compression relative to fp32), `"int4"` (higher compression), `"fp16"` (high precision, 2x compression), `"2bit"` (extreme compression)
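
As a rough guide to what each quantizer saves, the sketch below estimates the raw storage for one vector. It rests on assumptions: unquantized vectors are stored as 32-bit floats, the bit width per dimension is inferred from the quantizer name, and graph links and other index overhead are ignored. `vector_bytes` is a hypothetical helper.

```python
# Assumed bits per dimension, inferred from the quantizer names (None = fp32).
BITS_PER_DIMENSION = {None: 32, "fp16": 16, "int8": 8, "int4": 4, "2bit": 2}


def vector_bytes(dims, quantizer=None):
    """Estimate raw bytes for one vector, excluding graph and metadata overhead."""
    return dims * BITS_PER_DIMENSION[quantizer] // 8


for q in (None, "fp16", "int8", "int4", "2bit"):
    print(q, vector_bytes(768, q))
```

For a 768-dimension field this gives 3072 bytes unquantized and 768 bytes with int8. Actual memory usage depends on the engine and should be measured.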

Advanced parameters

Parameter	Type	Default	Description
`thread_count`	Integer	`1`	Threads used during index construction. Set to `0` to use all available CPU cores, or specify a value from 1 to 32.
`tags`	Array	`[]`	Low-cardinality keyword fields to enable `tags_filter` pre-filtering. Example: `["category", "brand_id"]`.
`linear_build_threshold`	Integer	`0`	If the document count falls below this value, the engine uses Linear (brute-force) search instead of building the configured index. Set to `1000` or `5000` for indexes that may start small.
`index_params`	Object	`{}`	Direct access to underlying engine parameters for advanced tuning. Parameters here override top-level parameters with the same name.

Important

index_params parameters override top-level parameters. For example, if you set "m": 32 at the top level and "proxima.hnsw.builder.max_neighbor_count": 48 inside index_params, the effective value is 48. Use index_params only when you need to tune parameters not exposed at the top level.

"index_options": {
  "knn_type": "HNSW",
  "m": 32,
  "index_params": {
    "proxima.hnsw.builder.max_neighbor_count": 48
  }
}
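
The override rule can be sketched as a small lookup. The `m` to `proxima.hnsw.builder.max_neighbor_count` pairing comes from the example above; the `ef_construction` pairing is an assumption based on the HNSW builder parameter names, and `effective_value` is a hypothetical helper.

```python
# Top-level name -> underlying engine parameter name.
TOP_LEVEL_TO_ENGINE = {
    "m": "proxima.hnsw.builder.max_neighbor_count",
    # Assumed pairing, inferred from the builder parameter names.
    "ef_construction": "proxima.hnsw.builder.efconstruction",
}


def effective_value(index_options, top_level_name):
    """Return the value that wins: index_params first, then the top level."""
    engine_name = TOP_LEVEL_TO_ENGINE[top_level_name]
    overrides = index_options.get("index_params", {})
    if engine_name in overrides:
        return overrides[engine_name]
    return index_options.get(top_level_name)


options = {
    "knn_type": "HNSW",
    "m": 32,
    "index_params": {"proxima.hnsw.builder.max_neighbor_count": 48},
}
print(effective_value(options, "m"))  # 48
```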

knn query parameters

Parameter	Type	Required	Description
`field`	String	Yes	The `dense_vector` field to search.
`query_vector`	Array	Yes	The query vector. Dimension must match the field's `dims`.
`k`	Integer	Yes	Number of nearest neighbors to return.
`num_candidates`	Integer	Recommended	Candidate set size per shard. Must be greater than `k`. A larger value increases recall but raises query latency.
`tags_filter`	String	No	High-performance pre-filter for HNSW and QGraph. See `tags_filter` pre-filtering.
`search_params`	Object	No	Runtime search parameter overrides. See Tuning search parameters at query time.

`tags_filter` pre-filtering

tags_filter is a pre-filtering mechanism optimized for HNSW and QGraph. Unlike the standard filter clause, tags_filter excludes non-matching nodes early during graph traversal, before distance computations. This makes it significantly faster for fields with a limited number of distinct values, such as category or brand ID.

When to use `tags_filter`:

The filter field has low cardinality (a small number of unique values).
Query performance is critical and the filter field is known at index creation time.

When to use standard `filter`:

The filter condition involves high-cardinality fields or complex expressions.
The algorithm is not HNSW or QGraph, the two algorithms that tags_filter is optimized for.

Enable `tags_filter`

Step 1: Declare the filter fields in index_options using the tags parameter.

PUT /my_vector_index_with_tags
{
  "mappings": {
    "properties": {
      "product_vector": {
        "type": "dense_vector",
        "dims": 128,
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW",
          "tags": ["category"]
        }
      },
      "category": {
        "type": "keyword"
      }
    }
  }
}

Step 2: Use tags_filter in the knn query. The syntax is "field_name = value". Use | for OR and & for AND.

GET /my_vector_index_with_tags/_search
{
  "knn": {
    "field": "product_vector",
    "query_vector": [...],
    "k": 5,
    "tags_filter": "category = shoes | category = socks"
  }
}
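
If the filter values come from user input, the expression string can be generated. The sketch below follows the "field_name = value" syntax described above. Operator precedence when mixing | and & is not described here, so each hypothetical helper uses a single operator, and quoting of values that contain spaces or operators is not handled.

```python
def tags_filter_any(field, values):
    """OR together several allowed values for one tag field."""
    return " | ".join(f"{field} = {value}" for value in values)


def tags_filter_all(conditions):
    """AND together one required value per tag field (dict of field -> value)."""
    return " & ".join(f"{field} = {value}" for field, value in conditions.items())


print(tags_filter_any("category", ["shoes", "socks"]))
print(tags_filter_all({"category": "shoes", "brand_id": "42"}))
```

Every field used in the expression must be declared in the tags parameter of the index.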

Tuning search parameters at query time

Use search_params to adjust search parameters for a single query without rebuilding the index. This is useful when different workloads need different precision-latency trade-offs: for example, a lower ef for real-time queries and a higher ef for batch analytics.

HNSW and QGraph

GET /vector_index/_search
{
  "knn": {
    "field": "vector",
    "query_vector": [0.1, 0.2, 0.3],
    "k": 10,
    "num_candidates": 100,
    "search_params": {
      "proxima.hnsw.searcher.ef": "500",
      "proxima.hnsw.searcher.max_scan_ratio": "0.2"
    }
  }
}

QC

GET /vector_index/_search
{
  "knn": {
    "field": "vector",
    "query_vector": [0.1, 0.2, 0.3],
    "k": 10,
    "num_candidates": 100,
    "search_params": {
      "proxima.qc.searcher.scan_ratio": "0.05"
    }
  }
}

RabitQGraph

GET /vector_index/_search
{
  "knn": {
    "field": "vector",
    "query_vector": [0.1, 0.2, 0.3],
    "k": 10,
    "num_candidates": 100,
    "search_params": {
      "param.rabitQGraph.searcher.ef": "400",
      "param.rabitQGraph.searcher.max_scan_ratio": "0.1"
    }
  }
}
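
Since each algorithm uses its own parameter namespace, a small helper can keep application code from mixing them up. The sketch below maps an algorithm name to the search_params keys shown in the three examples above and emits values as strings, as those examples do. `search_params` is a hypothetical helper.

```python
def search_params(knn_type, ef=None, scan_ratio=None):
    """Build a search_params dict with the keys for the given algorithm."""
    if knn_type in ("HNSW", "QGraph"):
        names = {"ef": "proxima.hnsw.searcher.ef",
                 "scan_ratio": "proxima.hnsw.searcher.max_scan_ratio"}
    elif knn_type == "RabitQGraph":
        names = {"ef": "param.rabitQGraph.searcher.ef",
                 "scan_ratio": "param.rabitQGraph.searcher.max_scan_ratio"}
    elif knn_type == "QC":
        names = {"scan_ratio": "proxima.qc.searcher.scan_ratio"}
    else:
        raise ValueError(f"no search_params mapping for {knn_type}")

    params = {}
    if ef is not None:
        if "ef" not in names:
            raise ValueError(f"{knn_type} has no ef search parameter")
        params[names["ef"]] = str(ef)
    if scan_ratio is not None:
        params[names["scan_ratio"]] = str(scan_ratio)
    return params


realtime = search_params("HNSW", ef=100)
batch = search_params("HNSW", ef=500, scan_ratio=0.2)
print(realtime, batch)
```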

Parameter reference

HNSW

HNSW parameters use the proxima.hnsw. namespace. Set them via index_params. Searcher parameters can also be overridden per query through search_params.

Builder parameters

Parameter	Type	Default	Description
`proxima.hnsw.builder.max_neighbor_count`	uint32	100	Neighbors per node in the graph. A larger value improves accuracy but increases computation and storage. Maximum: 65535. Generally should not exceed the vector dimension.
`proxima.hnsw.builder.efconstruction`	uint32	500	Graph construction precision. A larger value produces a more accurate graph but takes longer.
`proxima.hnsw.builder.thread_count`	uint32	0	Construction threads. `0` uses all CPU cores.
`proxima.hnsw.builder.memory_quota`	uint64	0	Maximum memory for construction in bytes. The build fails if this limit is exceeded. Disk-based construction is not supported.
`proxima.hnsw.builder.scaling_factor`	uint32	50	Node ratio between graph layers. Range: [5, 1000]. Rarely needs adjustment.
`proxima.hnsw.builder.neighbor_prune_ratio`	float	0.5	Controls when edge pruning begins in the neighbor table. Rarely needs adjustment.
`proxima.hnsw.builder.upper_neighbor_ratio`	float	0.5	Upper-layer neighbor count relative to layer 0. Rarely needs adjustment.
`proxima.hnsw.builder.enable_adsampling`	bool	false	Accelerated distance sampling. Supports only Euclidean distance on fp32 data. Not recommended for dimensions below 256.
`proxima.hnsw.builder.slack_pruning_factor`	float	1.0	Controls pruning aggressiveness. Recommended range: [1.1, 1.2]. Use 1.1 for gist960 and sift128 datasets.

Searcher parameters

Parameter	Type	Default	Description
`proxima.hnsw.searcher.ef`	uint32	500	Candidate set size during search. A larger value increases recall but raises latency.
`proxima.hnsw.searcher.max_scan_ratio`	float	0.1	Maximum fraction of documents to scan. The search may stop before this ratio if the `ef` candidate set converges early.
`proxima.hnsw.searcher.neighbors_in_memory_enable`	bool	false	Keeps the neighbor table in memory for faster search. Increases memory usage.
`proxima.hnsw.searcher.check_crc_enable`	bool	false	Runs a cyclic redundancy check (CRC) on the index at load time. Increases load time when enabled.
`proxima.hnsw.searcher.visit_bloomfilter_enable`	bool	false	Uses a bloom filter to deduplicate visited graph nodes. Reduces memory but slightly degrades performance.
`proxima.hnsw.searcher.visit_bloomfilter_negative_prob`	float	0.001	Bloom filter false-positive rate. A smaller value is more accurate but uses more memory.
`proxima.hnsw.searcher.brute_force_threshold`	int	1000	If the total document count is below this value, the engine uses linear search instead of HNSW graph traversal.

Configuration examples

PUT /hnsw_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW"
        }
      }
    }
  }
}

PUT /hnsw_performance
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "dot_product",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW",
          "m": 48,
          "ef_construction": 500,
          "thread_count": 8,
          "linear_build_threshold": 1000,
          "is_embedding_saved": true,
          "embedding_load_strategy": "ANN_INDEX_FILE",
          "index_load_strategy": "MEM"
        }
      }
    }
  }
}

RabitQGraph

Important

RabitQGraph only supports l2_norm similarity. Vector dimensions must be a positive integer multiple of 64.

RabitQGraph parameters use the param.rabitQGraph. namespace.

Builder parameters

Parameter	Type	Default	Description
`param.rabitQGraph.builder.neighbor_cnt`	uint32	128	Neighbors per node. Affects graph connectivity and search precision.
`param.rabitQGraph.builder.ef_construction`	uint32	512	Candidate node count during construction.
`param.rabitQGraph.builder.prune_ratio`	float	0.5	Neighbor pruning ratio for graph optimization.
`param.rabitQGraph.builder.cluster_count`	uint32	64	Number of cluster centroids for vector quantization.
`param.rabitQGraph.builder.quantized_bit_count`	uint32	1	Quantization bit depth. Allowed values: 1, 4, 5, 8, or 9.
`param.rabitQGraph.builder.slack_prune_factor`	float	1.0	Pruning policy factor.
`param.rabitQGraph.builder.repair_connectivity`	bool	true	Repairs graph connectivity after construction.
`param.rabitQGraph.builder.thread_count`	uint32	0	Construction threads. `0` uses all CPU cores.
`param.rabitQGraph.builder.ckpt_count`	uint32	0	Number of checkpoints for incremental builds.
`param.rabitQGraph.builder.ckpt_threshold`	uint32	2000000	Document count threshold that triggers a checkpoint.
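
The RabitQGraph constraints can be checked before the index is created. A minimal sketch based on the restrictions stated in this section (l2_norm only, dimensions a positive multiple of 64, and the allowed quantized_bit_count values); `check_rabitqgraph` is a hypothetical helper that returns a list of problems.

```python
ALLOWED_BIT_COUNTS = {1, 4, 5, 8, 9}


def check_rabitqgraph(dims, similarity, quantized_bit_count=1):
    """Return a list of constraint violations; an empty list means none found."""
    problems = []
    if similarity != "l2_norm":
        problems.append("similarity must be l2_norm")
    if dims <= 0 or dims % 64 != 0:
        problems.append("dims must be a positive multiple of 64")
    if quantized_bit_count not in ALLOWED_BIT_COUNTS:
        problems.append("quantized_bit_count must be 1, 4, 5, 8, or 9")
    return problems


print(check_rabitqgraph(192, "l2_norm", 4))
print(check_rabitqgraph(100, "cosine", 2))
```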

Searcher parameters

Parameter	Type	Default	Description
`param.rabitQGraph.searcher.ef`	uint32	250	Candidate set size during search. A larger value increases recall but raises latency.
`param.rabitQGraph.searcher.max_scan_ratio`	double	0.05	Maximum fraction of nodes to scan.
`param.rabitQGraph.searcher.check_crc_enable`	bool	false	Enables CRC check at load time.
`param.rabitQGraph.searcher.thread_count`	uint32	1	Search threads.
`param.rabitQGraph.searcher.thread_safe_filter`	bool	false	Enables thread-safe filtering.

Configuration examples

{
  "index_params": {
    "param.rabitQGraph.builder.neighbor_cnt": 256,
    "param.rabitQGraph.builder.ef_construction": 512,
    "param.rabitQGraph.builder.quantized_bit_count": 8,
    "param.rabitQGraph.builder.cluster_count": 128,
    "param.rabitQGraph.builder.thread_count": 8,
    "param.rabitQGraph.searcher.ef": 300,
    "param.rabitQGraph.searcher.max_scan_ratio": 0.1
  }
}

{
  "index_params": {
    "param.rabitQGraph.builder.neighbor_cnt": 64,
    "param.rabitQGraph.builder.ef_construction": 200,
    "param.rabitQGraph.builder.quantized_bit_count": 1,
    "param.rabitQGraph.builder.cluster_count": 32,
    "param.rabitQGraph.searcher.ef": 150,
    "param.rabitQGraph.searcher.max_scan_ratio": 0.03
  }
}

{
  "index_params": {
    "param.rabitQGraph.builder.neighbor_cnt": 128,
    "param.rabitQGraph.builder.ef_construction": 400,
    "param.rabitQGraph.builder.quantized_bit_count": 4,
    "param.rabitQGraph.builder.cluster_count": 64,
    "param.rabitQGraph.searcher.ef": 250
  }
}

QGraph

QGraph (Quantized Graph) inherits all HNSW builder and searcher parameters and adds the following quantization parameters. Use quantizer at the top level for simple configuration, or use index_params for fine-grained control.

Additional builder parameters

Parameter	Type	Default	Options
`proxima.qgraph.builder.quantizer_class`	String	—	`Int8QuantizerConverter`, `Int4QuantizerConverter`, `HalfFloatConverter`, `DoubleBitConverter`
`proxima.qgraph.builder.quantizer_params`	Object	—	Quantizer-specific configuration.

Configuration examples

{
  "index_params": {
    "proxima.hnsw.builder.max_neighbor_count": 32,
    "proxima.hnsw.builder.efconstruction": 400,
    "proxima.hnsw.builder.thread_count": 4,
    "proxima.qgraph.builder.quantizer_class": "Int8QuantizerConverter",
    "proxima.qgraph.builder.quantizer_params": {},
    "proxima.hnsw.searcher.ef": 300
  }
}

{
  "index_params": {
    "proxima.hnsw.builder.max_neighbor_count": 48,
    "proxima.hnsw.builder.efconstruction": 500,
    "proxima.hnsw.builder.thread_count": 6,
    "proxima.qgraph.builder.quantizer_class": "Int4QuantizerConverter",
    "proxima.qgraph.builder.quantizer_params": {},
    "proxima.hnsw.searcher.ef": 400,
    "proxima.hnsw.searcher.max_scan_ratio": 0.1
  }
}

{
  "index_params": {
    "proxima.hnsw.builder.max_neighbor_count": 64,
    "proxima.hnsw.builder.efconstruction": 600,
    "proxima.hnsw.builder.thread_count": 8,
    "proxima.qgraph.builder.quantizer_class": "HalfFloatConverter",
    "proxima.qgraph.builder.quantizer_params": {},
    "proxima.hnsw.searcher.ef": 500
  }
}

QC

QC (Quantization Clustering) uses a clustering-based index. Parameters use the proxima.qc. namespace. Build time is longer than graph-based algorithms, but runtime memory usage is lower.

Builder parameters

Parameter	Type	Default	Description
`proxima.qc.builder.train_sample_count`	uint32	0	Training data size. `0` uses all documents.
`proxima.qc.builder.thread_count`	uint32	0	Construction threads. `0` uses all CPU cores.
`proxima.qc.builder.centroid_count`	String	—	Number of cluster centroids. Supports hierarchical clustering using `*` as a separator, for example `"100*100"`.
`proxima.qc.builder.cluster_class`	String	`OptKmeansCluster`	Clustering method.
`proxima.qc.builder.cluster_auto_tuning`	bool	false	Automatically tunes the centroid count.
`proxima.qc.builder.optimizer_class`	String	`HcBuilder`	Centroid optimizer for improving classification precision.
`proxima.qc.builder.optimizer_params`	IndexParams	—	Build and retrieval parameters for the optimizer.
`proxima.qc.builder.converter_class`	String	—	Automatically applies MIPS transformation when the similarity metric is inner product.
`proxima.qc.builder.converter_params`	IndexParams	—	Initialization parameters for `converter_class`.
`proxima.qc.builder.quantizer_class`	String	—	Quantizer type. Options: `Int8QuantizerConverter`, `Int4QuantizerConverter`, and others.
`proxima.qc.builder.quantizer_params`	IndexParams	—	Quantizer configuration.
`proxima.qc.builder.quantize_by_centroid`	bool	false	Quantizes vectors by centroid when using `quantizer_class`.
`proxima.qc.builder.store_original_features`	bool	false	Stores original (unquantized) vectors alongside the index.

Searcher parameters

Parameter	Type	Default	Description
`proxima.qc.searcher.scan_ratio`	float	0.01	Used to calculate max_scan_num: total doc count × scan_ratio.
`proxima.qc.searcher.optimizer_params`	IndexParams	—	Runtime retrieval parameters for the optimizer used during the build.
`proxima.qc.searcher.brute_force_threshold`	int	1000	If the total document count is below this value, the engine uses linear search.
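
The scan_ratio formula is easiest to judge with concrete numbers. The sketch below applies max_scan_num = total doc count × scan_ratio; rounding to the nearest integer is an assumption, because the engine's rounding behavior is not stated here.

```python
def max_scan_num(total_docs, scan_ratio=0.01):
    """Number of documents scanned per query under the scan_ratio formula."""
    # Rounding to the nearest integer is an assumption.
    return round(total_docs * scan_ratio)


print(max_scan_num(1_000_000))        # default ratio 0.01
print(max_scan_num(5_000_000, 0.05))
```

With the default ratio, a 1,000,000-document index scans about 10,000 documents per query. Raising the ratio to 0.05 scans five times as many at the same index size.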

Configuration examples

{
  "index_params": {
    "proxima.qc.builder.thread_count": 4,
    "proxima.qc.builder.centroid_count": "1000",
    "proxima.qc.builder.cluster_class": "OptKmeansCluster",
    "proxima.qc.searcher.scan_ratio": 0.02
  }
}

{
  "index_params": {
    "proxima.qc.builder.thread_count": 8,
    "proxima.qc.builder.centroid_count": "100*100",
    "proxima.qc.builder.optimizer_class": "HnswBuilder",
    "proxima.qc.builder.quantizer_class": "Int8QuantizerConverter",
    "proxima.qc.searcher.scan_ratio": 0.01
  }
}

{
  "index_params": {
    "proxima.qc.builder.thread_count": 12,
    "proxima.qc.builder.centroid_count": "2000",
    "proxima.qc.builder.train_sample_count": 100000,
    "proxima.qc.builder.store_original_features": true,
    "proxima.qc.searcher.scan_ratio": 0.05
  }
}

Linear

Linear performs a brute-force search across all vectors. It delivers 100% recall but query time grows linearly with document count. The engine selects Linear automatically when the document count is below brute_force_threshold (default 1,000) or linear_build_threshold.
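
The automatic selection rule can be sketched as follows. Treating the two thresholds as independent conditions, either of which triggers Linear, is an assumption drawn from the sentence above; `uses_linear_search` is a hypothetical helper, and the defaults are the documented ones.

```python
def uses_linear_search(doc_count, brute_force_threshold=1000, linear_build_threshold=0):
    """Return True if the document count falls below either threshold."""
    return doc_count < brute_force_threshold or doc_count < linear_build_threshold


print(uses_linear_search(500))
print(uses_linear_search(3000))
print(uses_linear_search(3000, linear_build_threshold=5000))
```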

Parameters use the proxima.linear. namespace.

Builder and searcher parameters

Parameter	Type	Default	Description
`proxima.linear.builder.column_major_order`	String	`"false"`	Storage order for vectors. `"false"` = row-major, `"true"` = column-major.
`proxima.linear.searcher.read_block_size`	uint32	1048576	Block size (in bytes) read into memory per search iteration. The recommended value is 1 MB (1048576).

Configuration examples

{
  "index_params": {
    "proxima.linear.builder.column_major_order": "false",
    "proxima.linear.searcher.read_block_size": 1048576
  }
}

{
  "index_params": {
    "proxima.linear.builder.column_major_order": "true",
    "proxima.linear.searcher.read_block_size": 2097152
  }
}

Appendix: Complete index configuration examples

HNSW

PUT /hnsw_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW"
        }
      }
    }
  }
}

PUT /hnsw_performance
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "dot_product",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "HNSW",
          "m": 48,
          "ef_construction": 500,
          "thread_count": 8,
          "linear_build_threshold": 1000,
          "is_embedding_saved": true,
          "embedding_load_strategy": "ANN_INDEX_FILE",
          "index_load_strategy": "MEM"
        }
      }
    }
  }
}

Linear

PUT /linear_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "Linear"
        }
      }
    }
  }
}

QC

PUT /qc_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "QC"
        }
      }
    }
  }
}

PUT /qc_custom
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "dot_product",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "QC",
          "thread_count": 8,
          "linear_build_threshold": 5000,
          "index_params": {
            "proxima.qc.builder.thread_count": 8,
            "proxima.qc.builder.centroid_count": "2000"
          }
        }
      }
    }
  }
}

QGraph

PUT /qgraph_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "QGraph",
          "quantizer": "int8"
        }
      }
    }
  }
}

PUT /qgraph_high_precision
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 1536,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "QGraph",
          "m": 40,
          "ef_construction": 600,
          "thread_count": 8,
          "quantizer": "fp16",
          "is_embedding_saved": true,
          "index_load_strategy": "MEM"
        }
      }
    }
  }
}

PUT /qgraph_memory_optimized
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "QGraph",
          "m": 24,
          "ef_construction": 300,
          "thread_count": 6,
          "quantizer": "int4",
          "is_embedding_saved": false,
          "index_load_strategy": "BUFFER"
        }
      }
    }
  }
}

RabitQGraph

PUT /rabitqgraph_basic
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 64,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "RabitQGraph"
        }
      }
    }
  }
}

PUT /rabitqgraph_performance
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 128,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "RabitQGraph",
          "thread_count": 8,
          "index_params": {
            "param.rabitQGraph.builder.neighbor_cnt": 256,
            "param.rabitQGraph.builder.ef_construction": 512,
            "param.rabitQGraph.builder.quantized_bit_count": 4,
            "param.rabitQGraph.builder.cluster_count": 128,
            "param.rabitQGraph.searcher.ef": 300
          }
        }
      }
    }
  }
}

PUT /rabitqgraph_memory_optimized
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 192,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "RabitQGraph",
          "thread_count": 4,
          "linear_build_threshold": 1000,
          "index_params": {
            "param.rabitQGraph.builder.neighbor_cnt": 64,
            "param.rabitQGraph.builder.ef_construction": 200,
            "param.rabitQGraph.builder.quantized_bit_count": 1,
            "param.rabitQGraph.builder.cluster_count": 32,
            "param.rabitQGraph.searcher.ef": 150,
            "param.rabitQGraph.searcher.max_scan_ratio": 0.05
          }
        }
      }
    }
  }
}

PUT /rabitqgraph_with_tags
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 256,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "havenask_native",
          "knn_type": "RabitQGraph",
          "tags": ["category", "region"],
          "index_params": {
            "param.rabitQGraph.builder.neighbor_cnt": 128,
            "param.rabitQGraph.builder.ef_construction": 400,
            "param.rabitQGraph.builder.quantized_bit_count": 8,
            "param.rabitQGraph.searcher.ef": 250
          }
        }
      },
      "category": {
        "type": "keyword"
      },
      "region": {
        "type": "keyword"
      }
    }
  }
}
