Quantized clustering (QC) partitions a vector space into clusters and searches only the most relevant clusters at query time, which reduces memory usage and improves query speed compared with scanning all vectors. Use this reference when configuring the index builder (QcBuilder) or the searcher (QcSearcher) for an OpenSearch Retrieval Engine Edition instance.
QcBuilder
QcBuilder parameters control how the quantized clustering index is built from your documents.
| Parameter | Type | Default value | Description |
|---|---|---|---|
| qc.builder.train_sample_count | uint32 | 0 | Number of documents used as training data. Set to 0 to use all documents. |
| qc.builder.thread_count | uint32 | 0 | Number of threads used during index building. Set to 0 to match the number of CPU cores of the instance. |
| qc.builder.centroid_count | string | Optional | Number of centroids per cluster level. Supports hierarchical clusters: separate levels with an asterisk (`*`). For one level: `1000`. For two levels: `100*100`. For two-level hierarchical clusters, set more centroids at the first level than at the second level; the first level delivers 10x the search gain of the second level. Leave this parameter unset to let the system infer the appropriate count automatically. |
| qc.builder.quantizer_class | string | (none) | Quantizer applied to vector data. Specifying a quantizer reduces index size and improves query performance, but may reduce retrieval accuracy in some cases. Valid values: Int8QuantizerConverter, HalfFloatConverter, DoubleBitConverter. |
| qc.builder.quantize_by_centroid | bool | False | Whether to perform quantization relative to each centroid's local coordinate space. Takes effect only when qc.builder.quantizer_class is set to Int8QuantizerConverter. |
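
The builder parameters above can be collected and sanity-checked before they are applied. The following Python sketch is illustrative only: the helper names `parse_centroid_count` and `quantize_by_centroid_effective` are hypothetical, the values are examples, and how the parameters are actually submitted to an instance is not covered here. It shows how a `qc.builder.centroid_count` string splits into per-level counts and when `qc.builder.quantize_by_centroid` has any effect.

```python
from typing import Dict, List

VALID_QUANTIZERS = {"Int8QuantizerConverter", "HalfFloatConverter", "DoubleBitConverter"}


def parse_centroid_count(value: str) -> List[int]:
    """Split a centroid_count string such as "1000" or "100*100" into per-level counts."""
    levels = [int(part) for part in value.split("*")]
    if any(n <= 0 for n in levels):
        raise ValueError(f"centroid counts must be positive: {value!r}")
    return levels


def quantize_by_centroid_effective(params: Dict[str, object]) -> bool:
    """Return True only when quantize_by_centroid is enabled AND the quantizer is
    Int8QuantizerConverter, the single case in which the table says it takes effect."""
    return (
        bool(params.get("qc.builder.quantize_by_centroid", False))
        and params.get("qc.builder.quantizer_class") == "Int8QuantizerConverter"
    )


# Example values only; the keys are the parameter names from the table above.
builder_params = {
    "qc.builder.train_sample_count": 0,   # 0 = train on all documents
    "qc.builder.thread_count": 0,         # 0 = match the instance's CPU core count
    "qc.builder.centroid_count": "1000*100",
    "qc.builder.quantizer_class": "Int8QuantizerConverter",
    "qc.builder.quantize_by_centroid": True,
}

assert builder_params["qc.builder.quantizer_class"] in VALID_QUANTIZERS
levels = parse_centroid_count(builder_params["qc.builder.centroid_count"])
print("centroids per level:", levels)
print("quantize_by_centroid effective:", quantize_by_centroid_effective(builder_params))
```

Omitting `qc.builder.centroid_count` from such a dictionary corresponds to leaving the parameter unset, in which case the system infers the count.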
QcSearcher
QcSearcher parameters control how many clusters are scanned at query time. Adjust these to tune the speed-versus-accuracy trade-off without rebuilding the index.
| Parameter | Type | Default value | Description |
|---|---|---|---|
| qc.searcher.scan_ratio | float | 0.01 | Maximum fraction of documents scanned per query. Used to derive max_scan_num with the formula: max_scan_num = total documents × scan_ratio. Increase this value to improve recall at the cost of higher query latency. |
| qc.searcher.brute_force_threshold | int | 1000 | Document count below which linear retrieval is performed instead of cluster-based search. When the total number of documents is less than this value, the system scans all vectors directly. |
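
To get a feel for how the two searcher parameters interact, the Python sketch below applies the formula and the threshold rule from the table to a few document counts. The function names are hypothetical, and truncating `max_scan_num` to an integer is an assumption; the engine's exact rounding is not documented here.

```python
def max_scan_num(total_docs: int, scan_ratio: float = 0.01) -> int:
    """max_scan_num = total documents x scan_ratio (integer truncation is an assumption)."""
    if not 0.0 < scan_ratio <= 1.0:
        raise ValueError("scan_ratio is a fraction and should be in (0, 1]")
    return int(total_docs * scan_ratio)


def uses_linear_retrieval(total_docs: int, brute_force_threshold: int = 1000) -> bool:
    """Linear retrieval applies when the document count is below the threshold."""
    return total_docs < brute_force_threshold


for total in (500, 1_000, 1_000_000):
    if uses_linear_retrieval(total):
        print(f"{total} docs: linear retrieval over all vectors")
    else:
        print(f"{total} docs: cluster search, at most {max_scan_num(total)} docs scanned")
```

With the defaults, an index of 1,000,000 documents scans at most 10,000 documents per query; raising `qc.searcher.scan_ratio` to 0.05 raises that ceiling to 50,000, trading latency for recall.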