Supported cluster and index configurations for retrieval-augmented generation applications

This topic describes the whitelists for cluster configurations, index configurations, processors, and queries supported by retrieval-augmented generation applications (Version 8.17) in Alibaba Cloud Elasticsearch Serverless.

Usage notes

To ensure service stability, Alibaba Cloud Elasticsearch Serverless (ES Serverless) imposes certain limits on application resources and usage. The whitelists in this topic describe the available features and configurations. However, some whitelisted configurations and features have parameter range constraints. If you exceed these ranges, you must adjust service quotas or submit a ticket to request access.

Cluster configuration whitelist

Cluster configurations define and control the behavior of an entire Elasticsearch cluster to ensure its stability, performance, security, and scalability. The following table lists the cluster configurations that you can modify for retrieval-augmented generation applications (Version 8.17).

Note

Alibaba Cloud Elasticsearch Serverless manages resources as applications and is not based on the concept of clusters. However, you can use cluster-level configurations from native Elasticsearch to adjust the application parameters for Elasticsearch Serverless. The term 'cluster configuration' is used here for consistency with native Elasticsearch. For more information about cluster-level configurations, see the `GET /_cluster/settings` API and its `include_defaults` parameter.

Configuration item	Description
action.destructive_requires_name	Controls whether you must specify an explicit index name when you perform high-risk operations. This prevents the accidental deletion of multiple indexes or large amounts of data. High-risk operations are those that can cause data loss or make a cluster unavailable, such as deleting an index, shutting down a cluster, and purging a cache. Valid values: `true`: You must specify an index name. The system rejects requests that use a wildcard character (`*`) or `_all` for fuzzy matching, such as `DELETE /my_*`. `false` (default): You do not need to specify an index name. You can perform fuzzy matching.
apack.tenant.index_settings.throw_unsupported_setting	Controls whether an error is reported if an unsupported configuration parameter is used when you create or update an index. Valid values: `true`: An error is reported, and the operation to create or update the index fails. `false` (default): No error is reported. The system ignores the unsupported configuration parameter and continues to create or update the index.
action.auto_create_index	Controls whether to automatically create indexes. Valid values: `true` (default): Automatic creation is supported. When data is written to an index that does not exist, the system automatically creates the index. `false`: Automatic creation is not supported. When data is written to an index that does not exist, the system reports an error. You must manually create the index before you can write data to it.
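
The following Python sketch shows one way to assemble a request body for the three whitelisted cluster settings before sending it. It assumes that ES Serverless accepts the native Elasticsearch `PUT /_cluster/settings` body with a `persistent` section. The helper name `build_cluster_settings_body` is hypothetical and not part of any client library.

```python
import json

# Setting names copied from the cluster configuration whitelist table.
CLUSTER_SETTING_WHITELIST = {
    "action.destructive_requires_name",
    "apack.tenant.index_settings.throw_unsupported_setting",
    "action.auto_create_index",
}


def build_cluster_settings_body(settings: dict) -> dict:
    """Return a body for PUT /_cluster/settings, rejecting non-whitelisted keys."""
    unsupported = sorted(set(settings) - CLUSTER_SETTING_WHITELIST)
    if unsupported:
        raise ValueError(f"settings not in the whitelist: {unsupported}")
    # Assumption: the native "persistent" section is accepted by ES Serverless.
    return {"persistent": dict(settings)}


body = build_cluster_settings_body({
    "action.destructive_requires_name": True,
    "action.auto_create_index": False,
})
print(json.dumps(body, indent=2))
```

Checking the keys on the client side is only a convenience. The service itself decides which settings it accepts.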

Index configuration whitelist

Background information

In Elasticsearch, you can group documents with similar structures into the same index to quickly search and query specific content. The mapping and settings of an index are the core components that define its behavior and structure. Together, they determine how data is stored, queried, and managed.

Mapping: Defines the structure of documents in an index, including the data types of fields (such as text, integer, and date) and their behavior (such as whether to tokenize, store original values, or support sorting and aggregation).
Settings: Define the global behavior and related constraints at the index level. They control the storage, performance, and availability of the index, along with advanced data-related rules. These configurations directly affect how data is distributed, how searches are executed, and how resources are allocated.

This topic describes the whitelists for field data types and index configurations that are supported by retrieval-augmented generation applications (Version 8.17).

Note

For more information about how to create and configure an index, see the Update index settings API.

Supported field data types

Field data types determine how data is stored and queried in Elasticsearch. The following table lists the field data types supported by retrieval-augmented generation applications (Version 8.17).

Category	Field data type	Description
Numeric types	byte	An 8-bit signed integer that ranges from `-128` to `127`. Use this type to store integers in a small range.
	short	A 16-bit signed integer that ranges from `-32,768` to `32,767`. Use this type to store integers in a medium range.
	integer	A 32-bit signed integer that ranges from `-2^31` to `2^31-1`. Use this type to store general-purpose integers, such as inventory levels or order numbers.
	long	A 64-bit signed integer that ranges from `-2^63` to `2^63-1`. Use this type to store large integer values, such as UNIX timestamps or IP addresses.
	half_float	A 16-bit floating-point number with low precision. Use this type to store small floating-point values and save storage space.
	float	A 32-bit single-precision floating-point number. Use this type to store floating-point values with higher precision, such as temperatures or prices.
	double	A 64-bit double-precision floating-point number. Use this type to store high-precision data, such as for scientific or financial computing.
	scaled_float	A floating-point value that is stored as a `long` after it is multiplied by a fixed scaling factor, which saves storage space.
	unsigned_long	A 64-bit unsigned integer that ranges from `0` to `2^64-1`. Use this type to store non-negative integers, such as very large IDs.
Range types	integer_range	Stores a range of integers, including a start value and an end value. Use this type for integer interval queries, such as for an age range.
	float_range	Stores a range of floating-point numbers, including a start value and an end value. Use this type for floating-point interval queries, such as for a temperature range or price range that includes decimals.
	long_range	Stores a range of long integers, including a start value and an end value. Use this type for large integer interval queries, such as querying for records with an order ID in a specified range.
	double_range	Stores a range of double-precision floating-point numbers, including a start value and an end value. Use this type for high-precision numerical interval queries, such as for a price fluctuation range in financial data like `199.99 ~ 999.99`.
	ip_range	Stores a range of IP addresses. Use this type for IP address segment matching, such as blocking IP addresses in a specified range.
	date_range	Stores a date range, including a start time and an end time. Use this type for time interval queries, such as for an event time or order validity period.
Sub-data types (used to handle nested or complex data structures)	object	Stores a JSON object that contains unstructured nested data, such as user configuration information.
	nested	A queryable nested object that supports independent indexing. Use this type to store multiple independent sub-objects, such as product reviews or order details.
Text and keyword types	text	Used for full-text search and supports tokenization. This type is suitable for unstructured content, such as product descriptions or article content.
	keyword	Used for exact matching and is not tokenized. This type is suitable for structured content, such as IDs or email addresses.
	constant_keyword	Stores a static field that has the same value in every document in an index. For example, a version number or an environment identifier like `production`.
	match_only_text	A space-optimized variant of `text` that disables relevance scoring. This type is suitable for full-text search scenarios where scoring is not required, such as log messages, and it saves storage space.
Date and time types	date	A date and time format that is accurate to the millisecond. Use this type to store time series data, such as log data, or event dates.
Date and time types	date_nanos	A date and time format that is accurate to the nanosecond. Use this type to store high-precision timestamps, such as for financial transactions.
Binary type	binary	Directly stores raw binary data, such as a binary log.
Boolean type	boolean	Stores a Boolean value (`true` or `false`). Use this type for conditional filtering or to check the status of a switch.
Geolocation types	geo_point	Stores geographic coordinates (latitude and longitude of a single point). Use this type for location point queries, such as for store coordinates.
	geo_shape	Stores complex geometric shapes, such as polygons. Use this type for area matching, such as for administrative region boundary queries.
	point	Stores Cartesian coordinates (`x` and `y` of a single point) in a two-dimensional planar coordinate system. Use this type for point queries on planar data rather than on latitude and longitude.
IP and network type	ip	Used for storing and querying IPv4 or IPv6 addresses.
Vector types	dense_vector	Stores dense vectors. This type is often used in machine learning for things like image vectors or text embeddings.
Vector types	sparse_vector	Stores sparse vectors by storing only non-zero values and their indexes. This type is often used for high-dimensional sparse data, such as text features.
Other types	flattened	Maps an entire JSON object as a single field and indexes its leaf values as keywords. Use this type for objects with a large or unknown number of keys, or for storing unstructured data.
	wildcard	Stores keyword-like values that are optimized for wildcard and regular expression queries, which find text that matches a specific pattern in a field. For example, use `"?at"` to match `cat` or `hat`.
	shape	Stores arbitrary Cartesian geometric shapes, such as polygons and lines. Use this type to handle complex area ranges and spatial relationship queries in a planar coordinate system.
	alias	Defines an alternate name for a field in an index. You can use the alias in place of the target field in search requests.
	version	Stores software version values that follow Semantic Versioning rules, such as `1.2.0`, so that they are ordered and compared correctly in sorting and range queries.
	search_as_you_type	Provides real-time search suggestions. As you type a search query character by character, the system immediately returns relevant, partially matched results.
	semantic_text	Automatically converts text content into semantic embeddings through an inference endpoint and intelligently chunks long text for efficient processing of large-scale corpora. By encapsulating the complex vectorization process into a field-level feature, it significantly lowers the technical barrier to building intelligent semantic search applications.
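
As an illustration of how several of the field data types above fit together, the following Python sketch builds a create-index request body and checks whether sample values fit the signed integer ranges listed in the table. The field names (`doc_id`, `content`, `embedding`, and so on) and the vector dimension are hypothetical, and the helper `fits_signed` is not part of Elasticsearch.

```python
import json

# Hypothetical field names for an index of document chunks.
index_body = {
    "mappings": {
        "properties": {
            "doc_id": {"type": "keyword"},
            "content": {"type": "text"},
            "published_at": {"type": "date"},
            "page_count": {"type": "integer"},
            "embedding": {"type": "dense_vector", "dims": 3},
            "reviews": {
                "type": "nested",
                "properties": {
                    "rating": {"type": "byte"},
                    "comment": {"type": "text"},
                },
            },
        }
    }
}


def fits_signed(value: int, bits: int) -> bool:
    """Check whether value fits a signed integer type of the given bit width."""
    return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1


print(json.dumps(index_body, indent=2))
print(fits_signed(127, 8), fits_signed(128, 8))  # byte accepts 127 but not 128
```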

Supported index configurations

The supported index configurations for retrieval-augmented generation applications (Version 8.17) are divided into mapping limit configurations and general index configurations. You can configure them as needed.

Mapping limit configurations

Index mapping limit configurations are used to restrict the number of field mappings and prevent an excessive number of mapped fields from degrading index performance. The following table lists the mapping limit configurations supported by retrieval-augmented generation applications (Version 8.17).

Configuration item	Description
index.mapping.field_name_length.limit	Sets the maximum length of a field name to prevent long field names from consuming too much memory.
index.mapping.depth.limit	Sets the maximum depth of a field, measured as the number of inner objects, to limit the complexity of nested objects and prevent memory overflow. The default value is `20`.
index.mapping.nested_fields.limit	Sets the maximum number of `nested` fields to prevent an excessive number of nested mappings from degrading query performance. The default value is `50`.
index.mapping.nested_objects.limit	Sets the maximum number of nested objects in a single document to prevent documents from containing too many nested objects, which can lead to insufficient memory. The default value is `10000`.
index.mapping.total_fields.limit	Sets the maximum number of fields in an index to prevent an excessive number of fields from causing index bloat and affecting query performance. The default value is `1000`.
index.mapping.coerce	Specifies whether to coerce the data type of an input value, such as converting the string `"123"` to the numeric value `123`. Valid values: `true`: Coercion is enabled. The input value of the current field is converted to the type defined for that field. `false` (default): Coercion is disabled. The input value is stored in its original type. A type mismatch may cause an error.
index.mapping.ignore_malformed	Specifies whether to ignore data that does not conform to the field type. Valid values: `true`: Malformed data is ignored. Data with an incorrect type can be inserted, but it will not be indexed. `false` (default): Malformed data is not ignored. Data with an incorrect type cannot be inserted, and an error is reported, which affects the insertion of other data.
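
To make the field-count and depth limits concrete, the following Python sketch estimates both numbers for a mapping before it is submitted. It is an approximation under stated assumptions: it counts every field and object mapping once, ignores multi-fields and aliases, and treats a mapping with only root-level fields as depth 1. The function names and the sample mapping are hypothetical.

```python
def count_fields(properties: dict) -> int:
    """Count field and object mappings, including sub-properties."""
    total = 0
    for field in properties.values():
        total += 1
        total += count_fields(field.get("properties", {}))
    return total


def max_depth(properties: dict) -> int:
    """Depth 1 means all fields sit at the root; each inner object adds 1."""
    if not properties:
        return 0
    return 1 + max(max_depth(f.get("properties", {})) for f in properties.values())


# Hypothetical mapping with two levels of inner objects.
properties = {
    "title": {"type": "text"},
    "author": {
        "properties": {
            "name": {"type": "keyword"},
            "address": {"properties": {"city": {"type": "keyword"}}},
        }
    },
}

# Default values taken from the table above.
limits = {"index.mapping.total_fields.limit": 1000, "index.mapping.depth.limit": 20}

within_limits = (
    count_fields(properties) <= limits["index.mapping.total_fields.limit"]
    and max_depth(properties) <= limits["index.mapping.depth.limit"]
)
print(count_fields(properties), max_depth(properties), within_limits)
```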

General index configurations

General index configurations define the underlying behavior and resource allocation of an index, such as the number of shards and tokenization rules. Proper index configuration helps improve query and write performance, increase resource utilization, and efficiently manage data. The following table lists the general index configurations supported by retrieval-augmented generation applications (Version 8.17).

Category	Configuration item	Description
Core index configurations	index.number_of_shards	Sets the number of primary shards for an index. This controls data distribution and parallel processing capabilities. The default value is `1`.
	index.codec	Sets the compression format used to store data. Valid values: `default` (default): Medium compression. Suitable for scenarios that require fast retrieval and have sufficient storage space. `best_compression`: Very high compression. Suitable for scenarios where indexes are read infrequently or storage is limited.
	index.refresh_interval	Sets how often a refresh runs. A refresh makes recently indexed documents visible to search. A longer interval improves write performance, and a shorter interval makes new data searchable sooner. The default value is `1s`.
Analyzers and pipelines	index.analysis.*	Customizes analyzers, tokenizers, and filters to control the tokenization rules for text fields.
	index.default_pipeline	The default data pipeline used when inserting documents. It is used for system pre-processing of data, such as format conversion or field calculation.
	index.final_pipeline	The final processing pipeline after a document is inserted. It is used to ensure that data complies with specifications, such as by filtering encrypted or sensitive fields.
Query and sorting	index.query.default_field	Defines which fields are matched by default when a query string does not specify a field. For example, it can automatically match the query statement `query: "text"` to the `content` field.
Query and sorting	index.sort.*	Defines the sorting rules for documents within a shard to accelerate queries, such as sorting by time.
Index blocking	index.blocks.*	Restricts the types of operations that can be performed on an index, such as write, delete, and metadata access. This is used to temporarily lock an index for maintenance or debugging.
Index performance	index.max_adjacency_matrix_filters	Sets the complexity threshold for adjacency matrix queries to prevent memory overflow or performance degradation due to too many or complex conditions. The default value is `1000`.
	index.max_docvalue_fields_search	The maximum number of `docvalue_fields` allowed in a single search request. This is used to limit the resource consumption of aggregation queries. The default value is `100`. Note: `docvalue_fields` is a search parameter used to return the original values of specified fields in the search results.
	index.max_inner_result_window	Sets the maximum result window size for `inner_hits` in a nested query or aggregation to limit the memory usage of complex queries. The default value is `100`. Note: `inner_hits` is used to retrieve specific information about nested documents in the context of a nested query.
	index.max_ngram_diff	Sets the maximum difference in length between n-grams in an Ngram tokenizer to prevent index bloat or performance degradation due to improper tokenizer configuration.
	index.max_refresh_listeners	Sets the maximum number of refresh listeners for a single index to prevent an excessive number of listeners from affecting refresh performance.
	index.max_regex_length	Sets the maximum length of a regular expression or prefix query to prevent long content from affecting query speed. The default value is `1000`.
	index.max_rescore_window	Sets the maximum document window size for a rescore operation, which reorders results using a more complex scoring model, to limit the resource consumption of the process. The default value is `10000`.
	index.max_result_window	Sets the maximum result window size for paged queries to prevent high-overhead paging, such as `from: 100000, size: 10`. The default value is `10000`.
	index.max_script_fields	Sets the maximum number of script fields in a single search request to limit the resource consumption of script calculations. The default value is `32`.
	index.max_shingle_diff	Controls the maximum difference in length between shingles in a Shingle tokenizer to prevent the tokenizer from generating too many combined words. The default value is `3`.
	index.max_terms_count	Sets the maximum number of conditions that can be added in a single query or aggregation to prevent an overly large condition list from causing insufficient memory and affecting performance. The default value is `65536`.
	index.max_prefix_length	Sets the maximum length of a prefix query to prevent excessively long prefixes from affecting query performance. The default value is `0`, which means there is no limit.
	index.max_wildcard_length	Sets the maximum length of a wildcard query to limit its complexity and avoid full table scans. The default value is `10`.
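
The following Python sketch illustrates a settings body that uses a few of the configurations above, together with a check that mirrors the `index.max_result_window` rule for paged queries. It assumes the native Elasticsearch behavior in which `from + size` must not exceed the window. The helper name `exceeds_result_window` is hypothetical.

```python
import json

# Example values only; the defaults are listed in the table above.
index_settings = {
    "settings": {
        "index": {
            "number_of_shards": 1,
            "refresh_interval": "1s",
            "codec": "best_compression",
            "max_result_window": 10000,
        }
    }
}


def exceeds_result_window(from_: int, size: int, max_result_window: int = 10000) -> bool:
    """Return True if a from/size page reaches past the result window."""
    return from_ + size > max_result_window


print(json.dumps(index_settings, indent=2))
print(exceeds_result_window(100000, 10))  # the high-overhead paging example
print(exceeds_result_window(9990, 10))    # the last page that still fits
```

For pages beyond the window, the `search_after` parameter in the query whitelist is the usual alternative to raising the limit.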

Processor whitelist

Before a document is indexed in Elasticsearch, you can use a processor to transform, clean, enrich, or filter its content. The following table lists the ingest pipeline processors in the whitelist for retrieval-augmented generation applications (Version 8.17).

Processor type	Processor	Description
Basic data operations	append	Appends one or more values to a field, which must be an array.
	set	Sets the value of a field. If the field does not exist, it is created. If the field exists, its value is overwritten.
	remove	Deletes one or more fields.
	rename	Renames a field.
	drop	Discards the entire document. The document is not indexed.
	fail	Raises an exception, which stops pipeline execution and returns an error.
	pipeline	Calls another ingest pipeline.
	terminate	Immediately terminates the pipeline execution. Subsequent processors do not run.
Data type conversion and formatting	convert	Converts the data type of a field.
	bytes	Converts a byte size string, such as `1kb`, to the number of bytes (an integer).
	date	Parses a time string and sets it as the value of `@timestamp` or another field.
	sort	Sorts the array values in a field.
String and text processing	trim	Removes leading and trailing whitespace from a string.
	split	Splits a string into an array based on a separator.
	join	Merges an array into a string.
	uppercase	Converts a string to uppercase.
	lowercase	Converts a string to lowercase.
	gsub	Replaces content in a string using a regular expression.
	urldecode	Decodes a URL-encoded string.
	html_strip	Removes HTML tags.
Structured parsing	dissect	Extracts fields using simple pattern matching.
	grok	Parses unstructured text using regular expression patterns.
	kv	Extracts fields from a key-value pair string.
	json	Parses a JSON string into an object field.
	csv	Parses a CSV string into multiple fields.
Geo and network information processing	geo_grid	Converts latitude and longitude to a geographic grid. This is often used for aggregations.
	ip_location	Resolves an IP address to a geographic location. This requires integration with GeoIP.
	network_direction	Determines the direction of network traffic.
	user_agent	Parses a User-Agent string to extract information such as the browser and operating system.
	uri_parts	Parses a URI to extract information such as the protocol, host, and path.
Data security	redact	Masks sensitive data in a field, such as hiding the middle digits of a phone number.
Data security	fingerprint	Generates a unique hash value for the content of a field. This is used to remove duplicates.
Advanced processors	attachment	Extracts text from file attachments, such as PDF or Word files.
	circle	Converts a circular area (center point and radius) into a regular polygon that approximates it, for use in geographic queries.
	community_id	Generates a `community ID` for a network flow. This is used for network traffic analysis.
	dot_expander	Expands a field name with dot notation into a nested object.
	foreach	Executes a set of processors for each element in an array.
Script and AI processing	script	Uses the Painless scripting language to write complex logic, such as conditional statements, mathematical operations, and dynamic field generation. Note: To use custom scripts, you must submit a ticket to add them to the whitelist.
Script and AI processing	inference	Calls a deployed machine learning model, such as an NLP model, for inference.
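
As a sketch of how whitelisted processors can be chained, the following Python code builds an ingest pipeline body in the native Elasticsearch format and lists the processor types that it uses. The field names, the description, and the helper `processor_types` are hypothetical. The whitelist set contains only a subset of the processor names from the table above.

```python
import json

# A subset of the processor names from the whitelist table.
PROCESSOR_WHITELIST_SUBSET = {"trim", "lowercase", "split", "set", "remove", "rename"}

# Hypothetical pipeline that normalizes a few fields before indexing.
pipeline_body = {
    "description": "Normalize title, category, and tags",
    "processors": [
        {"trim": {"field": "title"}},
        {"lowercase": {"field": "category"}},
        {"split": {"field": "tags", "separator": ","}},
        {"set": {"field": "source", "value": "upload"}},
        {"remove": {"field": "tmp", "ignore_missing": True}},
    ],
}


def processor_types(pipeline: dict) -> list:
    """Return the processor type of each step, in execution order."""
    return [next(iter(step)) for step in pipeline["processors"]]


used = processor_types(pipeline_body)
not_allowed = [name for name in used if name not in PROCESSOR_WHITELIST_SUBSET]
print(json.dumps(pipeline_body, indent=2))
print(used, not_allowed)
```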

Query whitelist

Query section whitelist

The following table lists the parameters supported by the outermost JSON object of a search request body in retrieval-augmented generation applications (Version 8.17).

Parameter	Description
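retriever	Specifies a retriever that returns the top documents for a search. It supports hybrid retrieval (keyword and vector), which simplifies building semantic search for retrieval-augmented generation (RAG) systems.
terminate_after	Limits the maximum number of documents collected for each shard. When the limit is reached, the query terminates early.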
min_score	Filters results based on a minimum score.
_source	Controls whether the `_source` field is returned.
stored_fields	Returns stored fields.
query	The main query condition.
post_filter	Filters results after the query. This does not affect aggregations.
knn	Performs an approximate nearest neighbor (ANN) search for vectors.
script_fields	Returns calculated fields from a script. Note: To use custom scripts, you must submit a ticket to add them to the whitelist.
indices_boost	Assigns weights to different indexes.
aggs / aggregations	Performs aggregation and analysis.
highlight	Highlights matching content.
rescore	A rescoring mechanism.
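slice	Splits a scroll or point in time search into multiple slices that can be consumed independently.
collapse	Collapses search results based on the values of a field, so that only the top document for each value is returned.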
pit	Performs a point in time search.
docvalue_fields	Returns `doc_values` fields.
fields	Returns specified fields.
search_after	Performs deep paging.
runtime_mappings	Defines runtime fields to dynamically calculate field values at query time without re-indexing. This is often used for log parsing, field transformation, and sensitive data masking. Note: Some `type` values are not yet supported. To use them, submit a ticket to add them to the whitelist.
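
The following Python sketch shows a search request body that uses only top-level parameters from the table above, plus a client-side check for keys outside the whitelist. The field names and values are hypothetical, and the helper `unsupported_keys` is not part of any client library.

```python
import json

# Top-level parameter names copied from the table above.
TOP_LEVEL_WHITELIST = {
    "retriever", "terminate_after", "min_score", "_source", "stored_fields",
    "query", "post_filter", "knn", "script_fields", "indices_boost",
    "aggs", "aggregations", "highlight", "rescore", "slice", "collapse",
    "pit", "docvalue_fields", "fields", "search_after", "runtime_mappings",
}

# Hypothetical search over a "content" text field and a "lang" keyword field.
search_body = {
    "query": {"match": {"content": "vector search"}},
    "post_filter": {"term": {"lang": "en"}},
    "_source": ["title", "content"],
    "min_score": 0.5,
    "terminate_after": 1000,
    "highlight": {"fields": {"content": {}}},
}


def unsupported_keys(body: dict) -> list:
    """Return the top-level keys of a search body that are not in the whitelist."""
    return sorted(set(body) - TOP_LEVEL_WHITELIST)


print(json.dumps(search_body, indent=2))
print(unsupported_keys(search_body))
```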

Query type whitelist

The following query types are supported by retrieval-augmented generation applications (Version 8.17).

Compound queries

Query type	Description
bool	Combines multiple query clauses. It supports Boolean logic such as `must` and `should`.
boosting	Lowers the score of documents that meet a certain condition, but still returns them.
constant_score	Gives all matching documents the same score. This is often used in filtering scenarios.
dis_max	Takes the highest score from multiple queries as the final score to avoid excessively high score accumulation.
function_score	Provides full control over the scoring of documents. It can be combined with scripts, randomness, and other factors for scoring.
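
To illustrate two of the compound queries above, the following Python sketch builds a `bool` query body and reproduces the `dis_max` scoring rule locally. The field names are hypothetical. The scoring helper assumes the native Elasticsearch rule in which the final score is the highest clause score plus `tie_breaker` multiplied by the sum of the other matching clause scores.

```python
import json

# Hypothetical bool query that combines scoring and non-scoring clauses.
bool_query = {
    "bool": {
        "must": [{"match": {"content": "elasticsearch serverless"}}],
        "filter": [
            {"term": {"lang": "en"}},
            {"range": {"published_at": {"gte": "2024-01-01"}}},
        ],
        "should": [{"match_phrase": {"title": "index configuration"}}],
        "must_not": [{"term": {"status": "draft"}}],
    }
}


def dis_max_score(clause_scores, tie_breaker=0.0):
    """Highest clause score plus tie_breaker times the sum of the other scores."""
    best = max(clause_scores)
    others = sum(clause_scores) - best
    return best + tie_breaker * others


print(json.dumps({"query": bool_query}, indent=2))
print(dis_max_score([1.0, 0.5, 0.2]))
print(dis_max_score([1.0, 0.5, 0.2], tie_breaker=0.5))
```

With the default `tie_breaker` of `0.0`, only the best clause contributes, which is how `dis_max` avoids the score accumulation that `bool` with several `should` clauses produces.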

Full-text queries

Query type	Description
intervals	Provides precise control over the position and order of words for complex text pattern matching.
match	A standard full-text search that tokenizes the input text and then performs a match.
match_bool_prefix	Matches a prefix. This is used in auto-completion scenarios.
match_phrase	Performs a phrase match, which requires words to be in the same order and adjacent.
match_phrase_prefix	Performs a phrase match and supports prefix matching for the last word.
combined_fields	Merges multiple fields into a virtual field for searching to improve cross-field relevance.
multi_match	Executes a `match` query on multiple fields.
query_string	Supports complex query syntax, such as `AND`, `OR`, `NOT`, `+`, and `-`.
simple_query_string	A simplified version of `query_string` with more user-friendly syntax and better fault tolerance.
common	Optimizes searches on long text by automatically distinguishing between high-frequency and low-frequency keywords.

Geo queries

Query type	Description
geo_distance	Filters geographic data that falls within a circular area defined by a centroid (latitude and longitude) and a distance range.

Shape queries

Query type	Description
shape	Determines whether a geographic shape in a document, such as a polygon or line, has a certain spatial relationship with a specified query shape, such as intersects or contains.

Joining queries

Query type	Description
nested	Used for exact matching of nested objects.

Match all

Query type	Description
match_all	Matches all documents in an index.

Span queries

Query type	Description
span_containing	Matches a span query that is contained within another span query.
span_first	Matches a span query and requires it to appear within the first N words of a field.
span_multi	Enables fuzzy matching for `span_term`. It is usually used with `wildcard`, `regexp`, or `prefix`.
span_near/span_gap	Matches multiple words and requires them to appear within a certain distance of each other, such as allowing three words in between. The order can be controlled.
span_not	Excludes the results of one span query from the results of another span query.
span_or	Matches if any of multiple span queries match.
span_term	Matches an exact word. It is similar to a `term` query but is used in a span query chain.
span_within	Matches a span query that is completely within the range of another span query.

Vector queries

Query type	Description
knn	The primary way that Elasticsearch implements dense vector search. It finds the `k` most similar documents by calculating the similarity between vectors, such as cosine similarity.
sparse_vector	Searches a `sparse_vector` field by using token weights, which are either precomputed or generated from the query text by an inference model.
text_expansion	Expands a query into a sparse vector to improve recall rate.
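
The following Python sketch shows a `knn` query body and a brute-force version of the similarity ranking that it describes. The field name `embedding`, the vectors, and the helper names are hypothetical. The request body follows the native Elasticsearch `knn` query syntax, which is assumed to apply here. The local ranking is exact, whereas Elasticsearch uses an approximate index, so results can differ on real data.

```python
import math

# Hypothetical knn query over a dense_vector field named "embedding".
knn_body = {
    "query": {
        "knn": {
            "field": "embedding",
            "query_vector": [1.0, 0.1],
            "k": 2,
            "num_candidates": 10,
        }
    }
}


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query_vector, docs, k):
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine_similarity(query_vector, docs[d]), reverse=True)
    return ranked[:k]


docs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
knn = knn_body["query"]["knn"]
print(top_k(knn["query_vector"], docs, knn["k"]))
```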

Specialized queries

Query type	Description
distance_feature	Boosts the score of documents that meet a condition based on a time or geographic distance.
script	Allows you to use the Painless scripting language to write custom Boolean logic to determine whether a document matches. Note: To use custom scripts, you must submit a ticket to add them to the whitelist.
script_score	`script_score` is part of `function_score` and lets you fully customize the `_score` of a document for personalized sorting. Note: To use custom scripts, you must submit a ticket to add them to the whitelist.
pinned	Allows you to manually specify that certain document IDs must appear in the search results and control their positions.

Term-level queries

Query type	Description
exists	Matches documents that contain the field and have a non-`null` value.
fuzzy	Matches a `term` with a similar spelling, based on the Levenshtein edit distance.
ids	Finds documents by their exact `_id`.
prefix	Matches a `term` that starts with a specified prefix.
range	Matches a range of numeric, date, or string values.
regexp	Matches a `term` using a regular expression.
term	Matches an exact `term`. The match must be exact and is case-sensitive.
terms	Matches if the field value is any of the values in a given list.
wildcard	Supports pattern matching with `*` (any sequence of characters, including none) and `?` (a single character).

Hybrid retrieval (Retriever rrf)

Query type	Description
rank_rrf	Enables Reciprocal Rank Fusion (RRF) sorting in a search request.
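
To show what RRF sorting computes, the following Python sketch fuses two ranked lists with the standard formula, in which each document receives the sum of `1 / (rank_constant + rank)` over the lists that contain it. It also shows a request body in the native Elasticsearch `retriever` syntax. It is an assumption that this syntax corresponds to the `rank_rrf` entry above. Field names, vectors, and the helper `rrf_scores` are hypothetical.

```python
# Hypothetical hybrid request: one keyword retriever and one vector retriever.
rrf_body = {
    "retriever": {
        "rrf": {
            "retrievers": [
                {"standard": {"query": {"match": {"content": "index settings"}}}},
                {"knn": {"field": "embedding", "query_vector": [0.1, 0.2, 0.3],
                         "k": 10, "num_candidates": 50}},
            ],
            "rank_constant": 60,
            "rank_window_size": 50,
        }
    }
}


def rrf_scores(rankings, rank_constant=60):
    """Sum 1 / (rank_constant + rank) for each document across all rankings."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    return scores


keyword_ranking = ["a", "b", "c"]
vector_ranking = ["b", "c", "d"]
scores = rrf_scores([keyword_ranking, vector_ranking])
fused = sorted(scores, key=scores.get, reverse=True)
print(fused)
```

Documents that appear in both lists, such as `b` and `c` here, rise above documents that rank first in only one list. This is why RRF works without normalizing keyword and vector scores against each other.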

Aggregation query type whitelist

The following aggregation query types are supported by retrieval-augmented generation applications (Version 8.17).

Bucket aggregations

Classification	Query type	Description
Basic types	terms	Groups documents by field value, such as by status or region.
	multi_terms	Groups documents by a combination of multiple fields (composite primary key).
	histogram	Groups documents by numerical interval.
	date_histogram	Groups documents by time interval, such as per hour or per day.
	auto_date_histogram	Automatically selects an appropriate time interval.
	variable_width_histogram	A variable-width histogram used to handle uneven data distribution or many extreme values.
	range	Groups documents by custom numeric range.
	date_range	Groups documents by custom date range.
	ip_range	Groups documents by IP address range.
	ip_prefix	Groups documents by IP prefix.
Geospatial types	geo_distance	Groups documents by distance from a point, such as `0-1km` or `1-5km`.
	geohash_grid	Groups documents by Geohash grid (latitude and longitude grid).
	geotile_grid	Groups documents by map tile grid (XYZ map tiles).
	geohex_grid	Groups documents by hexagonal grid.
Filtering and sampling types	filter	Creates a bucket that contains only the documents that match the query.
	filters	Creates multiple buckets, each corresponding to a filter condition.
	missing	Groups all documents with missing fields into one bucket.
	sampler	Limits subsequent aggregations to a sample of the top-scoring documents.
	diversified_sampler	Samples the top-scoring documents and limits the number of sampled documents that share the same value of a specific field.
	random_sampler	Randomly samples documents with a specified probability for subsequent aggregation.
Nested types	nested	Performs aggregation within a `nested` object.
	reverse_nested	Returns from a `nested` document to the root document context.
	children	Aggregates child documents in a parent-child relationship.
	parent	Aggregates parent documents from child documents in a parent-child relationship.
Advanced analysis types	significant_terms	Finds terms with high significance.
	significant_text	Performs significance analysis on a text field.
	rare_terms	Finds low-frequency values.
	frequent_item_sets	Finds combinations of fields that frequently appear together.
	categorize_text	Automatically categorizes unstructured text.
Other types	adjacency_matrix	Builds a cross-matrix of Boolean conditions for debugging filters.
	aggs/aggregations	Syntax keywords for nested aggregations.
	composite	Buckets documents by a combination of dimensions. It supports deep paging and is used to export full data.
	global	Creates a global bucket that ignores all filters.
	time_series	Performs efficient aggregation on time series data.

Metrics aggregations

Classification	Query type	Description
Basic statistics types	avg	Calculates the average value.
	sum	Calculates the sum.
	min	Calculates the minimum value.
	max	Calculates the maximum value.
	value_count	Counts the number of field values, including duplicates.
	cardinality	Counts the number of unique values.
Advanced statistics types	stats	Returns `count`, `min`, `max`, `avg`, and `sum` in a single response.
	extended_stats	Adds `sum_of_squares`, `variance`, and `std_deviation` to the data returned by `stats`.
	percentiles	Calculates percentiles.
	percentile_ranks	Calculates the percentile rank that corresponds to a given value.
	median_absolute_deviation	Calculates the median absolute deviation, which measures data dispersion.
Geospatial types	geo_bounds	Calculates the bounding box of geographic points.
	geo_centroid	Calculates the centroid of geographic points.
	geo_line	Connects geographic points to form a line.
	cartesian_bounds	Calculates the bounding box of Cartesian coordinates.
	cartesian_centroid	Calculates the centroid of Cartesian coordinates.
Data structure types	top_hits	Returns the top matching documents within a bucket.
	top_metrics	Returns the values of other fields that correspond to the optimal value of a certain metric.
	boxplot	Returns the five values required for a box plot chart: `min`, `q1`, `median`, `q3`, and `max`.
Other types	matrix_stats	Calculates the statistical relationships between multiple fields, such as the mean, covariance, and correlation coefficient matrix.
	rate	Calculates the rate of documents or field values per time unit in `date_histogram` buckets.
	string_stats	Calculates statistics for a text field, such as length and character types.
	t_test	Performs a `t`-test to determine if there is a significant difference between the means of two sets of data.
	weighted_avg	Calculates the weighted average value.
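
The following Python sketch combines bucket and metrics aggregations from the tables above in one request body, and reproduces the five values returned by `stats` for a small list. The field names and the helper `stats` are hypothetical, and the local function only mirrors the arithmetic, not the Elasticsearch response format.

```python
import json

# Hypothetical request: daily buckets with an average per bucket, plus a terms bucket.
aggs_body = {
    "aggs": {
        "docs_per_day": {
            "date_histogram": {"field": "published_at", "calendar_interval": "day"},
            "aggs": {"avg_pages": {"avg": {"field": "page_count"}}},
        },
        "by_lang": {"terms": {"field": "lang", "size": 5}},
    }
}


def stats(values):
    """Return count, min, max, avg, and sum for a non-empty list of numbers."""
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


print(json.dumps(aggs_body, indent=2))
print(stats([1, 2, 3, 4]))
```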
