Appendix 2: Whitelists


This topic describes the whitelists for cluster configurations, index configurations, processors, and queries supported by retrieval-augmented generation applications (Version 8.17) in Alibaba Cloud Elasticsearch Serverless.

Usage notes

To ensure service stability, Alibaba Cloud Elasticsearch Serverless (ES Serverless) imposes certain limits on application resources and usage. The whitelists in this topic describe the available features and configurations. However, some whitelisted configurations and features have parameter range constraints. If you exceed these ranges, you must adjust service quotas or submit a ticket to request access.

Cluster configuration whitelist

Cluster configurations define and control the behavior of an entire Elasticsearch cluster to ensure its stability, performance, security, and scalability. The following table lists the cluster configurations that you can modify for retrieval-augmented generation applications (Version 8.17).

Note

Alibaba Cloud Elasticsearch Serverless manages resources as applications and is not based on the concept of clusters. However, you can use cluster-level configurations from native Elasticsearch to adjust the application parameters for Elasticsearch Serverless. The term 'cluster configuration' is used here for consistency with native Elasticsearch. For more information about cluster-level configurations, see the GET /_cluster/settings API and its include_defaults parameter.

Configuration item

Description

action.destructive_requires_name

Controls whether you must specify an exact index name when you perform high-risk operations. This prevents the accidental deletion of multiple indexes or large amounts of data. Valid values:

Note

High-risk operations are those that can cause data loss or make a cluster unavailable, such as deleting an index, shutting down a cluster, and purging a cache.

  • true: You must specify an index name. The system rejects requests that use a wildcard character (*) or _all to match multiple indexes, such as DELETE /my_*.

  • false (default): You do not need to specify an index name. You can use wildcard characters or _all to match multiple indexes.

apack.tenant.index_settings.throw_unsupported_setting

Controls whether an error is reported if an unsupported configuration parameter is used when you create or update an index. Valid values:

  • true: An error is reported, and the operation to create or update the index fails.

  • false (default): No error is reported. The system ignores the unsupported configuration parameter and continues to create or update the index.

action.auto_create_index

Controls whether to automatically create indexes. Valid values:

  • true (default): Automatic creation is supported. When data is written to an index that does not exist, the system automatically creates the index.

  • false: Automatic creation is not supported. When data is written to an index that does not exist, the system reports an error. You must manually create the index before you can write data to it.
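
The three settings above can be sketched as a request body. The following is a minimal illustration, assuming that the application accepts the standard Elasticsearch `PUT /_cluster/settings` body with a `persistent` section. The helper name `build_cluster_settings_body` is hypothetical and is not part of any SDK.

```python
import json

# The three cluster-level settings named in the table above.
ALLOWED_CLUSTER_SETTINGS = {
    "action.destructive_requires_name",
    "apack.tenant.index_settings.throw_unsupported_setting",
    "action.auto_create_index",
}


def build_cluster_settings_body(settings):
    """Return a body for PUT /_cluster/settings (hypothetical helper).

    Keys that are not in the whitelist are rejected locally.
    """
    unknown = sorted(set(settings) - ALLOWED_CLUSTER_SETTINGS)
    if unknown:
        raise ValueError(f"settings not in the whitelist: {unknown}")
    # Assumption: "persistent" is used as in native Elasticsearch.
    return {"persistent": dict(settings)}


body = build_cluster_settings_body({
    "action.destructive_requires_name": True,
    "action.auto_create_index": False,
})
print(json.dumps(body, indent=2))
```

Checking keys before the request is sent is only a convenience. The service remains the authority on which settings it accepts.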

Index configuration whitelist

Background information

In Elasticsearch, you can group documents with similar structures into the same index to quickly search and query specific content. The mapping and settings of an index are the core definitions of its behavior and structure. Together, they determine how data is stored, queried, and managed.

  • Mapping: Defines the structure of documents in an index, including the data types of fields (such as text, integer, and date) and their behavior (such as whether to tokenize, store original values, or support sorting and aggregation).

  • Settings: Define the global behavior and related constraints at the index level. They control the storage, performance, and availability of the index, along with advanced data-related rules. These configurations directly affect how data is distributed, how searches are executed, and how resources are allocated.
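
To make the distinction concrete, the following sketch assembles an index creation body with both parts. The field names (`title`, `sku`, `stock`, `created_at`) and the helper `build_index_body` are hypothetical. The body layout follows the native Elasticsearch create index API, which is assumed to apply here.

```python
import json


def build_index_body(properties, settings):
    """Combine field mappings and index settings into one creation body."""
    return {
        "settings": settings,                    # index-level behavior
        "mappings": {"properties": properties},  # document structure
    }


index_body = build_index_body(
    properties={
        "title": {"type": "text"},       # tokenized for full-text search
        "sku": {"type": "keyword"},      # exact match, sorting, aggregation
        "stock": {"type": "integer"},
        "created_at": {"type": "date"},
    },
    settings={"index": {"number_of_shards": 1, "refresh_interval": "1s"}},
)
print(json.dumps(index_body, indent=2))
```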

This topic describes the whitelists for field data types and index configurations that are supported by retrieval-augmented generation applications (Version 8.17).

Note

For more information about how to create and configure an index, see the Update index settings API.

Supported field data types

Field data types determine how data is stored and queried in Elasticsearch. The following table lists the field data types supported by retrieval-augmented generation applications (Version 8.17).

Category

Field data type

Description

Numeric types

byte

An 8-bit signed integer that ranges from -128 to 127. Use this type to store integers in a small range.

short

A 16-bit signed integer that ranges from -32,768 to 32,767. Use this type to store integers in a medium range.

integer

A 32-bit signed integer that ranges from -2<sup>31</sup> to 2<sup>31</sup>-1. Use this type to store general-purpose integers, such as inventory levels or order numbers.

long

A 64-bit signed integer that ranges from -2<sup>63</sup> to 2<sup>63</sup>-1. Use this type to store large integer values, such as UNIX timestamps or IP addresses.

half_float

A 16-bit floating-point number with low precision. Use this type to store small floating-point values and save storage space.

float

A 32-bit single-precision floating-point number. Use this type to store floating-point values with higher precision, such as temperatures or prices.

double

A 64-bit double-precision floating-point number. Use this type to store high-precision data, such as for scientific or financial computing.

scaled_float

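A floating-point number that is stored as a long integer after it is multiplied by a fixed scaling_factor. Use this type to save storage space for values with a fixed number of decimal places.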

unsigned_long

A 64-bit unsigned integer that ranges from 0 to 2<sup>64</sup>-1. Use this type to store non-negative integers, such as very large IDs.

Range types

integer_range

Stores a range of integers, including a start value and an end value. Use this type for integer interval queries, such as for an age range.

float_range

Stores a range of floating-point numbers, including a start value and an end value. Use this type for floating-point interval queries, such as for a temperature range or price range that includes decimals.

long_range

Stores a range of long integers, including a start value and an end value. Use this type for large integer interval queries, such as querying for records with an order ID in a specified range.

double_range

Stores a range of double-precision floating-point numbers, including a start value and an end value. Use this type for high-precision numerical interval queries, such as for a price fluctuation range in financial data like 199.99 ~ 999.99.

ip_range

Stores a range of IP addresses. Use this type for IP address segment matching, such as blocking IP addresses in a specified range.

date_range

Stores a date range, including a start time and an end time. Use this type for time interval queries, such as for an event time or order validity period.

Sub-data types

Note

Used to handle nested or complex data structures.

object

Stores a JSON object that contains unstructured nested data, such as user configuration information.

nested

A queryable nested object that supports independent indexing. Use this type to store multiple independent sub-objects, such as product reviews or order details.

Text and keyword types

text

Used for full-text search and supports tokenization. This type is suitable for unstructured content, such as product descriptions or article content.

keyword

Used for exact matching and is not tokenized. This type is suitable for structured content, such as IDs or email addresses.

constant_keyword

Stores a static field that has the same value in every document in an index. For example, a version number or an environment identifier like production.

match_only_text

Suitable for full-text search scenarios that do not require relevance scoring, such as log analysis. This type saves storage space, but queries that depend on term positions, such as phrase queries, run more slowly.

Date and time types

date

A date and time format that is accurate to the millisecond. Use this type to store time series data, such as log data, or event dates.

date_nanos

A date and time format that is accurate to the nanosecond. Use this type to store high-precision timestamps, such as for financial transactions.

Binary type

binary

Directly stores raw binary data, such as a binary log.

Boolean type

boolean

Stores a Boolean value (true or false). Use this type for conditional filtering or to check the status of a switch.

Geolocation types

geo_point

Stores geographic coordinates (latitude and longitude of a single point). Use this type for location point queries, such as for store coordinates.

geo_shape

Stores complex geometric shapes, such as polygons. Use this type for area matching, such as for administrative region boundary queries.

point

Stores a point with arbitrary x and y coordinates in a two-dimensional Cartesian coordinate system, rather than latitude and longitude. Use this type for planar location queries, such as positions on a floor plan.

IP and network type

ip

Used for storing and querying IPv4 or IPv6 addresses.

Vector types

dense_vector

Stores dense vectors. This type is often used in machine learning for things like image vectors or text embeddings.

sparse_vector

Stores sparse vectors by storing only non-zero values and their indexes. This type is often used for high-dimensional sparse data, such as text features.

Other types

flattened

Maps an entire JSON object, including its nested objects and arrays, to a single field whose leaf values are indexed as keywords. Use this type for unstructured data with many or unpredictable keys.

wildcard

Stores unstructured text in a form that is optimized for wildcard and regular expression pattern matching. For example, the pattern "?at" matches cat or hat.

shape

Stores arbitrary geometric shapes, such as polygons and lines, in a two-dimensional Cartesian coordinate system. Use this type to handle complex area ranges and spatial relationship queries on planar coordinates.

alias

Defines an alternate name for an existing field in the index. Search requests can use the alias in place of the path of the target field.

For example, if you define the alias field route_length_miles with the path distance, a query on route_length_miles matches values in the distance field. A field alias is different from an index alias, which points to one or more indexes.

version

Stores software version values that follow Semantic Versioning rules, such as 1.2.0 or 2.0.0-beta. Values are sorted and compared in version order instead of alphabetical order, so 1.10.0 sorts after 1.9.0.

search_as_you_type

Provides real-time search suggestions. As you type a search query character by character, the system immediately returns relevant, partially matched results.

semantic_text

Automatically converts text content into semantic embeddings through an inference endpoint and intelligently chunks long text for efficient processing of large-scale corpora. By encapsulating the complex vectorization process into a field-level feature, it significantly lowers the technical barrier to building intelligent semantic search applications.
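
The value ranges listed for the integer types in the table above can be turned into a small selection helper. This is an illustrative sketch only. The function `smallest_integer_type` is hypothetical, and it considers value ranges only, not query or storage behavior.

```python
# (type name, minimum, maximum), ordered from narrowest to widest.
INTEGER_TYPES = [
    ("byte", -2**7, 2**7 - 1),
    ("short", -2**15, 2**15 - 1),
    ("integer", -2**31, 2**31 - 1),
    ("long", -2**63, 2**63 - 1),
    ("unsigned_long", 0, 2**64 - 1),
]


def smallest_integer_type(low, high):
    """Return the narrowest integer field type that holds [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    for name, type_min, type_max in INTEGER_TYPES:
        if type_min <= low and high <= type_max:
            return name
    raise ValueError("no single integer field type covers this range")


print(smallest_integer_type(0, 127))
print(smallest_integer_type(0, 40_000))
print(smallest_integer_type(0, 2**63))
```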

Supported index configurations

The supported index configurations for retrieval-augmented generation applications (Version 8.17) are divided into mapping limit configurations and general index configurations. You can configure them as needed.

Mapping limit configurations

Index mapping limit configurations are used to restrict the number of field mappings and prevent an excessive number of mapped fields from degrading index performance. The following table lists the mapping limit configurations supported by retrieval-augmented generation applications (Version 8.17).

Configuration item

Description

index.mapping.field_name_length.limit

Sets the maximum length of a field name to prevent long field names from consuming too much memory.

index.mapping.depth.limit

Sets the maximum depth of a nested field to limit the complexity of nested objects and prevent memory overflow. The default value is 20.

index.mapping.nested_fields.limit

Sets the maximum number of nested fields to prevent an excessive number of nested mappings from degrading query performance. The default value is 50.

index.mapping.nested_objects.limit

Sets the maximum number of nested objects in a single document to prevent documents from containing too many nested objects, which can lead to insufficient memory. The default value is 10,000.

index.mapping.total_fields.limit

Sets the maximum number of fields in an index to prevent an excessive number of fields from causing index bloat and affecting query performance. The default value is 1,000.

index.mapping.coerce

Specifies whether to coerce the data type of an input value, such as converting the string "123" to the numeric value 123. Valid values:

  • true: Coercion is enabled. The input value of the current field is converted to the type defined for that field.

  • false (default): Coercion is disabled. The input value is stored in its original type. A type mismatch may cause an error.

index.mapping.ignore_malformed

Specifies whether to ignore data that does not conform to the field type. Valid values:

  • true: Malformed data is ignored. The field with the incorrect type is not indexed, but the other fields in the document are indexed as usual.

  • false (default): Malformed data is not ignored. If a field contains data with an incorrect type, an error is reported and the entire document is rejected.
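
As a rough illustration of how the field count and depth limits relate to a mapping, the following sketch walks a `properties` dictionary and compares the result with the default values of 1,000 and 20 from the table above. The helpers `mapping_stats` and `check_limits` are hypothetical, and the counting is simplified: multi-fields and field aliases are not counted here, although Elasticsearch may count them.

```python
def mapping_stats(properties, depth=1):
    """Return (total_fields, max_depth) for a mapping 'properties' dict."""
    total = 0
    deepest = depth
    for field in properties.values():
        total += 1  # object mappings are counted as fields too
        children = field.get("properties")
        if children:
            child_total, child_depth = mapping_stats(children, depth + 1)
            total += child_total
            deepest = max(deepest, child_depth)
    return total, deepest


def check_limits(properties, total_limit=1000, depth_limit=20):
    """Return a list of human-readable limit violations (empty if none)."""
    total, depth = mapping_stats(properties)
    problems = []
    if total > total_limit:
        problems.append(f"{total} fields exceed total_fields.limit={total_limit}")
    if depth > depth_limit:
        problems.append(f"depth {depth} exceeds depth.limit={depth_limit}")
    return problems


example = {
    "age": {"type": "short"},
    "user": {"properties": {
        "name": {"type": "text"},
        "address": {"properties": {"city": {"type": "keyword"}}},
    }},
}
print(mapping_stats(example))
print(check_limits(example))
```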

General index configurations

General index configurations define the underlying behavior and resource allocation of an index, such as the number of shards and tokenization rules. Proper index configuration helps improve query and write performance, increase resource utilization, and efficiently manage data. The following table lists the general index configurations supported by retrieval-augmented generation applications (Version 8.17).

Category

Configuration item

Description

Core index configurations

index.number_of_shards

Sets the number of primary shards for an index. This controls data distribution and parallel processing capabilities. The default value is 1.

index.codec

Sets the compression format used to store data. Valid values:

  • default (default): Medium compression. Suitable for scenarios that require fast retrieval and have sufficient storage space.

  • best_compression: Very high compression. Suitable for scenarios where indexes are read infrequently or storage is limited.

index.refresh_interval

Sets how often a refresh runs. A refresh makes recently written data visible to searches. A longer interval improves write performance, and a shorter interval makes new data searchable sooner. The default value is 1s.

Analyzers and pipelines

index.analysis.*

Customizes analyzers, tokenizers, and filters to control the tokenization rules for text fields.

index.default_pipeline

The default data pipeline used when inserting documents. It is used for system pre-processing of data, such as format conversion or field calculation.

index.final_pipeline

The pipeline that always runs last, after the request pipeline or default pipeline and before the document is indexed. It is used to ensure that data complies with specifications, such as by filtering encrypted or sensitive fields.

Query and sorting

index.query.default_field

Defines which fields are matched by default when a query string does not specify a field. For example, it can automatically match the query statement query: "text" to the content field.

index.sort.*

Defines the sorting rules for documents within a shard to accelerate queries, such as sorting by time.

Index blocking

index.blocks.*

Restricts the types of operations that can be performed on an index, such as write, delete, and metadata access. This is used to temporarily lock an index for maintenance or debugging.

Index performance

index.max_adjacency_matrix_filters

Sets the complexity threshold for adjacency matrix queries to prevent memory overflow or performance degradation due to too many or complex conditions. The default value is 1,000.

index.max_docvalue_fields_search

The maximum number of docvalue_fields allowed in a single search request. This is used to limit the resource consumption of aggregation queries. The default value is 100.

Note

docvalue_fields is a search parameter used to return the original values of specified fields in the search results.

index.max_inner_result_window

Sets the maximum result window size for inner_hits in a nested query or aggregation to limit the memory usage of complex queries. The default value is 100.

Note

inner_hits is used to retrieve specific information about nested documents in the context of a nested query.

index.max_ngram_diff

Sets the maximum difference in length between n-grams in an Ngram tokenizer to prevent index bloat or performance degradation due to improper tokenizer configuration.

index.max_refresh_listeners

Sets the maximum number of refresh listeners for a single index to prevent an excessive number of listeners from affecting refresh performance.

index.max_regex_length

Sets the maximum length of the regular expression that can be used in a regexp query to prevent long expressions from affecting query speed. The default value is 1,000.

index.max_rescore_window

Sets the maximum document window size for a rescore operation, which reorders results using a more complex scoring model, to limit the resource consumption of the process. The default value is 10,000.

index.max_result_window

Sets the maximum result window size for paged queries to prevent high-overhead paging, such as from: 100000, size: 10. The default value is 10,000.

index.max_script_fields

Sets the maximum number of script fields in a single search request to limit the resource consumption of script calculations. The default value is 32.

index.max_shingle_diff

Controls the maximum difference in length between shingles in a Shingle tokenizer to prevent the tokenizer from generating too many combined words. The default value is 3.

index.max_terms_count

Sets the maximum number of terms that can be used in a single terms query to prevent an overly large term list from causing insufficient memory and affecting performance. The default value is 65,536.

index.max_prefix_length

Sets the maximum length of a prefix query to prevent excessively long prefixes from affecting query performance. The default value is 0, which means there is no limit.

index.max_wildcard_length

Sets the maximum length of a wildcard query to limit its complexity and avoid full table scans. The default value is 10.
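
The effect of index.max_result_window on paged queries can be checked with simple arithmetic. The sketch below assumes the native Elasticsearch rule that `from + size` must not exceed the window, and uses the default value of 10,000 from the table above. The function name `fits_result_window` is hypothetical.

```python
DEFAULT_MAX_RESULT_WINDOW = 10_000  # default stated in the table above


def fits_result_window(from_, size, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Return True if a from/size page stays within index.max_result_window."""
    if from_ < 0 or size < 0:
        raise ValueError("from and size must be non-negative")
    return from_ + size <= max_result_window


# The last pair is the high-overhead example mentioned in the table.
for from_, size in [(0, 10), (9_990, 10), (100_000, 10)]:
    print(from_, size, fits_result_window(from_, size))
```

For pages beyond the window, the search_after parameter in the query whitelist is the usual alternative to raising the limit.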

Processor whitelist

Before a document is indexed in Elasticsearch, you can use processors in an ingest pipeline to transform, clean, enrich, or filter its content. The following table lists the ingest pipeline processors that are whitelisted for retrieval-augmented generation applications (Version 8.17).

Processor type

Processor

Description

Basic data operations

append

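Appends one or more values to a field. If the field is an array, the values are added to it. If the field holds a single value, it is converted to an array first. If the field does not exist, it is created as an array.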

set

Sets the value of a field. If the field does not exist, it is created. If the field exists, its value is overwritten.

remove

Deletes one or more fields.

rename

Renames a field.

drop

Discards the entire document. The document is not indexed.

fail

Raises an exception that stops the pipeline and returns a custom error message.

pipeline

Calls another Ingest Pipeline.

terminate

Immediately terminates the pipeline execution. Subsequent processors do not run.

Data type conversion and formatting

convert

Converts the data type of a field.

bytes

Converts a byte size string, such as 1kb, to the number of bytes (an integer).

date

Parses a time string and sets it as the value of @timestamp or another field.

sort

Sorts the array values in a field.

String and text processing

trim

Removes leading and trailing whitespace from a string.

split

Splits a string into an array based on a separator.

join

Merges an array into a string.

uppercase

Converts a string to uppercase.

lowercase

Converts a string to lowercase.

gsub

Replaces content in a string using a regular expression.

urldecode

Decodes a URL-encoded string.

html_strip

Removes HTML tags.

Structured parsing

dissect

Extracts fields using simple pattern matching.

grok

Parses unstructured text using regular expression patterns.

kv

Extracts fields from a key-value pair string.

json

Parses a JSON string into an object field.

csv

Parses a CSV string into multiple fields.

Geo and network information processing

geo_grid

Converts latitude and longitude to a geographic grid. This is often used for aggregations.

ip_location

Resolves an IP address to a geographic location. This requires integration with GeoIP.

network_direction

Determines the direction of network traffic.

user_agent

Parses a User-Agent string to extract information such as the browser and operating system.

uri_parts

Parses a URI to extract information such as the protocol, host, and path.

Data security

redact

Masks sensitive data in a field, such as hiding the middle digits of a phone number.

fingerprint

Generates a unique hash value for the content of a field. This is used to remove duplicates.

Advanced processors

attachment

Extracts text from file attachments, such as PDF or Word files.

circle

Converts a circular area (center point and radius) into a polygon that approximates it for geographic queries.

community_id

Generates a community ID for a network flow. This is used for network traffic analysis.

dot_expander

Expands a field name with dot notation into a nested object.

foreach

Executes a set of processors for each element in an array.

Script and AI processing

script

Uses the Painless scripting language to write complex logic, such as conditional statements, mathematical operations, and dynamic field generation.

Note

To use custom scripts, you must submit a ticket to add them to the whitelist.

inference

Calls a deployed machine learning model, such as an NLP model, for inference.
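
A pipeline is an ordered list of processors. The sketch below builds a small pipeline definition from processors in the table above and checks each step against a local copy of part of the whitelist. The field names, the pipeline description, and the helpers `processor_names` and `unlisted_processors` are hypothetical. The body layout follows the native Elasticsearch `PUT /_ingest/pipeline/<id>` API, which is assumed to apply here.

```python
import json

# A subset of the processor names listed in the table above.
WHITELISTED_PROCESSORS = {
    "set", "remove", "rename", "convert", "date", "trim", "split",
    "join", "uppercase", "lowercase", "gsub", "json", "csv",
}


def processor_names(pipeline):
    """Return the processor type of each step, in execution order."""
    return [next(iter(step)) for step in pipeline["processors"]]


def unlisted_processors(pipeline):
    """Return processor types that are missing from the local subset."""
    return [n for n in processor_names(pipeline) if n not in WHITELISTED_PROCESSORS]


pipeline = {
    "description": "Normalize product documents before indexing",
    "processors": [
        {"trim": {"field": "title"}},
        {"lowercase": {"field": "sku"}},
        {"split": {"field": "tags", "separator": ","}},
        {"remove": {"field": "debug_info", "ignore_missing": True}},
    ],
}
print(json.dumps(pipeline, indent=2))
print(unlisted_processors(pipeline))
```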

Query whitelist

Query section whitelist

The following table lists the parameters supported by the outermost JSON object of a search request body in retrieval-augmented generation applications (Version 8.17).

Parameter

Description

retriever

Defines a retriever that returns the top-ranked documents for a search. It supports hybrid retrieval (keyword and vector) and simplifies the semantic search process, for example in a retrieval-augmented generation (RAG) system.

terminate_after

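Limits the maximum number of documents that are collected on each shard. The query on a shard terminates early when this number is reached.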

min_score

Filters results based on a minimum score.

_source

Controls whether the _source field is returned and which of its fields are included.

stored_fields

Returns stored fields.

query

The main query condition.

post_filter

Filters results after the query. This does not affect aggregations.

knn

Performs an approximate nearest neighbor (ANN) search for vectors.

script_fields

Returns calculated fields from a script.

Note

To use custom scripts, you must submit a ticket to add them to the whitelist.

indices_boost

Assigns weights to different indexes.

aggs / aggregations

Performs aggregation and analysis.

highlight

Highlights matching content.

rescore

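Re-ranks the top results of the query by using a secondary query, which is usually more expensive.

slice

Splits a scroll or point in time search into multiple slices that can be consumed independently and in parallel.

collapse

Collapses search results based on the values of a field and returns only the top-ranked document for each value.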

pit

Performs a point in time search.

docvalue_fields

Returns doc_values fields.

fields

Returns specified fields.

search_after

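Retrieves the next page of results by using the sort values of the last hit on the previous page. This is used for deep paging.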

runtime_mappings

Defines runtime fields to dynamically calculate field values at query time without re-indexing. This is often used for log parsing, field transformation, and sensitive data masking.

Note

Some type values are not yet supported. To use them, submit a ticket to add them to the whitelist.
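
Several of the parameters above are often combined in one request body. The following sketch shows one possible combination, assuming the native Elasticsearch search body layout. The index fields (`title`, `brand`, `price`) and the helper `build_search_body` are hypothetical.

```python
import json


def build_search_body(text, brand, min_score=0.5):
    """Assemble a search body from parameters in the table above."""
    return {
        "query": {"match": {"title": text}},           # main query condition
        "post_filter": {"term": {"brand": brand}},     # applied after aggregations
        "aggs": {"brands": {"terms": {"field": "brand"}}},
        "_source": ["title", "brand", "price"],        # return only these fields
        "min_score": min_score,                        # drop low-scoring hits
        "terminate_after": 1000,                       # per-shard collection cap
        "highlight": {"fields": {"title": {}}},
    }


search_body = build_search_body("wireless headphones", "acme")
print(json.dumps(search_body, indent=2))
```

Because post_filter does not affect aggregations, the `brands` buckets in this sketch would still cover every brand that matches the query, while the returned hits are limited to one brand.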

Query type whitelist

The following query types are supported by retrieval-augmented generation applications (Version 8.17).

  • Compound queries

    Query type

    Description

    bool

    Combines multiple query clauses. It supports Boolean logic such as must and should.

    boosting

    Lowers the score of documents that meet a certain condition, but still returns them.

    constant_score

    Gives all matching documents the same score. This is often used in filtering scenarios.

    dis_max

    Takes the highest score from multiple queries as the final score to avoid excessively high score accumulation.

    function_score

    Provides full control over the scoring of documents. It can be combined with scripts, randomness, and other factors for scoring.

  • Full-text queries

    Query type

    Description

    intervals

    Provides precise control over the position and order of words for complex text pattern matching.

    match

    A standard full-text search that tokenizes the input text and then performs a match.

    match_bool_prefix

    Matches a prefix. This is used in auto-completion scenarios.

    match_phrase

    Performs a phrase match, which requires words to be in the same order and adjacent.

    match_phrase_prefix

    Performs a phrase match and supports prefix matching for the last word.

    combined_fields

    Merges multiple fields into a virtual field for searching to improve cross-field relevance.

    multi_match

    Executes a match query on multiple fields.

    query_string

    Supports complex query syntax, such as AND, OR, NOT, +, and -.

    simple_query_string

    A simplified version of query_string with more user-friendly syntax and better fault tolerance.

    common

    Optimizes searches on long text by automatically distinguishing between high-frequency and low-frequency keywords.

  • Geo queries

    Query type

    Description

    geo_distance

    Filters geographic data that falls within a circular area defined by a center point (latitude and longitude) and a distance.

  • Shape queries

    Query type

    Description

    shape

    Determines whether a geographic shape in a document, such as a polygon or line, has a certain spatial relationship with a specified query shape, such as intersects or contains.

  • Joining queries

    Query type

    Description

    nested

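    Queries fields of the nested type. Each nested object is matched as an independent document, so all conditions must be met within the same nested object.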

  • Match all

    Query type

    Description

    match_all

    Matches all documents in an index.

  • Span queries

    Query type

    Description

    span_containing

    Returns the spans of an outer (big) span query that enclose a match of an inner (little) span query.

    span_first

    Matches a span query and requires it to appear within the first N words of a field.

    span_multi

    Wraps a multi-term query, such as wildcard, fuzzy, regexp, or prefix, so that it can be used as a span query.

    span_near/span_gap

    Matches multiple words and requires them to appear within a certain distance of each other, such as allowing three words in between. The order can be controlled.

    span_not

    Excludes the results of one span query from the results of another span query.

    span_or

    Matches if any of multiple span queries match.

    span_term

    Matches an exact word. It is similar to a term query but is used in a span query chain.

    span_within

    Returns the spans of an inner (little) span query that are completely enclosed within a match of an outer (big) span query.

  • Vector queries

    Query type

    Description

    knn

    The primary way that Elasticsearch implements dense vector search. It finds the k most similar documents by calculating the similarity between vectors, such as cosine similarity.

    sparse_vector

    Searches a sparse_vector field by using weighted tokens. The tokens are either provided in the request or generated from the query text by an inference model.

    text_expansion

    Expands a query into a sparse vector to improve recall.

  • Specialized queries

    Query type

    Description

    distance_feature

    Boosts the score of documents that meet a condition based on a time or geographic distance.

    script

    Allows you to use the Painless scripting language to write custom Boolean logic to determine whether a document matches.

    Note

    To use custom scripts, you must submit a ticket to add them to the whitelist.

    script_score

    Uses a script to fully customize the _score of each matching document for personalized sorting. A script_score function with the same purpose is also available in function_score.

    Note

    To use custom scripts, you must submit a ticket to add them to the whitelist.

    pinned

    Allows you to manually specify that certain document IDs must appear in the search results and control their positions.

  • Term-level queries

    Query type

    Description

    exists

    Matches documents that contain the field and have a non-null value.

    fuzzy

    Matches a term with a similar spelling, based on the Levenshtein edit distance.

    ids

    Finds documents by their exact _id.

    prefix

    Matches a term that starts with a specified prefix.

    range

    Matches a range of numeric, date, or string values.

    regexp

    Matches a term using a regular expression.

    term

    Matches an exact term. The match must be exact and is case-sensitive.

    terms

    Matches if the field value is any of the values in a given list.

    wildcard

    Supports pattern matching with * (zero or more characters) and ? (any single character).

  • Hybrid retrieval (Retriever rrf)

    Query type

    Description

    rank_rrf

    Enables Reciprocal Rank Fusion (RRF) sorting in a search request.
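
The idea behind Reciprocal Rank Fusion can be shown with a few lines of arithmetic. In this sketch, each document receives 1 / (rank_constant + rank) from every result list in which it appears, and the sums decide the final order. The default rank_constant of 60 matches native Elasticsearch and is assumed to apply here. The function `rrf_fuse` and the document IDs are hypothetical, and the code runs locally without calling the service.

```python
def rrf_fuse(rankings, rank_constant=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Ranks start at 1. Ties are broken by document ID for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]   # for example, from a match query
vector_hits = ["c", "a", "d"]    # for example, from a knn query
print(rrf_fuse([keyword_hits, vector_hits]))
```

Documents that rank well in both lists, such as `a` and `c` here, end up ahead of documents that appear in only one list.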

Aggregation query type whitelist

The following aggregation query types are supported by retrieval-augmented generation applications (Version 8.17).

  • Bucket aggregations

    Classification

    Query type

    Description

    Basic types

    terms

    Groups documents by field value, such as by status or region.

    multi_terms

    Groups documents by a combination of multiple fields (composite primary key).

    histogram

    Groups documents by numerical interval.

    date_histogram

    Groups documents by time interval, such as per hour or per day.

    auto_date_histogram

    Automatically selects an appropriate time interval.

    variable_width_histogram

    A variable-width histogram used to handle uneven data distribution or many extreme values.

    range

    Groups documents by custom numeric range.

    date_range

    Groups documents by custom date range.

    ip_range

    Groups documents by IP address range.

    ip_prefix

    Groups documents by IP prefix.

    Geospatial types

    geo_distance

    Groups documents by distance from a point, such as 0-1km or 1-5km.

    geohash_grid

    Groups documents by Geohash grid (latitude and longitude grid).

    geotile_grid

    Groups documents by map tile grid, in which each cell is identified by a zoom level and x and y coordinates (XYZ tiles).

    geohex_grid

    Groups documents by hexagonal grid.

    Filtering and sampling types

    filter

    Creates a bucket that contains only the documents that match the query.

    filters

    Creates multiple buckets, each corresponding to a filter condition.

    missing

    Groups all documents with missing fields into one bucket.

    sampler

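    Limits subsequent aggregations to a sample of the top-scoring documents on each shard.

    diversified_sampler

    Samples top-scoring documents like sampler, but limits the number of sampled documents that share the same value of a specific field.

    random_sampler

    Randomly samples documents with a specified probability to speed up aggregations at the cost of some accuracy.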

    Nested types

    nested

    Performs aggregation within a nested object.

    reverse_nested

    Returns from a nested document to the root document context.

    children

    Aggregates child documents in a parent-child relationship.

    parent

    Aggregates parent documents from child documents in a parent-child relationship.

    Advanced analysis types

    significant_terms

    Finds terms that occur significantly more often in the matched documents than in the index as a whole.

    significant_text

    Performs significance analysis on a text field.

    rare_terms

    Finds low-frequency values.

    frequent_item_sets

    Finds combinations of fields that frequently appear together.

    categorize_text

    Automatically categorizes unstructured text.

    Other types

    adjacency_matrix

    Builds a cross-matrix of Boolean conditions for debugging filters.

    aggs/aggregations

    Syntax keywords for nested aggregations.

    composite

    Buckets documents by a combination of dimensions. It supports deep paging and is used to export full data.

    global

    Creates a global bucket that ignores all filters.

    time_series

    Performs efficient aggregation on time series data.

  • Metrics aggregations

    Classification

    Query type

    Description

    Basic statistics types

    avg

    Calculates the average value.

    sum

    Calculates the sum.

    min

    Calculates the minimum value.

    max

    Calculates the maximum value.

    value_count

    Counts the number of field values, including duplicates.

    cardinality

    Counts the number of unique values.

    Advanced statistics types

    stats

    Returns count, min, max, avg, and sum in a single response.

    extended_stats

    Adds sum_of_squares, variance, and std_deviation to the data returned by stats.

    percentiles

    Calculates percentiles.

    percentile_ranks

    Calculates the percentile rank of a given value.

    median_absolute_deviation

    Calculates the median absolute deviation, which measures data dispersion.

    Geospatial types

    geo_bounds

    Calculates the bounding box of geographic points.

    geo_centroid

    Calculates the centroid of geographic points.

    geo_line

    Connects geographic points to form a line.

    cartesian_bounds

    Calculates the bounding box of Cartesian coordinates.

    cartesian_centroid

    Calculates the centroid of Cartesian coordinates.

    Data structure types

    top_hits

    Returns the top matching documents within a bucket.

    top_metrics

    Returns the values of other fields that correspond to the optimal value of a certain metric.

    boxplot

    Returns the five values required for a box plot chart: min, q1, median, q3, and max.

    Other types

    matrix_stats

    Calculates the statistical relationships between multiple fields, such as the mean, covariance, and correlation coefficient matrix.

    rate

    Calculates the rate of documents or of a field value per time unit in each date_histogram bucket.

    string_stats

    Calculates statistics for a text field, such as length and character types.

    t_test

    Performs a t-test to determine if there is a significant difference between the means of two sets of data.

    weighted_avg

    Calculates the weighted average value.
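
The relationship between stats and extended_stats can be reproduced locally. The sketch below computes the same named values for a list of numbers, assuming that variance means population variance, as in native Elasticsearch. The function `extended_stats` here is a hypothetical local helper, not a client call, and it omits other fields that the real aggregation returns.

```python
import math


def extended_stats(values):
    """Compute stats plus sum_of_squares, variance, and std_deviation."""
    if not values:
        raise ValueError("values must not be empty")
    count = len(values)
    total = sum(values)
    avg = total / count
    sum_of_squares = sum(v * v for v in values)
    # Population variance; clamp tiny negative results from rounding.
    variance = max(sum_of_squares / count - avg * avg, 0.0)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": math.sqrt(variance),
    }


result = extended_stats([1, 2, 3, 4])
print(result)
```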