An index schema defines the structure of an index table: which fields to store, how to index them, and how to compress the data. Each index table has one schema, and the schema determines how OpenSearch Retrieval Engine Edition stores and retrieves your data.
Key concepts
| Concept | Description |
|---|---|
| Field | A named, typed attribute of a document. Fields are the building blocks of every index table. |
| Inverted index | Maps words to the documents that contain them (word → Doc1, Doc2, ..., DocN). Use it for keyword search and full-text retrieval. |
| Forward index | Maps document IDs to their field values (DocID → term1, term2, ..., termN). Use it for sorting, filtering, and statistics. |
| Summary index | Stores field values keyed by document ID for fast result display. Use it to render search results without re-fetching raw data. When compression is enabled in the schema, OpenSearch Retrieval Engine Edition uses zlib to compress the summary index and decompresses it when reading. |
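To make the three index kinds concrete, here is a minimal Python sketch. It is a toy in-memory model, not how the engine stores data; the documents, field names, and the `search` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy documents; field names are illustrative only.
docs = {
    1: {"title": "red apple", "price": 3},
    2: {"title": "green apple", "price": 5},
    3: {"title": "red grape", "price": 4},
}

# Inverted index: word -> list of doc IDs that contain the word.
inverted = defaultdict(list)
for doc_id in sorted(docs):
    for word in docs[doc_id]["title"].split():
        inverted[word].append(doc_id)

# Forward index: doc ID -> field value, used for filtering and sorting.
forward_price = {doc_id: d["price"] for doc_id, d in docs.items()}

# Summary index: doc ID -> stored fields used to render results.
summary = {doc_id: {"title": d["title"]} for doc_id, d in docs.items()}


def search(word, max_price):
    """Look up a word, filter and sort by price, then render titles."""
    hits = [d for d in inverted.get(word, []) if forward_price[d] <= max_price]
    hits.sort(key=lambda d: forward_price[d])
    return [summary[d]["title"] for d in hits]


print(search("apple", 5))
```

The lookup touches each structure once: the inverted index finds candidates, the forward index filters and sorts them, and the summary index supplies the display text.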
Forward index subtypes
| Subtype | Data quantity per field | Query performance | Data updatable |
|---|---|---|---|
| Single-value | Fixed (one value; STRING type values are variable-length) | High | Yes |
| Multi-value | Variable (multiple values) | Lower | No |
Supported field data types
The following types are supported for forward index fields:
| Type | Bit width | Signed |
|---|---|---|
| INT8 | 8-bit integer | Yes |
| UINT8 | 8-bit integer | No |
| INT16 | 16-bit integer | Yes |
| UINT16 | 16-bit integer | No |
| INTEGER | 32-bit integer | Yes |
| UINT32 | 32-bit integer | No |
| INT64 | 64-bit integer | Yes |
| UINT64 | 64-bit integer | No |
| FLOAT | 32-bit floating-point | — |
| DOUBLE | 64-bit floating-point | — |
| STRING | String | — |
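When choosing among the integer types, it can help to see the value range each one covers. The sketch below derives the ranges from the bit widths and signedness in the table, assuming the usual two's-complement representation for signed types. The helper names `value_range` and `smallest_type` are hypothetical.

```python
# Bit width and signedness copied from the table above.
INTEGER_TYPES = {
    "INT8": (8, True), "UINT8": (8, False),
    "INT16": (16, True), "UINT16": (16, False),
    "INTEGER": (32, True), "UINT32": (32, False),
    "INT64": (64, True), "UINT64": (64, False),
}


def value_range(field_type):
    """Return (min, max) for an integer type, assuming two's complement."""
    bits, signed = INTEGER_TYPES[field_type]
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def smallest_type(lo, hi):
    """Return the narrowest type that holds every value in [lo, hi], or None."""
    # Try narrower types first; at equal width, try the unsigned type first.
    ordered = sorted(INTEGER_TYPES, key=lambda name: INTEGER_TYPES[name])
    for name in ordered:
        type_lo, type_hi = value_range(name)
        if type_lo <= lo and hi <= type_hi:
            return name
    return None


print(value_range("INT8"))
print(smallest_type(0, 200))
```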
Schema structure
A schema is a JSON document with the following top-level keys:
```json
{
  "file_compress": [ // Compressor definitions referenced by other sections
    {
      "name": "file_compressor",
      "type": "zstd"
    },
    {
      "name": "no_compressor",
      "type": ""
    }
  ],
  "table_name": "test", // Index table name
  "summarys": { // Summary index configuration
    "summary_fields": [
      "id",
      "fb_boolean",
      "fb_datetime",
      "fb_string",
      "fb_decimal",
      "fb_bigint",
      "fb_text"
    ],
    "parameter": {
      "file_compressor": "zstd" // Compressor applied to the summary index
    }
  },
  "indexs": [ // Inverted index definitions
    {
      "index_name": "id",
      "index_type": "PRIMARYKEY64",
      "index_fields": "id",
      "has_primary_key_attribute": true,
      "is_primary_key_sorted": false
    },
    {
      "index_name": "fb_boolean",
      "index_type": "STRING",
      "index_fields": "fb_boolean",
      "file_compress": "file_compressor", // Apply compression to this index field
      "format_version_id": 1
    },
    {
      "index_name": "fb_datetime",
      "index_type": "STRING",
      "index_fields": "fb_datetime",
      "file_compress": "file_compressor",
      "format_version_id": 1
    },
    {
      "index_name": "fb_string",
      "index_type": "STRING",
      "index_fields": "fb_string"
    },
    {
      "index_name": "fb_text",
      "index_type": "TEXT",
      "index_fields": "fb_text"
    }
  ],
  "attributes": [ // Forward index (attribute field) definitions
    {
      "field_name": "id",
      "file_compress": "no_compressor"
    },
    {
      "field_name": "fb_boolean",
      "file_compress": "file_compressor"
    },
    {
      "field_name": "fb_datetime",
      "file_compress": "no_compressor"
    },
    {
      "field_name": "fb_string",
      "file_compress": "file_compressor"
    },
    {
      "field_name": "fb_decimal",
      "file_compress": "no_compressor"
    },
    {
      "field_name": "fb_bigint",
      "file_compress": "no_compressor"
    }
  ],
  "fields": [ // Field type definitions shared across all indexes
    {
      "user_defined_param": {},
      "field_name": "id",
      "field_type": "INT64",
      "compress_type": "equal"
    },
    {
      "field_name": "fb_boolean",
      "field_type": "STRING",
      "compress_type": "uniq"
    },
    {
      "field_name": "fb_datetime",
      "field_type": "STRING",
      "compress_type": "uniq"
    },
    {
      "user_defined_param": {
        "multi_value_sep": "," // Delimiter for multi-value fields
      },
      "field_name": "fb_string",
      "field_type": "STRING",
      "compress_type": "equal",
      "multi_value": true
    },
    {
      "field_name": "fb_decimal",
      "field_type": "DOUBLE"
    },
    {
      "field_name": "fb_bigint",
      "field_type": "INT64",
      "compress_type": "equal"
    },
    {
      "field_name": "fb_text",
      "field_type": "TEXT",
      "analyzer": "chn_standard" // Text analyzer for full-text indexing
    }
  ]
}
```
Schema parameter reference
`fields` parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| `field_name` | string | Field name | Required |
| `field_type` | string | Data type: INT8, UINT8, INT16, UINT16, INTEGER, UINT32, INT64, UINT64, FLOAT, DOUBLE, STRING, TEXT | Required |
| `compress_type` | string | Field data compression: `equal` for single-value fields, `uniq` for multi-value or STRING fields | Not compressed |
| `multi_value` | bool | Whether the field holds multiple values | false |
| `user_defined_param.multi_value_sep` | string | Delimiter for multi-value fields. Must be a single character; full-width characters are not supported | `^]` |
| `analyzer` | string | Text analyzer for TEXT fields (e.g., `chn_standard`). Required when `field_type` is TEXT | — |
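The delimiter rules in the `fields` parameters can be illustrated with a short Python sketch. It assumes that the default `^]` is caret notation for the single control character U+001D, and it treats a character as full-width when `unicodedata.east_asian_width` reports `F` or `W`; both are assumptions, and `split_multi_value` is a hypothetical helper rather than part of the product.

```python
import unicodedata

# Assumption: the documented default "^]" is caret notation for U+001D.
DEFAULT_SEP = "\x1d"


def split_multi_value(raw, field):
    """Split a raw field value according to a `fields` entry from the schema."""
    params = field.get("user_defined_param", {})
    sep = params.get("multi_value_sep", DEFAULT_SEP)
    if len(sep) != 1:
        raise ValueError("multi_value_sep must be a single character")
    # Assumption: "full-width" corresponds to East Asian width F or W.
    if unicodedata.east_asian_width(sep) in ("F", "W"):
        raise ValueError("full-width delimiters are not supported")
    if not field.get("multi_value", False):
        return [raw]  # Single-value field: keep the value whole.
    return raw.split(sep) if raw else []


# The fb_string field definition from the example schema.
fb_string = {
    "user_defined_param": {"multi_value_sep": ","},
    "field_name": "fb_string",
    "field_type": "STRING",
    "multi_value": True,
}
print(split_multi_value("a,b,c", fb_string))
```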
`indexs` parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| `index_name` | string | Index name | Required |
| `index_type` | string | Index type: PRIMARYKEY64, STRING, TEXT | Required |
| `index_fields` | string | The field this index is built on | Required |
| `has_primary_key_attribute` | bool | Whether the primary key has a corresponding attribute field | false |
| `is_primary_key_sorted` | bool | Whether the primary key index is sorted | false |
| `file_compress` | string | Compressor name (references `file_compress[].name`). Cannot be set on the primary key index | Not compressed |
| `format_version_id` | integer | Index format version | — |
`attributes` parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| `field_name` | string | The field this attribute is built on | Required |
| `file_compress` | string | Compressor name (references `file_compress[].name`) | Not compressed |
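Because the sections of a schema reference one another by name, a quick consistency check before publishing can catch typos. The Python sketch below is a hypothetical pre-flight check, not a tool shipped with the product. It applies the rules from the parameter tables (a TEXT field needs an analyzer, the primary key index cannot set `file_compress`, compressor names must exist in `file_compress[].name`) and additionally assumes that every index and attribute must point at an entry in `fields`. It expects plain JSON with the `//` comments removed.

```python
def validate_schema(schema):
    """Return a list of problems found in a parsed schema; empty if none."""
    problems = []
    compressors = {c["name"] for c in schema.get("file_compress", [])}
    fields = {f["field_name"]: f for f in schema.get("fields", [])}

    for field in fields.values():
        if field["field_type"] == "TEXT" and not field.get("analyzer"):
            problems.append(f"field {field['field_name']}: TEXT requires an analyzer")

    for index in schema.get("indexs", []):
        name = index["index_name"]
        if index["index_fields"] not in fields:
            problems.append(f"index {name}: unknown field {index['index_fields']}")
        compressor = index.get("file_compress")
        if compressor is not None:
            if index["index_type"] == "PRIMARYKEY64":
                problems.append(f"index {name}: primary key index cannot be compressed")
            elif compressor not in compressors:
                problems.append(f"index {name}: unknown compressor {compressor}")

    for attribute in schema.get("attributes", []):
        name = attribute["field_name"]
        if name not in fields:
            problems.append(f"attribute {name}: unknown field")
        compressor = attribute.get("file_compress")
        if compressor is not None and compressor not in compressors:
            problems.append(f"attribute {name}: unknown compressor {compressor}")

    return problems


# A deliberately broken, hypothetical schema used to exercise each rule.
broken = {
    "file_compress": [{"name": "file_compressor", "type": "zstd"}],
    "fields": [
        {"field_name": "id", "field_type": "INT64"},
        {"field_name": "body", "field_type": "TEXT"},
    ],
    "indexs": [
        {"index_name": "id", "index_type": "PRIMARYKEY64",
         "index_fields": "id", "file_compress": "file_compressor"},
        {"index_name": "body", "index_type": "TEXT",
         "index_fields": "body", "file_compress": "missing"},
    ],
    "attributes": [{"field_name": "ghost", "file_compress": "file_compressor"}],
}
for problem in validate_schema(broken):
    print(problem)
```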
For more information about configuring an index table, see Configure an index table.
Add an index table
Prerequisites
Before you begin, ensure that you have:
An active OpenSearch Retrieval Engine Edition instance
At least one data source configured for the instance
Steps
On the instance details page, choose Configuration Center > Index Schema in the left-side navigation pane, then click Create Index Table.
Configure the Index Table, Data Source, and Data Shards parameters.
Configure fields.
Multi-value field delimiters: The default delimiter is `^]`. Set a custom delimiter in `user_defined_param.multi_value_sep`. Delimiters must be a single character; full-width characters are not supported.
Attribute field compression: By default, attribute fields are not compressed. To compress an attribute field, select `file_compressor`.
Important: When compressing attribute fields, modify the index loading method to reduce the performance impact. On the instance details page, choose O&M Center > Deployment Management. Click the Searcher worker, then open the Searcher Worker Configurations panel and go to the Online Table Configurations tab.
Field data compression:
| Field type | Default compress_type |
|---|---|
| Single-value fields | `equal` |
| Multi-value fields or STRING | `uniq` |
| Not compressed | Leave blank |
Configure indexes.
Index field compression: By default, index fields are not compressed. To compress an index field, select `file_compressor`.
Note:
- The primary key index cannot be compressed.
- When compressing index fields, modify the index loading method. On the instance details page, choose O&M Center > Deployment Management. Click the Searcher worker, then open the Searcher Worker Configurations panel and go to the Online Table Configurations tab.
Click Save Version. In the dialog, enter an optional description and click Publish.
To view the updated topology, choose O&M Center > Deployment Management in the left-side navigation pane.
To apply the new index table to the cluster, choose O&M Center > O&M Management, click Update Configurations, and set Trigger Reindexing to Push Configurations and Trigger Reindexing.
To monitor reindexing progress, choose O&M Center > Change History and click the Data Source Changes tab.
After reindexing completes, the new index table is ready for queries.
Only one primary key field is allowed per index table.
At least one field must have Search Result Display selected.
TEXT fields require an analysis method. Multi-value TEXT fields are not supported.
Only one primary key index is allowed per index table.
If your cluster has 2 replicas, set Data Shards to 2. The number of Searcher workers must be greater than (replicas x data shards); otherwise, the index table cannot be used.
A single data shard can hold up to 600 million documents, with a combined maximum of 2.1 billion across all shards. The index size of a single shard cannot exceed 300 GB. For real-time updates, the update transactions per second (TPS) per shard cannot exceed 4,000. Using the `add` command, the TPS can reach 10,000.
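The per-shard limits above translate into a lower bound on the number of data shards. The following Python sketch does that arithmetic under one loud assumption: documents, index size, and updates are spread evenly across shards. The function name and its parameters are hypothetical, and the 4,000 TPS figure is the general update limit, not the higher `add` command figure.

```python
import math

# Limits quoted in the notes above.
MAX_DOCS_PER_SHARD = 600_000_000
MAX_DOCS_TOTAL = 2_100_000_000
MAX_SHARD_INDEX_GB = 300
MAX_UPDATE_TPS_PER_SHARD = 4_000


def min_data_shards(total_docs, total_index_gb, update_tps):
    """Smallest shard count that respects every per-shard limit.

    Assumes an even distribution of documents, index size, and updates.
    """
    if total_docs > MAX_DOCS_TOTAL:
        raise ValueError("document count exceeds the 2.1 billion total limit")
    by_docs = math.ceil(total_docs / MAX_DOCS_PER_SHARD)
    by_size = math.ceil(total_index_gb / MAX_SHARD_INDEX_GB)
    by_tps = math.ceil(update_tps / MAX_UPDATE_TPS_PER_SHARD)
    return max(1, by_docs, by_size, by_tps)


# 1.2 billion documents, 500 GB of index, 3,000 updates per second.
print(min_data_shards(1_200_000_000, 500, 3_000))
```

The result is only a floor. Skewed data or headroom for growth may call for more shards than the calculation suggests.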
Modify an index table
Index table versions
Each new index table starts with two versions:
| Version name | Status | Description |
|---|---|---|
| `index_config_v1` | In Use or Unused | The initial configuration. Status is In Use after you push the configuration and rebuild indexes; Unused otherwise. |
| `index_config_edit` | Modifying | The version currently being edited. |
Subsequent published versions are named incrementally: index_config_v2, index_config_v3, and so on. Add a description to each version to tell them apart.
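The naming pattern can be expressed as a small Python helper. This is a hypothetical illustration of the pattern only; it assumes numbering continues from the highest existing version number, which the console may or may not do after versions are deleted.

```python
import re

VERSION_PATTERN = re.compile(r"index_config_v(\d+)")


def next_version_name(existing_names):
    """Return the next published version name after the highest existing one."""
    numbers = []
    for name in existing_names:
        match = VERSION_PATTERN.fullmatch(name)
        if match:  # Skips index_config_edit and any other non-numbered name.
            numbers.append(int(match.group(1)))
    return f"index_config_v{max(numbers, default=0) + 1}"


print(next_version_name(["index_config_v1", "index_config_edit"]))
```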
Edit and publish a new version
Find the version in Modifying state and click Modify.
You can also switch to developer mode to edit the schema JSON directly. In `cluster.json`, the `customized_merge_config` and `segment_customize_metrics_updater` keys are available for index merging configuration. The `segment_customize_metrics_updater` key is only supported on new instances.
Make your changes, then click Save Version.
Find the version in Modifying state, click Publish, enter a description, and click OK. The system creates a new version in Unused state.
To apply the changes to the cluster, choose O&M Center > O&M Management, click Update Configurations, and set Trigger Reindexing to Push Configurations and Trigger Reindexing.
Delete a version: You can delete versions in Unused state.
View a version: Click View to open the configuration page in read-only mode. Both administrator mode and developer mode are available for viewing.
Delete an index table
You can delete an index table that has no version in In Use state.
If the index table has a version in In Use state, unsubscribe from it first:
Choose O&M Center > Deployment Management. Click the index table, then click Cancel Subscription on the Effective Online tab.
Choose Configuration Center > Index Schema. Find the index table and click Delete in the Actions column.
After canceling a subscription on the Deployment Management page, delete the index table from the index schema. Leaving an unsubscribed index table in the schema may degrade query performance on your online clusters.
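The deletion rules depend only on version status, so they can be summarized in a few lines of Python. The sketch below is a hypothetical model of the rules described above, not a call into any console or API; the status strings mirror the ones shown in the console.

```python
def can_delete_version(status):
    """Only versions in Unused state can be deleted."""
    return status == "Unused"


def can_delete_table(version_statuses):
    """An index table can be deleted only if no version is In Use."""
    return "In Use" not in version_statuses


def deletion_steps(version_statuses):
    """List the console actions needed to delete an index table."""
    steps = []
    if not can_delete_table(version_statuses):
        steps.append("Cancel Subscription")  # On the Deployment Management page.
    steps.append("Delete")  # On the Index Schema page.
    return steps


print(deletion_steps(["In Use", "Modifying"]))
```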
Usage notes
A data source is required when creating an index table. If no data source exists, add one before creating the index table.
The index table name cannot be changed after creation.
An index table with a version in In Use state cannot be deleted.
Each index table can have only one version in Modifying state at a time.