The cluster configuration covers both online (ha3) and offline (build_service) settings. The offline configuration controls builders and mergers — the components responsible for constructing and maintaining the search index.
How merge policies work
As the offline build service processes documents, it writes index data into immutable units called *segments*. Over time, many small segments accumulate, which degrades search performance and wastes storage occupied by deleted documents. The merge process periodically combines smaller segments into larger ones, reclaiming space and keeping the index compact.
Two configuration sections control offline merge behavior: customized_merge_config and segment_customize_metrics_updater.
customized_merge_config
customized_merge_config defines one or more named merge policies, each with its own trigger conditions and merge parameters. Multiple policies can be active at the same time. When several policies are triggered simultaneously, they run in the order defined in the configuration.
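The ordering rule can be illustrated with a minimal sketch. It assumes the configuration is loaded as a JSON object whose key order matches the file, and that the set of triggered policies is known. The helper name `run_order` is hypothetical and is not part of the build service.

```python
import json

def run_order(customized_merge_config: dict, triggered: set) -> list:
    """Return the triggered policy names in the order they are defined.

    Hypothetical helper: relies on dicts preserving insertion order,
    which json.loads also preserves for object keys.
    """
    return [name for name in customized_merge_config if name in triggered]

# Policy bodies are left empty here; only the key order matters for this sketch.
config = json.loads(
    '{"segment_merge": {}, "large_segment_merge": {}, "small_segment_merge": {}}'
)
order = run_order(config, {"small_segment_merge", "segment_merge"})
print(order)  # ['segment_merge', 'small_segment_merge']
```

The sketch only models the ordering described above; how the real scheduler detects triggers is not covered here.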
Merge policy parameters
In addition to merge_config, each policy supports the following parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| period | string | - | Controls when the merge policy runs. Accepts two formats: a fixed interval in seconds (e.g., "period": "period=1800" runs every 30 minutes), or a specific time of day (e.g., "period": "daytime=13:00" runs at 1:00 PM). |
| need_wait_alter_field | boolean | true | Whether to pause this policy while a dynamic field addition is in progress. Set to false to skip the wait; the policy then automatically switches to the align_version merge policy. |
| merge_parallel_num | integer | - | The number of concurrent merge processes per partition. |
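To make the two `period` formats concrete, here is a minimal parsing sketch. The function name `parse_period` and its return shape are assumptions made for illustration; the build service's own parser may accept or reject inputs differently.

```python
import re

def parse_period(value: str):
    """Parse a period string into a (kind, payload) tuple.

    Hypothetical helper covering the two documented formats:
      "period=1800"   -> ("interval", 1800)    seconds between runs
      "daytime=13:00" -> ("daytime", (13, 0))  hour and minute of day
    """
    match = re.fullmatch(r"period=(\d+)", value)
    if match:
        return ("interval", int(match.group(1)))
    match = re.fullmatch(r"daytime=(\d{1,2}):(\d{2})", value)
    if match:
        hour, minute = int(match.group(1)), int(match.group(2))
        if hour < 24 and minute < 60:
            return ("daytime", (hour, minute))
    raise ValueError(f"unrecognized period value: {value!r}")

print(parse_period("period=1800"))    # ('interval', 1800)
print(parse_period("daytime=13:00"))  # ('daytime', (13, 0))
```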
Do not name a custom merge policy alter_field. That name is reserved for the dynamic field addition feature.
Built-in policies
The system ships with five built-in policies covering the most common merge scenarios:
| Policy name | Trigger | Purpose |
|---|---|---|
| full | Manual or full build | Consolidates all segments after a full build to optimize the index. |
| large_segment_reclaim | Every 1,800 seconds (30 min) | Merges large segments with many deleted documents to reclaim wasted space. |
| segment_merge | Every 900 seconds (15 min) | Merges small-to-medium segments with high delete rates to maintain index health. |
| large_segment_merge | Every 3,600 seconds (1 hour) | Merges large segments by prioritizing those with more valid documents to reduce fragmentation. |
| small_segment_merge | No fixed period | Merges small segments to keep the total segment count low. |
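The priority_queue policies in the default configuration below encode their limits and conditions as semicolon-separated `key=value` strings (for example, `output_limits`). The following sketch shows one way to read such a string into a dictionary. The helper `parse_strategy_string` is hypothetical, and the sketch makes no claim about how each key is interpreted by the merger.

```python
def parse_strategy_string(text: str) -> dict:
    """Split 'k1=v1;k2=v2' into a dict; purely numeric values become int.

    Hypothetical helper for inspecting strategy strings; it does not
    validate key names against what the merger actually supports.
    """
    result = {}
    for item in text.split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        result[key] = int(value) if value.isdigit() else value
    return result

limits = parse_strategy_string(
    "max-merged-segment-size=13312;max-total-merged-size=15360"
)
conditions = parse_strategy_string(
    "priority-feature=delete-doc-count#desc;"
    "conflict-segment-count=2;conflict-delete-percent=8"
)
print(limits["max-total-merged-size"])  # 15360
print(conditions["priority-feature"])   # delete-doc-count#desc
```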
The default customized_merge_config value:
{
  "full": {
    "merge_config": {
      "keep_version_count": 40,
      "merge_strategy": "optimize",
      "merge_strategy_param": "after-merge-max-segment-count=20",
      "merge_thread_count": 4
    }
  },
  "large_segment_reclaim": {
    "merge_config": {
      "keep_version_count": 40,
      "merge_strategy": "priority_queue",
      "merge_strategy_params": {
        "input_limits": "max-segment-size=20480",
        "output_limits": "max-merged-segment-size=13312;max-total-merged-size=15360",
        "strategy_conditions": "priority-feature=delete-doc-count#desc;conflict-segment-count=2;conflict-delete-percent=8"
      },
      "merge_thread_count": 4
    },
    "period": "period=1800"
  },
  "segment_merge": {
    "merge_config": {
      "keep_version_count": 40,
      "merge_strategy": "priority_queue",
      "merge_strategy_params": {
        "input_limits": "max-segment-size=12288",
        "output_limits": "max-merged-segment-size=13312;max-total-merged-size=15360",
        "strategy_conditions": "priority-feature=valid-doc-count#asc;conflict-segment-count=2;conflict-delete-percent=10"
      },
      "merge_thread_count": 4
    },
    "period": "period=900"
  },
  "large_segment_merge": {
    "merge_config": {
      "keep_version_count": 40,
      "merge_strategy": "priority_queue",
      "merge_strategy_params": {
        "input_limits": "max-segment-size=12288",
        "output_limits": "max-merged-segment-size=15360;max-total-merged-size=46080",
        "strategy_conditions": "priority-feature=valid-doc-count#desc;conflict-segment-count=2;conflict-delete-percent=10"
      },
      "merge_thread_count": 4
    },
    "period": "period=3600"
  },
  "small_segment_merge": {
    "merge_config": {
      "keep_version_count": 40,
      "merge_strategy": "priority_queue",
      "merge_strategy_params": {
        "input_limits": "max-segment-size=1536",
        "output_limits": "max-merged-segment-size=1536;max-total-merged-size=3072",
        "strategy_conditions": "priority-feature=valid-doc-count#asc;conflict-segment-count=2;conflict-delete-percent=20"
      },
      "merge_thread_count": 4
    }
  }
}
segment_customize_metrics_updater
segment_customize_metrics_updater defines custom metric updaters that run during the merge and build phases to generate per-segment statistics.
indexlib includes a built-in lifecycle updater. When configured, it computes the minimum and maximum values of the specified fields for each segment, then assigns tags using the tagging method defined in lifecycle_param. These tags feed into load policies and time_series_merge policies.
The default value is an empty array ([]).
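As a rough illustration of what the lifecycle updater described above does, the sketch below computes a field's minimum and maximum over one segment and derives a tag from the maximum. The threshold list and the tag names are hypothetical stand-ins: the actual lifecycle_param format and tagging rules are not specified in this section, so this is not the indexlib implementation.

```python
from typing import Iterable, List, Tuple

def field_range(values: Iterable[int]) -> Tuple[int, int]:
    """Return (min, max) of a field's values within one segment."""
    values = list(values)
    if not values:
        raise ValueError("segment has no values for this field")
    return min(values), max(values)

def assign_tag(max_value: int, thresholds: List[Tuple[int, str]], default: str) -> str:
    """Pick the tag of the highest lower bound that max_value reaches.

    Hypothetical tagging rule; stands in for whatever lifecycle_param defines.
    """
    tag = default
    for lower_bound, name in sorted(thresholds):
        if max_value >= lower_bound:
            tag = name
    return tag

# Assumed example: segments whose largest field value is at least 30 are "hot".
thresholds = [(0, "cold"), (30, "hot")]
low, high = field_range([5, 42, 17])
print(low, high)                                # 5 42
print(assign_tag(high, thresholds, "unknown"))  # hot
```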