OpenSearch Retrieval Engine Edition supports SQL hints — optimizer directives embedded in SQL comments — to fine-tune query execution beyond what the default query planner produces. Hints do not cause statement errors when invalid; the system silently ignores them. To confirm whether a hint took effect, check the execution plan after running the query.
If a hint does not behave as expected, see Conflict resolution for the rules the system applies when hints conflict or are invalid.
Prerequisites
Before you begin, ensure that you have:
OpenSearch Retrieval Engine Edition with HA3 version V3.7.3 or later for HASH_JOIN, LOOKUP_JOIN, NORMAL_AGG, NO_INDEX, and ATTR hints
HA3 version V3.8.0 or later for LOCAL_PARALLEL hints
Syntax
Place hints inside SQL block comments (/*+ ... */) immediately after SELECT or after the table name in the FROM clause.
SELECT [/*+ HintName(params) */]
{ * | projectItem [, projectItem ]* }
FROM tableExpression [/*+ HintName(params) */]
HintName: [a-zA-Z][a-zA-Z_]*
Params:
Identifier[, Identifier]
| Identifier=Identifier[, Identifier=Identifier]
Identifier: [a-zA-Z_][a-zA-Z_0-9]*Multiple hints can appear in the same comment block, separated by commas:
SELECT /*+ LOOKUP_JOIN(tj_relation), NORMAL_AGG(distributionCheck='false', propScope='all') */
...Built-in hints
OpenSearch Retrieval Engine Edition provides four built-in hints for controlling join algorithms and aggregation behavior.
HASH_JOIN
Forces the HashJoin operator for the specified tables, performing an inner join.
Syntax: HASH_JOIN(tableName1, tableName2, ...)
Example:
SELECT
/*+ HASH_JOIN(tj_relation) */ -- force hash join at this level
*
FROM
(
SELECT
/*+ HASH_JOIN(tj_item_raw) */ -- force hash join for the inner subquery
*
FROM
(
SELECT
sum(tj_item.id) as sum0
FROM
tj_item
GROUP BY
tj_item.id
) B
JOIN tj_item_raw ON B.sum0 = tj_item_raw.id
) D
JOIN tj_relation ON D.sum0 = tj_relation.item_idBoth join operations use the hash join algorithm.
LOOKUP_JOIN
Forces the LookupJoin operator for the specified tables, performing an inner join.
Syntax: LOOKUP_JOIN(tableName1, tableName2, ...)
Example:
SELECT
/*+ LOOKUP_JOIN(tj_relation) */ -- force lookup join at this level
*
FROM
(
SELECT
/*+ LOOKUP_JOIN(tj_item_raw) */ -- force lookup join for the inner subquery
*
FROM
(
SELECT
sum(tj_item.id) as sum0
FROM
tj_item
GROUP BY
tj_item.id
) B
JOIN tj_item_raw ON B.sum0 = tj_item_raw.id
) D
JOIN tj_relation ON D.sum0 = tj_relation.item_idBoth join operations use the lookup join algorithm.
NORMAL_AGG
Runs the aggregation you define in the first phase of the query, instead of splitting it across phases.
Syntax: NORMAL_AGG(distributionCheck='false'|'true', propScope='all')
| Parameter | Description | Default |
|---|---|---|
distributionCheck | Whether to check data distribution. If set to 'true' and the group key is not the partition key, the hint is ignored. | — |
propScope | Scope of hint propagation. Set to 'all' to push the hint to the current aggregation node and all downstream nodes. Omit to push only to the nearest aggregation node. | Nearest node |
Example:
SELECT
/*+ LOOKUP_JOIN(tj_relation), NORMAL_AGG(distributionCheck='false', propScope='all') */
tj_relation.price
FROM
(
SELECT
*
FROM
(
SELECT
sum(tj_item.id) as sum0
FROM
tj_item
GROUP BY -- (1) first aggregation
tj_item.id
) B
JOIN tj_item_raw ON B.sum0 = tj_item_raw.id
) D
JOIN tj_relation ON D.sum0 = tj_relation.item_id
GROUP BY -- (2) second aggregation, runs as normal aggregation
tj_relation.priceThe aggregation runs as a single-phase normal aggregation, not split across two phases.
NO_INDEX
Disables index optimization for specific fields during the filter phase. Use this to force full scans on fields where index use degrades performance.
Syntax: NO_INDEX(tableName='<table>', fields='<field1>, <field2>, ...')
| Parameter | Description |
|---|---|
tableName | Name of the table containing the target fields |
fields | Comma-separated list of fields for which to disable index optimization |
Example:
SELECT
/*+ LOOKUP_JOIN(tj_relation), NO_INDEX(tableName='tj_relation', fields='pk') */
*
FROM
(
SELECT
/*+ NO_INDEX(tableName='tj_item', fields='shop_id, reserve_price') */
*
FROM
(
SELECT
SUM(tj_item.id) as sum0
FROM
tj_item
/*+ NO_INDEX(tableName='tj_item', fields='shop_id') */
WHERE
tj_item.id = 100
AND tj_item.shop_id = 500 -- shop_id is not indexed here
GROUP BY
tj_item.id
) B
JOIN tj_item_raw ON B.sum0 = tj_item_raw.id
) D
JOIN tj_relation /*+ NO_INDEX(tableName='tj_relation', fields='item_id') */
ON D.sum0 = tj_relation.item_id
WHERE
tj_relation.item_id = 900 AND tj_relation.pk = 100 -- pk and item_id are not indexed hereATTR hints
ATTR hints pass auxiliary information directly to operators to accelerate queries. Use them when querying multiple fields or when you need fine-grained control over scan limits, aggregation group sizes, or join batch sizes.
SCAN_ATTR
Passes configuration to the scan operator.
Syntax: SCAN_ATTR(param1='value1', param2='value2', ...)
| Parameter | Description | Default |
|---|---|---|
localLimit | Maximum number of rows the scan operator returns | — |
batchSize | Number of rows returned per batch. Use with localLimit. | Auto-calculated based on table size |
nestTableJoinType | Join method for nested tables. Valid values: inner, left. | left |
hashValues | Comma-separated field values used to calculate hash values and identify target partitions. For partition-based queries. | — |
partitionIds | IDs of partitions to query. Set to -1 to scan all partitions. For partition-based queries. | — |
Place the SCAN_ATTR hint next to the table name it applies to (recommended):
SELECT
company_id, company_name
FROM
company /*+ SCAN_ATTR(localLimit='3', batchSize='2', nestTableJoinType='inner') */Alternatively, place it after SELECT:
SELECT /*+ SCAN_ATTR(localLimit='3', batchSize='2', nestTableJoinType='inner') */
company_id, company_name
FROM
companyPartition-based queries (for debugging only):
Use partition-based queries for debugging only. Do not use them in production.
-- Query by hash values: the system calculates hash values from a1 and a2
-- to identify the target partitions
SELECT /*+ SCAN_ATTR(hashValues='a1,a2') */ company_id, company_name
FROM company
-- Query by partition IDs: queries data from partition 1 and partition 2
SELECT /*+ SCAN_ATTR(partitionIds='1,2') */ company_id, company_name
FROM companyAGG_ATTR
Passes configuration to the aggregation operator. Place AGG_ATTR only after the SELECT keyword.
Syntax: AGG_ATTR(param1='value1', param2='value2', ...)
| Parameter | Description | Default |
|---|---|---|
groupKeyLimit | Maximum number of data groups the aggregation operator processes | 200000 |
stopExceedLimit | Whether to return an error when groupKeyLimit is reached | true |
propScope | Scope of hint effect. Default is the current node only. | Current node |
Example:
SELECT /*+ AGG_ATTR(groupKeyLimit='3', stopExceedLimit='false') */
company_id, company_name
FROM
company
GROUP BY
company_id, company_nameJOIN_ATTR
Passes configuration to the join operator. Place JOIN_ATTR only after the SELECT keyword.
Syntax: JOIN_ATTR(param1='value1', param2='value2', ...)
| Parameter | Description | Default |
|---|---|---|
lookupTurncateThreshold | Maximum number of results after a lookup inner join. The system stops joining once this limit is reached. | — |
lookupBatchSize | Number of rows per lookup join batch | 500 |
propScope | Scope of hint effect. Default is the current node only. | Current node |
defaultValue | Default field values for left outer joins when right-table values are unavailable. Specify one default per data type, e.g., 'INTEGER:10,VARCHAR:aa'. | — |
Control result size with `lookupTurncateThreshold`:
-- Return at most 30 results after the lookup inner join
SELECT /*+ JOIN_ATTR(lookupTurncateThreshold='30', lookupBatchSize='50') */
id, daogou.company_id, company_name
FROM daogou
JOIN company ON daogou.id = company.company_idCombine JOIN_ATTR and SCAN_ATTR in the same statement:
SELECT /*+ JOIN_ATTR(lookupTurncateThreshold='3', lookupBatchSize='2') */
id, daogou.company_id, company_name
FROM
daogou /*+ SCAN_ATTR(localLimit='5', batchSize='1') */
JOIN
company
ON
daogou.id = company.company_idSet default values for left outer joins:
-- When right-table values are missing, use 10 for INTEGER fields and 'aa' for VARCHAR fields
SELECT /*+ JOIN_ATTR(defaultValue='INTEGER:10,VARCHAR:aa') */
id, daogou.company_id, company_name
FROM daogou
JOIN company ON daogou.id = company.company_idLOCAL_PARALLEL
Requires HA3 version V3.8.0 or later.
LOCAL_PARALLEL enables parallel query execution on the Searcher node. Specify it after the table name to set the number of threads for that table. The system automatically identifies which operations can run in parallel.
LOCAL_PARALLEL is incompatible with the parallel query method used in HA3 versions earlier than V3.8.0. Do not use both at the same time.
Syntax: LOCAL_PARALLEL(tableName='<table>', parallelNum='<n>')
| Parameter | Description |
|---|---|
tableName | Table to parallelize. Use catalogName.dbName.tableName format; catalogName and dbName are optional and default to system values. |
parallelNum | Number of parallel threads |
Examples:
-- Simple SELECT with parallel scan
SELECT i1
FROM t1 /*+ LOCAL_PARALLEL(tableName='t1', parallelNum='2') */
-- SELECT with aggregation
SELECT i1, COUNT(*)
FROM t1 /*+ LOCAL_PARALLEL(tableName='t1', parallelNum='2') */
GROUP BY i1
-- JOIN with parallel scan on one table
SELECT t1.id, t2.id
FROM tj_item /*+ LOCAL_PARALLEL(tableName='tj_item', parallelNum='2') */ AS t1
JOIN tj_shop AS t2
ON t1.id = t2.idIf parallel execution does not improve performance as expected, examine the query trace and execution plan to identify the bottleneck operation.
Conflict resolution
When multiple hints conflict, the system applies these rules:
| Scenario | System behavior |
|---|---|
Same hint type, different propScope values on the same node | The hint with the smallest propScope value takes effect |
Same hint type, same propScope values on the same node | None of the conflicting hints are executed |
| Hint may cause incorrect execution plans | The system does not execute the hint |
To verify whether a hint took effect, check the execution plan after running the query. If a hint is not reflected in the execution plan, review the conflict rules above to determine which condition applies.