Sorting policies

Search engines require high retrieval performance. To meet this demand, OpenSearch provides a two-stage sorting process: rough sort and fine sort. The rough sort stage acts as a preliminary filter, quickly identifying high-quality documents from the initial search results. The top N documents from this stage then proceed to the fine sort stage, where they are re-scored in detail. Consequently, rough sort significantly impacts performance, while fine sort primarily affects ranking quality. Therefore, a rough sort expression should be simple and effective, using only the most critical factors from the fine sort stage. You configure both rough sort and fine sort using sort expressions.
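
The two-stage flow described above can be sketched in a few lines of Java. This is a minimal illustration and not the engine's implementation: the Doc class, its two signal fields, and the rank method are hypothetical names introduced here. It also assumes that documents outside the top N simply keep their rough sort order, which is a simplification.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

public class TwoStageSortSketch {

    /** Hypothetical document with one cheap and one expensive ranking signal. */
    static final class Doc {
        final String id;
        final double cheapSignal;
        final double expensiveSignal;

        Doc(String id, double cheapSignal, double expensiveSignal) {
            this.id = id;
            this.cheapSignal = cheapSignal;
            this.expensiveSignal = expensiveSignal;
        }
    }

    static List<String> rank(List<Doc> recalled,
                             ToDoubleFunction<Doc> roughExpr,
                             ToDoubleFunction<Doc> fineExpr,
                             int topN) {
        // Stage 1: score every recalled document with the cheap expression.
        List<Doc> byRough = new ArrayList<>(recalled);
        byRough.sort(Comparator.comparingDouble(roughExpr).reversed());

        // Stage 2: re-score only the top N with the expensive expression.
        int cut = Math.max(0, Math.min(topN, byRough.size()));
        List<Doc> head = new ArrayList<>(byRough.subList(0, cut));
        head.sort(Comparator.comparingDouble(fineExpr).reversed());

        // Assumption: documents outside the top N keep their rough sort order.
        List<String> ids = new ArrayList<>();
        for (Doc d : head) {
            ids.add(d.id);
        }
        for (Doc d : byRough.subList(cut, byRough.size())) {
            ids.add(d.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> recalled = List.of(
            new Doc("a", 5, 1), new Doc("b", 4, 9),
            new Doc("c", 3, 5), new Doc("d", 1, 100));
        // Prints [b, a, c, d]: "d" has the best expensive signal but never
        // reaches the fine sort stage because its rough score is too low.
        System.out.println(rank(recalled, d -> d.cheapSignal, d -> d.expensiveSignal, 2));
    }
}
```

The example shows why the rough sort expression should include the most critical fine sort factors: a document that the rough sort ranks low is never re-scored, however well it would do in the fine sort.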

A sort expression (ranking formula) lets you customize how search results are ranked. You can sort results by specifying the expression in a query. A sort expression supports basic operations (arithmetic, relational, logical, bitwise, and conditional), mathematical functions, and ranking features. For common applications such as forums and news sites, OpenSearch provides expression templates. You can select a suitable template and modify it to create your own expression.

Before configuring relevance ranking (fine sort), understand the default sorting policy. After the query clause and other clauses retrieve documents, those documents enter the sorting stage. For more information, see the sort clause. If you do not specify a sort clause, or if the sort clause explicitly specifies RANK, the relevance scoring phase is triggered.

The design of your rough and fine sort expressions depends on your search requirements. For details about how to design ranking factors in several typical scenarios, see the Relevance in Practice article in Best Practices.

Note

Sort expressions require numeric values or numeric fields for all basic operations, such as arithmetic, relational, logical, and conditional operations. Most functions do not support string-type fields.

Basic operations

| Operation | Operator | Description |
| --- | --- | --- |
| Unary | - | Negation operator. Negates the value of an expression. Example: -1 or -max(width). |
| Arithmetic | +, -, *, / | Example: width / 10. |
| Relational | ==, !=, >, <, >=, <= | Example: width >= 400. |
| Logical | and, or, ! | Example: width >= 400 and height >= 300, or !(a > 1 and b < 2). |
| Bitwise | &, \|, ^ | Example: 3 & (price ^ pubtime) + (price \| pubtime). |
| Conditional | if(cond, thenValue, elseValue) | If cond is non-zero, the expression returns thenValue. Otherwise, it returns elseValue. For example, if(2, 3, 5) returns 3, and if(0, 3, 5) returns 5. Note: This operation does not support string fields, such as fields of the LITERAL or TEXT type. The return value must be within the int32 range. |
| in | i in (value1, value2, …, valuen) | If i is in the set (value1, value2, …, valuen), the expression returns 1. Otherwise, it returns 0. For example, 2 in (2, 4, 6) returns 1, and 3 in (2, 4, 6) returns 0. |
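
The semantics of the conditional and in operations can be mirrored in Java to check an expression by hand. This is only a sketch: ifExpr and inExpr are hypothetical helper names and not part of any OpenSearch SDK. The calls in main reproduce the examples given in the descriptions above.

```java
public class ExpressionOpsSketch {

    // if(cond, thenValue, elseValue): a non-zero cond selects thenValue.
    // The result is an int because the return value must fit in int32.
    static int ifExpr(double cond, int thenValue, int elseValue) {
        return cond != 0 ? thenValue : elseValue;
    }

    // i in (value1, ..., valuen): 1 if i is a member of the set, otherwise 0.
    static int inExpr(long i, long... values) {
        for (long v : values) {
            if (v == i) {
                return 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(ifExpr(2, 3, 5));    // 3
        System.out.println(ifExpr(0, 3, 5));    // 5
        System.out.println(inExpr(2, 2, 4, 6)); // 1
        System.out.println(inExpr(3, 2, 4, 6)); // 0
    }
}
```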

Mathematical functions

| Function | Description |
| --- | --- |
| max(a, b) | Returns the greater of a and b. |
| min(a, b) | Returns the lesser of a and b. |
| ln(a) | Returns the natural logarithm of a. |
| log2(a) | Returns the base-2 logarithm of a. |
| log10(a) | Returns the base-10 logarithm of a. |
| sin(a) | Sine function. |
| cos(a) | Cosine function. |
| tan(a) | Tangent function. |
| asin(a) | Arcsine function. |
| acos(a) | Arccosine function. |
| atan(a) | Arctangent function. |
| ceil(a) | Rounds a up to the nearest integer. For example, ceil(4.2) returns 5. |
| floor(a) | Rounds a down to the nearest integer. For example, floor(4.6) returns 4. |
| sqrt(a) | Returns the square root of a. For example, sqrt(4) returns 2. |
| pow(a, b) | Returns a to the power of b. For example, pow(2, 3) returns 8. |
| now() | Returns the current time in seconds since the Epoch (00:00:00 UTC, January 1, 1970). |
| random() | Returns a random value in the range [0, 1]. |
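
To sanity-check an expression offline, most of these functions map directly onto java.lang.Math. The sketch below covers the three that differ in name or unit. The wrappers ln, log2, and now are hypothetical helpers written for this illustration. They are assumed to match the descriptions in the table and have not been verified against the engine.

```java
import java.time.Instant;

public class MathFunctionsSketch {

    // ln(a): natural logarithm. Java names this Math.log.
    static double ln(double a) {
        return Math.log(a);
    }

    // log2(a): java.lang.Math has no base-2 logarithm, so change the base.
    static double log2(double a) {
        return Math.log(a) / Math.log(2);
    }

    // now(): seconds since the Epoch, not milliseconds.
    static long now() {
        return Instant.now().getEpochSecond();
    }

    public static void main(String[] args) {
        System.out.println(Math.ceil(4.2));  // 5.0
        System.out.println(Math.floor(4.6)); // 4.0
        System.out.println(Math.sqrt(4));    // 2.0
        System.out.println(Math.pow(2, 3));  // 8.0
        System.out.println(ln(Math.E));
        System.out.println(log2(8));
        System.out.println(now());
    }
}
```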

Built-in functions

OpenSearch provides a rich set of built-in functions, such as those for Location Based Services (LBS), text, and timeliness. You can combine these functions in sort expressions for powerful relevance ranking.

Cava plugin

Cava is a high-performance programming language developed by the OpenSearch engine team based on LLVM. Its syntax is similar to Java, and its performance is comparable to C++. Cava is an object-oriented programming language that supports just-in-time (JIT) compilation and includes various security checks to ensure program robustness. You can use Cava and its libraries to create custom sorting plug-ins in OpenSearch. Compared to standard sort expressions, Cava plug-ins offer the following advantages:

  • Greater customization: Cava provides more extensive syntax features than expressions, such as for loops and function and class definitions, allowing you to implement custom business logic.

  • Easier maintenance: Sorting plug-ins written in Cava are more readable and easier to maintain than complex expressions.

  • Lower learning curve: Cava's syntax is similar to Java, making it easy for Java developers to learn.

Note: Cava plug-ins are supported only for exclusive applications.

Procedure

This example demonstrates how to configure rough sort and fine sort by using a text relevance sorting function.

1. Create a rough sort policy. In the OpenSearch console, go to **Sort Configuration** > **Policy Management**, and click **Create**. Enter a **Policy Name**, set **Scope** to **Rough Sort**, set **Type** to **Expression**, and click **Next**.

Select static_bm25 as the Scoring Characteristics and set the weight to 10. A weight of 10 means that the static_bm25 score is multiplied by 10 during the calculation. You can also select a search field and set a weight for it. The field must be a numeric attribute field, such as INT, DOUBLE, or FLOAT. The product of the field value and its weight is added to the total sort score.

For example, select the sale_price field from Search Field and set its weight to 0.08.

After you complete the configuration, you are taken to the policy management page.

2. Create a fine sort policy. In the OpenSearch console, go to **Sort Configuration** > **Policy Management**, and click **Create**. Enter a **Policy Name**, set **Scope** to **Fine Sort**, set **Type** to **Expression**, and click **Next**.

In the Sort Configuration step, select a field from the Field drop-down list, enter the sort expression text_relevance(brand) in the editor, and then click Completed. After you complete the configuration, you are taken to the Sort Configuration page.

3. View the sorting effect. On the Search Test page, set the first_rank_name parameter to the name of your rough sort policy, such as test_1. Set the second_rank_name parameter to the name of your fine sort policy, such as test_2. Turn on the Show Sort Details switch to view score details for each document and function.
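
The rough sort policy from step 1 amounts to a weighted sum: static_bm25 multiplied by 10, plus sale_price multiplied by 0.08. The following sketch works through that arithmetic. The class and method names are hypothetical, and the input values are made up for illustration rather than taken from a real query.

```java
public class RoughSortScoreSketch {

    // Weights from the walkthrough: static_bm25 * 10 + sale_price * 0.08.
    static final double STATIC_BM25_WEIGHT = 10.0;
    static final double SALE_PRICE_WEIGHT = 0.08;

    static double roughScore(double staticBm25, double salePrice) {
        return staticBm25 * STATIC_BM25_WEIGHT + salePrice * SALE_PRICE_WEIGHT;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: a static_bm25 value of 0.5 and a sale_price of 100.
        // 0.5 * 10 + 100 * 0.08 = 5 + 8 = 13
        System.out.println(roughScore(0.5, 100)); // 13.0
    }
}
```

Because each weight scales its factor directly, a field with large raw values such as a price needs a small weight to avoid drowning out the text score.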

Note

Document scoring occurs in two stages: rough sort and fine sort. After a query retrieves and filters documents, they enter the rough sort stage. The rough sort expression selects documents with higher scores. Then, the top N results are passed to the fine sort stage for detailed scoring based on the fine sort expression. Finally, the optimal results are returned. The scoring logic is as follows:

  • If only a rough sort policy is configured, the document score is (10000 + the result of the rough sort expression). The total score is capped at 20,000.

  • If only a fine sort policy is configured, the document score is (10000 + the result of the fine sort expression). The total score has no upper limit.

  • If both rough sort and fine sort policies are configured, the final score for documents that enter the fine sort stage is (10000 + the result of the fine sort expression). The final score for the remaining documents is (10000 + the result of the rough sort expression), and this score is capped at 20,000.

  • You can create multiple rough sort and fine sort rules, but a query can use only one rough sort rule and one fine sort rule at a time.
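
The scoring rules in this note can be condensed into one function. This is a sketch of the documented rules only. The class, method, and constant names are hypothetical, and the case where no policy is configured is not covered by the rules above, so the sketch rejects it.

```java
import java.util.OptionalDouble;

public class FinalScoreSketch {

    static final double BASE = 10000.0;
    static final double ROUGH_CAP = 20000.0;

    /**
     * rough and fine hold the expression result when that policy is configured.
     * enteredFineStage is true for documents in the rough sort top N.
     */
    static double finalScore(OptionalDouble rough, OptionalDouble fine,
                             boolean enteredFineStage) {
        // The fine sort score applies if it is the only policy, or if the
        // document made it into the fine sort stage.
        boolean fineApplies = fine.isPresent()
                && (!rough.isPresent() || enteredFineStage);
        if (fineApplies) {
            return BASE + fine.getAsDouble(); // no upper limit
        }
        if (rough.isPresent()) {
            return Math.min(BASE + rough.getAsDouble(), ROUGH_CAP); // capped
        }
        // Not described in the rules above; treated as an error here.
        throw new IllegalArgumentException("no sort policy configured");
    }

    public static void main(String[] args) {
        OptionalDouble none = OptionalDouble.empty();
        // Rough sort only: 10000 + 13
        System.out.println(finalScore(OptionalDouble.of(13), none, false));    // 10013.0
        // Rough sort only, above the cap: min(10000 + 12000, 20000)
        System.out.println(finalScore(OptionalDouble.of(12000), none, false)); // 20000.0
        // Fine sort only: 10000 + 12000, uncapped
        System.out.println(finalScore(none, OptionalDouble.of(12000), false)); // 22000.0
        // Both configured, document reached the fine sort stage: 10000 + 25
        System.out.println(finalScore(OptionalDouble.of(13), OptionalDouble.of(25), true));  // 10025.0
        // Both configured, document stayed in the rough sort stage: 10000 + 13
        System.out.println(finalScore(OptionalDouble.of(13), OptionalDouble.of(25), false)); // 10013.0
    }
}
```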

Important
  • The first_rank_name parameter supports only one sort expression name.

  • The second_rank_name parameter supports only one sort expression name.

SDK configuration examples

Java SDK example:

// Set the rough and fine sort expressions. This example uses "default".
Rank rank = new Rank();
rank.setFirstRankName("default"); // The name of the rough sort policy.
rank.setSecondRankName("default"); // The name of the fine sort policy.
rank.setReRankSize(5); // The number of documents for fine sorting.

PHP SDK example:

// Specify the rough sort expression.
$params->setFirstRankName('default');
// Specify the fine sort expression.
$params->setSecondRankName('default');

Note:

  • If you specify sorting policies in your code, they override the default policies set in the console.

  • To view sorting details in code:

    Method: Add the format:fulljson parameter to the config clause.

    In the response, sortExprValues contains the document's score.

    "sortExprValues": [
        "10000.0399786383"
    ],
    "property": {
    }

    sortExprValues is an array that contains the values of the sorting fields in the sort clause. Example:

    sort=-price;-RANK

    In this case, sortExprValues is [price, document score].

    If sort is not set, the value defaults to the document score.
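
Because sortExprValues follows the order of the sort clause, it helps to label each value with its field when you read a response. The sketch below pairs the two. It is not SDK code: the class and method names are hypothetical, the values are assumed to be already extracted from the JSON response as strings, and it assumes each sort field may carry a leading + or - direction sign. The price value in main is made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SortExprValuesSketch {

    static Map<String, Double> label(String sortClause, List<String> sortExprValues) {
        // Without a sort clause, the single value is the document score (RANK).
        String clause = (sortClause == null || sortClause.trim().isEmpty())
                ? "-RANK" : sortClause.trim();
        String[] fields = clause.split(";");
        if (fields.length != sortExprValues.size()) {
            throw new IllegalArgumentException(
                "expected " + fields.length + " values, got " + sortExprValues.size());
        }
        // LinkedHashMap keeps the values in sort clause order.
        Map<String, Double> labeled = new LinkedHashMap<>();
        for (int i = 0; i < fields.length; i++) {
            String name = fields[i].trim();
            // Assumption: a leading + or - only marks the sort direction.
            if (name.startsWith("-") || name.startsWith("+")) {
                name = name.substring(1);
            }
            labeled.put(name, Double.parseDouble(sortExprValues.get(i)));
        }
        return labeled;
    }

    public static void main(String[] args) {
        // sort=-price;-RANK yields [price, document score].
        System.out.println(label("-price;-RANK", List.of("199.0", "10000.0399786383")));
        // No sort clause: the only value is the document score.
        System.out.println(label(null, List.of("10000.0399786383")));
    }
}
```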