Match phrase query

Last updated: 2026-05-07 03:20:32

A match phrase query is similar to a match query, except that a match phrase query also evaluates the positions of tokens. A row meets the query conditions only if the order and positions of the tokens in the row match the order and positions of the tokens in the keyword. If the field that you want to query uses fuzzy tokenization, a match phrase query has lower latency than a wildcard query.

Scenarios

You can use match phrase query to search for data that contains a specific phrase in which the words are arranged in a specific order. You can use match phrase query together with tokenization to perform full-text search in specific scenarios, such as big data analysis, content search, and personalized recommendation. For example, you can query sentences that contain a specific phrase in content search and locate messages that are arranged in a specific sequence in chat records.

Features

A match phrase query uses approximate matches to query data and evaluates the positions of tokens. For example, suppose that a row contains the value "Hangzhou West Lake Scenic Area" in a TEXT column and you specify the keyword "Hangzhou Scenic Area". A match query returns the row, but a match phrase query does not. The distance between "Hangzhou" and "Scenic Area" in the keyword is 0, whereas the distance in the column value is 2 because the two words "West" and "Lake" sit between "Hangzhou" and "Scenic Area".
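The difference between the two checks in the Hangzhou example can be illustrated with a small, self-contained sketch. This is a conceptual model only, not how Tablestore implements either query: the whitespace tokenizer and the class and method names (PhraseMatchSketch, tokenize, containsAllTokens, matchesPhrase) are assumptions made for illustration, and the position-independent check does not model the operator settings of a real match query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class PhraseMatchSketch {
    // Hypothetical tokenizer: lowercases the text and splits it on whitespace.
    // Real analyzers are configured on the search index and behave differently.
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Position-independent check: every keyword token appears somewhere in the value.
    static boolean containsAllTokens(String value, String keyword) {
        return tokenize(value).containsAll(tokenize(keyword));
    }

    // Position-aware check: the keyword tokens appear consecutively and in order.
    static boolean matchesPhrase(String value, String keyword) {
        List<String> valueTokens = tokenize(value);
        List<String> keywordTokens = tokenize(keyword);
        if (keywordTokens.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(valueTokens, keywordTokens) >= 0;
    }

    public static void main(String[] args) {
        String value = "Hangzhou West Lake Scenic Area";
        String keyword = "Hangzhou Scenic Area";
        System.out.println("all tokens present: " + containsAllTokens(value, keyword));
        System.out.println("phrase match: " + matchesPhrase(value, keyword));
    }
}
```

In this model, the keyword "Hangzhou Scenic Area" passes the position-independent check but fails the phrase check, because "West" and "Lake" break the run of consecutive tokens.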

When you use match phrase query, you must specify the name of the field that you want to query and the keyword.

When you perform a match phrase query, you can specify the weight that you want to assign to the field that you want to query to calculate the BM25-based keyword relevance score, the columns that you want to return, whether to return the total number of rows that meet the query conditions, and the method that is used to sort the returned rows.

API operation

You can call the Search or ParallelScan operation and set the query type to MatchPhraseQuery to perform a match phrase query.

Parameters

Parameter

Description

fieldName

The name of the field that you want to match.

You can perform match phrase queries on TEXT fields.

text

The keyword that is used to match the value of the field when you perform a match phrase query.

If the field that you want to match is a TEXT field, the keyword is tokenized into multiple tokens based on the analyzer type that you specify when you create the search index. If you do not specify the analyzer type when you create the search index, single-word tokenization is performed. For more information, see Tokenization.

For example, if you perform a match phrase query by using the phrase "this is", "..., this is tablestore" and "this is a table" are returned. "this table is ..." or "is this a table" is not returned.

query

The type of the query. Set the query parameter to matchPhraseQuery.

offset

The position from which the current query starts.

limit

The maximum number of rows that you want the current query to return.

To query only the number of rows that meet the query conditions without specific data, set the limit parameter to 0.

getTotalCount

Specifies whether to return the total number of rows that meet the query conditions. The default value of this parameter is false, which specifies that the total number of rows that meet the query conditions is not returned.

If you set this parameter to true, the query performance is compromised.

weight

The weight that you want to assign to the field that you want to query to calculate the BM25-based keyword relevance score. This parameter is used in full-text search scenarios. If you specify a higher weight for the field that you want to query, the BM25-based keyword relevance score for the field is higher. The value of this parameter is a positive floating point number.

This parameter does not affect the number of rows that are returned. However, this parameter affects the BM25-based keyword relevance scores of the query results.

tableName

The name of the data table.

indexName

The name of the search index.

columnsToGet

Specifies whether to return all columns of each row that meets the query conditions. You can specify the returnAll and columns fields for the columnsToGet parameter.

The default value of the returnAll field is false, which specifies that not all columns are returned. In this case, you can use the columns field to specify the columns that you want to return. If you do not specify the columns that you want to return, only the primary key columns are returned.

If you set the returnAll field to true, all columns are returned.
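How the offset, limit, and getTotalCount parameters interact can be modeled with a short in-memory sketch. It is not SDK code: PagingSketch, Page, and page are hypothetical names, the list of strings stands in for the rows that already matched the query, and the value -1 for an unrequested total is a convention of this sketch only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PagingSketch {
    // Hypothetical result holder for this sketch; not a Tablestore SDK type.
    static final class Page {
        final List<String> rows;
        final long totalCount; // -1 is this sketch's marker for "total not requested"

        Page(List<String> rows, long totalCount) {
            this.rows = rows;
            this.totalCount = totalCount;
        }
    }

    // Applies offset and limit to an in-memory list of rows that already matched.
    static Page page(List<String> matched, int offset, int limit, boolean getTotalCount) {
        int from = Math.min(Math.max(offset, 0), matched.size());
        int to = (int) Math.min((long) from + Math.max(limit, 0), (long) matched.size());
        List<String> rows = new ArrayList<>(matched.subList(from, to));
        long total = getTotalCount ? matched.size() : -1L;
        return new Page(rows, total);
    }

    public static void main(String[] args) {
        List<String> matched = Arrays.asList("row1", "row2", "row3", "row4", "row5");

        // A regular page: start at position 0 and return at most 2 rows.
        Page first = page(matched, 0, 2, false);
        System.out.println(first.rows);

        // Count only: limit 0 returns no row data, but the total is still reported.
        Page countOnly = page(matched, 0, 0, true);
        System.out.println(countOnly.rows.size() + " rows, total " + countOnly.totalCount);
    }
}
```

The count-only call mirrors the note on the limit parameter: setting limit to 0 together with getTotalCount set to true yields the number of matching rows without any row data.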

Notes

Search Index provides only basic BM25 relevance scoring and does not support custom relevance models.

Methods

You can use the Tablestore console, Tablestore CLI, or Tablestore SDKs to perform a match phrase query.

Before you perform a match phrase query, make sure that the following preparations are made:

Use the Tablestore console

You can use the Tablestore console to perform a match phrase query.

  1. Go to the Index Management tab.

    1. Log on to the Tablestore console.

    2. In the top navigation bar, select a resource group and a region.

    3. On the Overview page, click the instance name or click Instance Management in the Actions column.

    4. On the Instance Details tab, go to the Data Table List tab, and then click the data table name or click Index Management in the Actions column.

  2. On the Index Management tab, find the target Search Index and click Search in the Actions column.

  3. In the Search dialog box, specify the query conditions.

    1. By default, all columns are returned. To return specific columns, turn off Retrieve All Columns and enter the column names, separated by commas.

      Note

      By default, Tablestore returns the primary key columns of the data table.

    2. Select a logical operator: And, Or, or Not.

      If you select And, the query returns data that meets all specified conditions. If you select Or, the query returns data that meets at least one of the specified conditions. If you select Not, the query returns data that does not meet the specified conditions.

    3. Select an indexed field of type Text and click Add.

    4. Set the query type for the indexed field to Match Phrase Query (MatchPhraseQuery) and enter the value to search for.

    5. By default, sorting is disabled. To sort the results by a specific field, turn on Enable Sorting, add the sort field, and configure the sort order.

    6. By default, aggregation is disabled. To perform statistical aggregation on a specific field, turn on Enable Aggregation, add the field for aggregation, and configure the aggregation settings.

  4. Click OK.

    The query results are displayed on the Index Management tab.

Use the Tablestore CLI

You can use the Tablestore CLI to run the search command to query data by using search indexes. For more information, see Search index.

  1. Run the search command to use the search_index search index to query data and return all indexed columns of each row that meets the query conditions.

    search -n search_index --return_all_indexed

  2. Enter the query conditions as prompted:

    {
        "Offset": -1,
        "Limit": 10,
        "Collapse": null,
        "Sort": null,
        "GetTotalCount": true,
        "Token": null,
        "Query": {
            "Name": "MatchPhraseQuery",
            "Query": {
                "FieldName": "col_text",
                "Text": "this is"
            }
        }
    }

Use Tablestore SDKs

You can perform a match phrase query by using the following Tablestore SDKs: Tablestore SDK for Java, Tablestore SDK for Go, Tablestore SDK for Python, Tablestore SDK for Node.js, Tablestore SDK for .NET, and Tablestore SDK for PHP. In this example, Tablestore SDK for Java is used.

The following sample code shows how to query the rows in the data table in which the value of the Col_Text column contains the exact phrase "hangzhou shanghai", with the tokens in that order:

/**
 * Query the rows in the data table in which the value of the Col_Text column contains the exact phrase "hangzhou shanghai".
 * Up to 20 matching rows are returned. To also return the total number of matching rows, uncomment the setGetTotalCount call.
 * @param client the Tablestore client that is used to send the search request
 */
private static void matchPhraseQuery(SyncClient client) {
    SearchQuery searchQuery = new SearchQuery();
    MatchPhraseQuery matchPhraseQuery = new MatchPhraseQuery(); // Set the query type to MatchPhraseQuery. 
    matchPhraseQuery.setFieldName("Col_Text"); // Specify the name of the column to query. 
    matchPhraseQuery.setText("hangzhou shanghai"); // Specify the keyword that you want to match. 
    searchQuery.setQuery(matchPhraseQuery);
    searchQuery.setOffset(0); // Set the offset parameter to 0. 
    searchQuery.setLimit(20); // Set limit to 20 to return up to 20 rows. 
    //searchQuery.setGetTotalCount(true); // Specify that the total number of matched rows is returned. 

    SearchRequest searchRequest = new SearchRequest("<TABLE_NAME>", "<SEARCH_INDEX_NAME>", searchQuery); 
    // You can configure the columnsToGet parameter to specify the columns to return or specify that all columns are returned. If you do not configure this parameter, only the primary key columns are returned. 
    //SearchRequest.ColumnsToGet columnsToGet = new SearchRequest.ColumnsToGet();
    //columnsToGet.setReturnAll(true); // Specify that all columns are returned. 
    //columnsToGet.setColumns(Arrays.asList("ColName1","ColName2")); // Specify the columns that you want to return. 
    //searchRequest.setColumnsToGet(columnsToGet);

    SearchResponse resp = client.search(searchRequest);
    //System.out.println("TotalCount: " + resp.getTotalCount()); // Print the total number of matched rows, not the number of returned rows. 
    System.out.println("Row: " + resp.getRows());
}

Billing

In VCU mode (formerly reserved mode), Search Index queries consume VCU compute resources. In CU mode (formerly pay-as-you-go mode), they consume read throughput. For more information, see Search Index metering and billing.

