Summaries and highlights

When you query data, you can set highlight parameters so that the results include text fragments that contain the search query, with the query highlighted. The summary and highlight feature is supported only for fields of the Text type.

Prerequisites

Precautions

  • The summary and highlight feature is supported in Tablestore Python SDK V6.0.0 and later. To use this feature, make sure that your Python SDK version is V6.0.0 or later. For more information about the release history of the Python SDK, see Python SDK release history.

  • When you use the summary and highlight feature with a MatchQuery or MatchPhraseQuery, a single search query may be highlighted with multiple pairs of pre_tag and post_tag, for example one pair around each matched term.

  • The summary and highlight feature is not supported for a MatchPhraseQuery if the tokenization type of the Text field is set to maximum semantic tokenization.

  • A search query may not be highlighted if the query terms in the source text are split across fragment boundaries when the text is divided into fragments.
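The note about multiple pre_tags and post_tags can be illustrated with a small toy model. The sketch below is not Tablestore's tokenizer or highlighter; it assumes whitespace tokenization, and the `highlight_tokens` helper is a hypothetical name introduced only for this illustration. It shows why one query string can end up wrapped in several tag pairs.

```python
import re


def highlight_tokens(text, query, pre_tag="<em>", post_tag="</em>"):
    """Toy model: wrap every whitespace-separated query token found in text.

    Assumption: whitespace tokenization. The real tokenization depends on
    the tokenizer configured for the Text field.
    """
    tokens = [re.escape(token) for token in query.split()]
    if not tokens:
        return text
    pattern = re.compile("|".join(tokens), re.IGNORECASE)
    # Each matched token gets its own pre_tag/post_tag pair.
    return pattern.sub(lambda m: pre_tag + m.group(0) + post_tag, text)


result = highlight_tokens("flights from hangzhou to shanghai", "hangzhou shanghai")
print(result)  # flights from <em>hangzhou</em> to <em>shanghai</em>
```

The single query hangzhou shanghai produces two highlighted spans, so code that consumes the fragments should not assume exactly one tag pair per fragment.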

Parameters

Parameter

Description

highlight_encoder

The encoding method for the original content of highlighted fragments. Valid values:

  • PLAIN_MODE (default): The original content is displayed without encoding.

  • HTML_MODE: The original content of the highlighted fragments is HTML-escaped. For example, < is escaped to &lt;, > to &gt;, " to &quot;, ' to &#x27;, and / to &#x2F;. Use this format for web display.

highlight

The highlight parameters for a field. You can set this parameter only for fields that are part of a keyword query in SearchQuery.

highlight_parameters

number_of_fragments

The maximum number of highlighted fragments to return. Set this parameter to 1.

fragment_size

The length of each fragment. The default value is 100.

Important: The actual length of the returned fragments may not be exactly equal to this value.

pre_tag

The prefix tag for highlighting a search query, such as <em> or <b>. The default value is <em>. Customize the prefix tag as needed. The supported character set for `pre_tag` includes < > " ' /, a-z, A-Z, and 0-9.

post_tag

The postfix tag for highlighting a search query, such as </em> or </b>. The default value is </em>. Customize the postfix tag as needed. The supported character set for `post_tag` includes < > " ' /, a-z, A-Z, and 0-9.

fragments_order

The sorting rule for fragments when multiple fragments are returned for a highlighted field.

  • TEXT_SEQUENCE (default): The fragments are sorted by their order of appearance in the text.

  • SCORE: The fragments are sorted by the scores of the query hits.
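Two of the rules in the parameter list can be sketched as client-side helpers: the character set allowed in pre_tag and post_tag, and the escaping that HTML_MODE is described as applying. Both functions below are hypothetical helpers, not part of the Tablestore SDK. The escape table covers only the five characters listed above, and rejecting an empty tag is an assumption of this sketch.

```python
import re

# Characters the parameter list names as allowed in pre_tag and post_tag.
# Assumption: an empty tag is treated as invalid here.
_TAG_PATTERN = re.compile(r"""[<>"'/a-zA-Z0-9]+""")

# The escapes listed for HTML_MODE. Other characters are left unchanged
# because the description does not mention them.
_HTML_MODE_ESCAPES = {
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
    "/": "&#x2F;",
}


def is_valid_tag(tag):
    """Return True if tag uses only the documented character set."""
    return _TAG_PATTERN.fullmatch(tag) is not None


def escape_like_html_mode(text):
    """Apply the five documented HTML_MODE escapes to text."""
    return "".join(_HTML_MODE_ESCAPES.get(ch, ch) for ch in text)


print(is_valid_tag("<em>"))         # True
print(is_valid_tag("<b class=x>"))  # False: space and = are not in the set
print(escape_like_html_mode("<a href=\"/x\">it's</a>"))
```

Checking tags before sending a request gives an early, local error instead of a failed search call.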

Example

The following example uses MatchQuery to find rows in which the Col_Text column matches hangzhou shanghai and highlights the search query in the results. The Col_Text column is of the Text type.

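# The query and highlight classes used below come from the tablestore package.
from tablestore import *


def match_query_with_highlight(client):
    # Match rows whose Col_Text column matches 'hangzhou shanghai'.
    query = MatchQuery('Col_Text', 'hangzhou shanghai')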
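    # Highlight Col_Text. number_of_fragments and fragment_size are passed as None.
    # <em> and </em> are the documented default tags.
    highlight_parameter = HighlightParameter('Col_Text', None, None, pre_tag='<em>', post_tag='</em>')
    # PLAIN_MODE returns the fragment content without HTML escaping.
    highlight_clause = Highlight([highlight_parameter], HighlightEncoder.PLAIN_MODE)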
    search_response = client.search(
        '<TABLE_NAME>', '<SEARCH_INDEX_NAME>',
        SearchQuery(query, limit=100, get_total_count=True,highlight=highlight_clause),
        ColumnsToGet(return_type=ColumnReturnType.ALL)
    )
    print('----- Print Highlight Result:')
    search_hits = search_response.search_hits
    print('search hit count:%d' % len(search_hits))

    for search_hit in search_hits:
        print('\t score is %.6f' % search_hit.score)
        for highlight_field in search_hit.highlight_result.highlight_fields:
            print('\t\t highlight:%s:%s' % (highlight_field.field_name, highlight_field.field_fragments))

    print('********** End HighlightQuery **********')
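After the fragments are returned, an application may want the highlighted terms themselves. The following is a minimal sketch that assumes each fragment is a plain string containing the configured tags. The sample fragment is made-up data, and `extract_highlighted_terms` is a hypothetical helper, not an SDK function.

```python
import re


def extract_highlighted_terms(fragment, pre_tag="<em>", post_tag="</em>"):
    """Return the text found between each pre_tag/post_tag pair in fragment."""
    pattern = re.compile(
        re.escape(pre_tag) + "(.*?)" + re.escape(post_tag), re.DOTALL
    )
    return pattern.findall(fragment)


# Made-up sample standing in for the fragments of one highlighted field.
fragments = ["flights from <em>hangzhou</em> to <em>shanghai</em>"]

terms = [term for fragment in fragments for term in extract_highlighted_terms(fragment)]
print(terms)  # ['hangzhou', 'shanghai']
```

Pass the same pre_tag and post_tag values that were used in the highlight parameters. If the fragments were requested in HTML_MODE, the surrounding content is escaped, so match against the tags exactly as they appear in the returned fragments.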

References