Aggregation

更新时间:
复制 MD 格式

You can perform aggregation operations to obtain the minimum, maximum, sum, and average values, and the count and distinct count of rows. You can also perform aggregation operations to group results by field value, range, geographical location, or filter, and perform nested queries. You can perform multiple aggregation operations for complex queries.

Procedure

The following figure shows the complete aggregation procedure.

fig_agg_pro

The server queries the data that meets the query conditions and performs aggregation on the data based on the request. Therefore, a request that requires aggregation is more complex to process than a request that does not require aggregation.

Background information

The following table describes the aggregation methods.

Method

Description

Minimum value

The aggregation method that can be used to return the minimum value of a field. This method can be used in a similar manner as the SQL MIN function.

Maximum value

The aggregation method that can be used to return the maximum value of a field. This method can be used in a similar manner as the SQL MAX function.

Sum

The aggregation method that can be used to return the sum of all values for a numeric field. This method can be used in a similar manner as the SQL SUM function.

Average value

The aggregation method that can be used to return the average of all values for a numeric field. This method can be used in a similar manner as the SQL AVG function.

Count

The aggregation method that can be used to return the total number of values for a specified field or the total number of rows in a search index. This method can be used in a similar manner as the SQL COUNT function.

Distinct count

The aggregation method that can be used to return the number of distinct values for a field. This method can be used in a similar manner as the SQL COUNT(DISTINCT) function.

Percentile statistics

A percentile value indicates the relative position of a value in a dataset. For example, when you collect statistics about the response time of each request during routine O&M of your system, you must analyze the response time distribution by using percentiles such as p25, p50, p90, and p99.

TopRows

The aggregation method that can be used to obtain statistics of top rows that are sorted based on specific fields for each group.

Group by field value

The aggregation method that can be used to group query results based on field values. The values that are the same are grouped together. The identical value of each group and the number of identical values in each group are returned.

Note

The calculated number may be different from the actual number if the number of values in a group is very large.

Group by range

The aggregation method that can be used to group query results based on the value ranges of a field. Field values that are within a specific range are grouped together. The number of values in each range is returned.

Group by geographical location

The aggregation method that can be used to group query results based on distances from geographical locations to a central point. Query results in distances that are within a specific range are grouped together. The number of values in each range is returned.

Group by filter

The aggregation method that can be used to filter the query results and group them together to obtain the number of results that match each filter. Results are returned in the order in which the filters are specified.

Query by histogram

The aggregation method that can be used to group query results based on specific data intervals. Field values that are within the same range are grouped together. The value range of each group and the number of values in each group are returned.

Nesting

Prerequisites

Minimum value

The aggregation method that can be used to return the minimum value of a field. This method can be used in a similar manner as the SQL MIN function.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used to perform the aggregation operation. Only the LONG, DOUBLE, and DATE types are supported.

    missing

    The default value for the field that is used to perform the aggregation operation on a row when the field value is empty.

    • If you do not specify a value for missing, the row is ignored.

    • If you specify a value for missing, the value of this parameter is used as the field value of the row.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'aggs' => array(
                'aggs' => array(
                    array(
                        'name' => 'agg_min',
                        'type' => AggregationTypeConst::AGG_MIN,
                        'body' => array(
                            'field_name' => 'long',
                            'missing' => 0
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $agg_results = $response["aggs"]["agg_results"];
    
    print json_encode($agg_results, JSON_PRETTY_PRINT);

Maximum value

The aggregation method that can be used to return the maximum value of a field. This method can be used in a similar manner as the SQL MAX function.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used to perform the aggregation operation. Only the LONG, DOUBLE, and DATE types are supported.

    missing

    The default value for the field that is used to perform the aggregation operation on a row when the field value is empty.

    • If you do not specify a value for missing, the row is ignored.

    • If you specify a value for missing, the value of this parameter is used as the field value of the row.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'aggs' => array(
                'aggs' => array(
                    array(
                        'name' => 'agg_max',
                        'type' => AggregationTypeConst::AGG_MAX,
                        'body' => array(
                            'field_name' => 'long',
                            'missing' => 0
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $agg_results = $response["aggs"]["agg_results"];
    
    print json_encode($agg_results, JSON_PRETTY_PRINT);

Sum

The aggregation method that can be used to return the sum of all values for a numeric field. This method can be used in a similar manner as the SQL SUM function.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used to perform the aggregation operation. Only the LONG and DOUBLE types are supported.

    missing

    The default value for the field that is used to perform the aggregation operation on a row when the field value is empty.

    • If you do not specify a value for missing, the row is ignored.

    • If you specify a value for missing, the value of this parameter is used as the field value of the row.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'aggs' => array(
                'aggs' => array(
                    array(
                        'name' => 'agg_sum',
                        'type' => AggregationTypeConst::AGG_SUM,
                        'body' => array(
                            'field_name' => 'long',
                            'missing' => 0
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $agg_results = $response["aggs"]["agg_results"];
    
    print json_encode($agg_results, JSON_PRETTY_PRINT);

Average value

The aggregation method that can be used to return the average of all values for a numeric field. This method can be used in a similar manner as the SQL AVG function.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used to perform the aggregation operation. Only the LONG, DOUBLE, and DATE types are supported.

    missing

    The default value for the field that is used to perform the aggregation operation on a row when the field value is empty.

    • If you do not specify a value for missing, the row is ignored.

    • If you specify a value for missing, the value of this parameter is used as the field value of the row.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'aggs' => array(
                'aggs' => array(
                    array(
                        'name' => 'agg_avg',
                        'type' => AggregationTypeConst::AGG_AVG,
                        'body' => array(
                            'field_name' => 'long',
                            'missing' => 0
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $agg_results = $response["aggs"]["agg_results"];
    
    print json_encode($agg_results, JSON_PRETTY_PRINT);

Count

The aggregation method that can be used to return the total number of values for a specified field or the total number of rows in a search index. This method can be used in a similar manner as the SQL COUNT function.

Note

You can use the following methods to query the total number of rows in a search index or the total number of rows that meet the query conditions:

  • Use the count feature of aggregation and specify count(*) in the request.

  • Use the query feature to obtain the number of rows that meet the query conditions. Set setGetTotalCount to true in the query. Use MatchAllQuery to obtain the total number of rows in a search index.

You can use the name of a column as the value of the count expression to query the number of rows that contain the column in a search index. This method is suitable for scenarios that involve sparse columns.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used to perform the aggregation operation. Only the LONG, DOUBLE, BOOLEAN, KEYWORD, GEO_POINT, and DATE types are supported.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'aggs' => array(
                'aggs' => array(
                    array(
                        'name' => 'agg_count',
                        'type' => AggregationTypeConst::AGG_COUNT,
                        'body' => array(
                            'field_name' => 'long',
                            'missing' => 0
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $agg_results = $response["aggs"]["agg_results"];
    
    print json_encode($agg_results, JSON_PRETTY_PRINT);

Distinct count

The aggregation method that can be used to return the number of distinct values for a field. This method can be used in a similar manner as the SQL COUNT(DISTINCT) function.

Note

The number of distinct values is an approximate number.

  • If the total number of rows before the distinct count feature is used is less than 10,000, the calculated result is close to the exact value.

  • If the total number of rows before the distinct count feature is used is greater than or equal to 100 million, the error rate is approximately 2%.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used to perform the aggregation operation. Only the LONG, DOUBLE, BOOLEAN, KEYWORD, GEO_POINT, and DATE types are supported.

    missing

    The default value for the field that is used to perform the aggregation operation on a row when the field value is empty.

    • If you do not specify a value for missing, the row is ignored.

    • If you specify a value for missing, the value of this parameter is used as the field value of the row.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'aggs' => array(
                'aggs' => array(
                    array(
                        'name' => 'agg_distinct_count',
                        'type' => AggregationTypeConst::AGG_DISTINCT_COUNT,
                        'body' => array(
                            'field_name' => 'boolean',
                            'missing' => false
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $agg_results = $response["aggs"]["agg_results"];
    
    print json_encode($agg_results, JSON_PRETTY_PRINT);

Percentile statistics

A percentile value indicates the relative position of a value in a dataset. For example, when you collect statistics about the response time of each request during routine O&M of your system, you must analyze the response time distribution by using percentiles such as p25, p50, p90, and p99.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used to perform the aggregation operation. Only the LONG, DOUBLE, and DATE types are supported.

    percentiles

    The percentiles such as p50, p90, and p99. You can specify one or more percentiles.

    missing

    The default value for the field that is used to perform the aggregation operation on a row when the field value is empty.

    • If you do not specify a value for missing, the row is ignored.

    • If you specify a value for missing, the value of this parameter is used as the field value of the row.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'aggs' => array(
                'aggs' => array(
                    array(
                        'name' => 'agg_percentiles',
                        'type' => AggregationTypeConst::AGG_PERCENTILES,
                        'body' => array(
                            'field_name' => 'long',
                            'percentiles' => array(60, 80, 100),
                            'missing' => 0
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $agg_results = $response["aggs"]["agg_results"];
    
    print json_encode($agg_results, JSON_PRETTY_PRINT);

TopRows

The aggregation method that can be used to obtain statistics of top rows that are sorted based on specific fields for each group.

After you group query results by using GroupBy, you can query the rows in each group. This method can be used in a similar manner as ANY_VALUE(field) in MySQL.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    limit

    The number of top rows that you want to return for each group.

    sort

    The sorting rules for the rows in each group. The specified number of top rows that are sorted based on the sorting rules are returned for each group.

    columns_to_get

    The fields that you want to return. Only fields in search indexes are supported. ARRAY, DATE, GEOPOINT, and NESTED fields are not supported.

    The value of this parameter is the same as the value of columns_to_get in the request. You need to only specify columns_to_get in the request.

    Note

    When you query the rows that are obtained from the results of an aggregation operation in each group, the returned results contain only the primary key information if the search index contains the Nested, Geopoint, or Array field. To obtain the required field, you must query the data table.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
              'aggs' => array(
            'aggs' => array(
                    array(
                        'name' => 'avg_top_rows',
                        'type' => AggregationTypeConst::AGG_TOP_ROWS,
                        'body' => array(
                            'limit' => 2,
                            'sort' => array(
                                'sorters' => array(
                                    array(
                                        'field_sort' => array(
                                            'field_name' => 'long',
                                            'order' => SortOrderConst::SORT_ORDER_DESC
                                        )
                                    )
                                )
                            )
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $agg_results = $response["aggs"]["agg_results"];
    
    print json_encode($agg_results, JSON_PRETTY_PRINT);

Group by field value

The aggregation method that can be used to group query results based on field values. The values that are the same are grouped together. The identical value of each group and the number of identical values in each group are returned.

Note

The calculated number may be different from the actual number if the number of values in a group is very large.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used to perform the aggregation operation. Only the LONG, DOUBLE, BOOLEAN, KEYWORD, and DATE types are supported.

    sort

    The sorting rules for groups. By default, groups are sorted based on the number of items in the groups in descending order. If you configure multiple sorting rules, the groups are sorted based on the order in which the rules are configured. Supported parameters:

    • Sort by value in alphabetical order.

    • Sort by value in reverse alphabetical order.

    • Sort by row count in ascending order.

    • Sort by row count in descending order.

    • Sort by the values that are obtained from sub-aggregation results in ascending order.

    • Sort by the values that are obtained from sub-aggregation results in descending order.

    size

    The number of groups that you want to return. Default value: 10. Maximum value: 2000. If the number of groups exceeds 2,000, only the first 2,000 groups are returned.

    sub_aggs and sub_group_bys

    The sub-aggregation operation. You can perform the sub-aggregation operation based on the grouping results.

    • Scenario

      Query the number of products in each category, and the maximum and minimum product prices in each category.

    • Method

      Group query results by product category to obtain the number of products in each category. Then, perform two sub-aggregation operations to obtain the maximum and minimum product prices in each category.

    • Sample result

      • Fruits: 5. The maximum price is USD 2. The minimum price is USD 0.5.

      • Toiletries: 10. The maximum price is USD 13. The minimum price is USD 0.1.

      • Electronic devices: 3. The maximum price is USD 1,160. The minimum price is USD 310.

      • Other products: 15. The maximum price is USD 130. The minimum price is USD 11.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'group_bys' => array(
                'group_bys' => array(
                    array(
                        'name' => 'group_by_GROUP_BY_FIELD',
                        'type' => GroupByTypeConst::GROUP_BY_FIELD,
                        'body' => array(
                            'field_name' => 'boolean',
                            'size' => 3,
                            'min_doc_count' => 0,
                            'sort' => array(
                                'sorters' => array(
                                    array(
                                        'group_key_sort' => array(
                                            'order' => SortOrderConst::SORT_ORDER_DESC
                                        ),
                                    ),
                                )
                            ),
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $group_by_results = $response["group_bys"]["group_by_results"];
    
    print json_encode($group_by_results, JSON_PRETTY_PRINT);

Group by range

The aggregation method that can be used to group query results based on the value ranges of a field. Field values that are within a specific range are grouped together. The number of values in each range is returned.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used to perform the aggregation operation. Only the LONG and DOUBLE types are supported.

    ranges[from, to)

    The value ranges that are used for grouping.

    The value range can start from Double.MIN_VALUE and end with Double.MAX_VALUE.

    sub_aggs and sub_group_bys

    The sub-aggregation operation. You can perform the sub-aggregation operation based on the grouping results.

    For example, after you group query results by sales volume and by province, you can obtain the province that has the largest proportion of sales volume in a specified range. You must specify a value for GroupByField in GroupByRange to perform this query.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'group_bys' => array(
                'group_bys' => array(
                    array(
                        'name' => 'group_by_GROUP_BY_RANGE',
                        'type' => GroupByTypeConst::GROUP_BY_RANGE,
                        'body' => array(
                            'field_name' => 'long',
                            'ranges' => array(
                                array(
                                    'from' => 1,
                                    'to' => 3
                                ),
                                array(
                                    'from' => 3,
                                    'to' => 6
                                ),
                                array(
                                    'from' => 6,
                                    'to' => 10
                                )
                            )
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $group_by_results = $response["group_bys"]["group_by_results"];
    
    print json_encode($group_by_results, JSON_PRETTY_PRINT);

Group by geographical location

The aggregation method that can be used to group query results based on distances from geographical locations to a central point. Query results in distances that are within a specific range are grouped together. The number of values in each range is returned.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used for the aggregation operation. Only the GEOPOINT type is supported.

    origin(lat, lon)

    The longitude and latitude of the central point.

    lat specifies the latitude of the central point. lon specifies the longitude of the central point.

    ranges[from, to)

    The distance ranges that are used for grouping. Unit: meters. The values of from and to are of the DOUBLE type.

    The value range can start from Double.MIN_VALUE and end with Double.MAX_VALUE.

    sub_aggs and sub_group_bys

    The sub-aggregation operation. You can perform the sub-aggregation operation based on the grouping results.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'group_bys' => array(
                'group_bys' => array(
                    array(
                        'name' => 'group_by_GROUP_BY_GEO_DISTANCE',
                        'type' => GroupByTypeConst::GROUP_BY_GEO_DISTANCE,
                        'body' => array(
                            'field_name' => 'geo',
                            'origin' => array(
                                'lat' => 5,
                                'lon' => 6
                            ),
                            'ranges' => array(
                                array(
                                    'from' => 0.,
                                    'to' => 1000.
                                ),
                                array(
                                    'from' => 10000.,
                                    'to' => 100000.
                                ),
                                array(
                                    'from' => 100000.,
                                ),
                            )
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $group_by_results = $response["group_bys"]["group_by_results"];
    
    print json_encode($group_by_results, JSON_PRETTY_PRINT);

Group by filter

The aggregation method that can be used to filter the query results and group them together to obtain the number of results that match each filter. Results are returned in the order in which the filters are specified.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    filters

    The filters that can be used for the query. Results are returned in the order in which the filters are specified.

    sub_aggs and sub_group_bys

    The sub-aggregation operation. You can perform the sub-aggregation operation based on the grouping results.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'group_bys' => array(
                'group_bys' => array(
                    array(
                        'name' => 'group_by_GROUP_BY_FILTER',
                        'type' => GroupByTypeConst::GROUP_BY_FILTER,
                        'body' => array(
                            'filters' => array(
                                array(
                                    'query_type' => QueryTypeConst::TERM_QUERY,
                                    'query' => array(
                                        'field_name' => 'boolean',
                                        'term' => false
                                    )
                                ),
                                array(
                                    'query_type' => QueryTypeConst::TERM_QUERY,
                                    'query' => array(
                                        'field_name' => 'boolean',
                                        'term' => true
                                    )
                                )
                            )
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $group_by_results = $response["group_bys"]["group_by_results"];
    
    print json_encode($group_by_results, JSON_PRETTY_PRINT);

Query by histogram

The aggregation method that can be used to group query results based on specific data intervals. Field values that are within the same range are grouped together. The value range of each group and the number of values in each group are returned.

  • Parameters

    Parameter

    Description

    name

    The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.

    field_name

    The name of the field that is used to perform the aggregation operation. Only the LONG and DOUBLE types are supported.

    interval

    The data interval that is used to obtain aggregation results.

    field_range[min,max]

    The range that is used together with the interval parameter to limit the number of groups. The value that is calculated by using the formula cannot exceed 2,000.

    min_doc_count

    The minimum number of rows. If the number of rows in a group is less than the minimum number of rows, the aggregation results for the group are not returned.

    missing

    The default value for the field that is used to perform the aggregation operation on a row when the field value is empty.

    • If you do not specify a value for missing, the row is ignored.

    • If you specify a value for missing, the value of this parameter is used as the field value of the row.

  • Example

    $request = array(
        'table_name' => 'myTableName',
        'index_name' => 'myIndexName',
        'search_query' => array(
            'offset' => 0,
            'limit' => 0,
            'get_total_count' => true,
            'query' => array(
                'query_type' => QueryTypeConst::MATCH_ALL_QUERY
            ),
            'group_bys' => array(
                'group_bys' => array(
                    array(
                        'name' => 'group_by_GROUP_BY_HISTOGRAM',
                        'type' => GroupByTypeConst::GROUP_BY_HISTOGRAM,
                        'body' => array(
                            'field_name' => 'long',
                            'interval' => 3,
                            'missing' => 0,
                            'min_doc_count' => 0,
                            'field_range' => array(
                                'min' => 2,
                                'max' => 10,
                            ),
                            'sort' => array(
                                'sorters' => array(
                                    array(
                                        'row_count_sort' => array(
                                            'order' => SortOrderConst::SORT_ORDER_ASC
                                        )
                                    )
                                )
                            )
                        )
                    ),
                ),
            )
        ),
        'columns_to_get' => array(
            'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
        ),
    );
    
    $response = $this->otsClient->search($request);
    $group_by_results = $response["group_bys"]["group_by_results"];
    
    print json_encode($group_by_results, JSON_PRETTY_PRINT);

Nesting

You can use nesting to perform sub-aggregation operations in a group. For example, you can perform nesting aggregation operations up to two levels.

  • GroupBy + SubGroupBy: Items are grouped by province and by city to obtain data for each city in each province.

  • GroupBy + SubAggregation: Items are grouped by province and the maximum value of a metric is calculated to obtain the maximum value of a metric for each province.

Note

To balance performance and complexity, only a specific number of nesting layers are allowed. For more information, see Search index limits.

$request = array(
    'table_name' => 'myTableName',
    'index_name' => 'myIndexName',
    'search_query' => array(
        'offset' => 0,
        'limit' => 0,
        'get_total_count' => true,
        'query' => array(
            'query_type' => QueryTypeConst::MATCH_ALL_QUERY
        ),
        'group_bys' => array(
            'group_bys' => array(
                array(
                    'name' => 'group_by_GROUP_BY_FIELD',
                    'type' => GroupByTypeConst::GROUP_BY_FIELD,
                    'body' => array(
                        'field_name' => 'boolean',
                        'size' => 3,
                        'min_doc_count' => 0,
                        'sort' => array(
                            'sorters' => array(
                                array(
                                    'group_key_sort' => array(
                                        'order' => SortOrderConst::SORT_ORDER_DESC
                                    ),
                                ),
                            )
                        ),
                        'sub_aggs' => array(
                            'aggs' => array(
                                array(
                                    'name' => 'groupSum',
                                    'type' => AggregationTypeConst::AGG_SUM,
                                    'body' => array(
                                        'field_name' => 'long',
                                        'missing' => 0
                                    )
                                ),
                            ),
                        ),
                    )
                ),
            ),
        )
    ),
    'columns_to_get' => array(
        'return_type' => ColumnReturnTypeConst::RETURN_ALL_FROM_INDEX,
    ),
);

$response = $this->otsClient->search($request);
$group_by_results = $response["group_bys"]["group_by_results"];

print json_encode($group_by_results, JSON_PRETTY_PRINT);