Use aggregate clauses to compute group statistics, such as counts, sums, maximums, and minimums, over search results without viewing individual documents.
Syntax
Aggregate clause syntax:
group_key:field, range:number1~number2, agg_fun:func1#func2, agg_filter:filter_clause, max_group:number
Parameters:
| Parameter | Type | Required | Valid value | Default value | Description |
| --- | --- | --- | --- | --- | --- |
| group_key | field: an attribute field | Yes | INT, LITERAL, INT_ARRAY, or LITERAL_ARRAY fields. For INT_ARRAY or LITERAL_ARRAY fields, repeated items are counted individually. | | The attribute field to group by for statistics collection. |
| agg_fun | | Yes | The built-in functions count(), sum(id), max(id), min(id), and distinct_count(id). | | Supported functions: count(), sum(id), max(id), min(id), and distinct_count(id). Separate multiple functions with number signs (#). The sum(), max(), and min() functions support arithmetic expressions across multiple fields. |
| range | | No | Values between number1 and number2, and values greater than number2. STRING fields cannot be aggregated. | | Collects statistics by value ranges for data distribution analysis. Only one range parameter is allowed per aggregate clause. |
| agg_filter | | No | | | Filters documents by the specified conditions before aggregation. |
| agg_sampler_threshold | INT | No | | | The sampling threshold. Documents ranked above this value are counted sequentially. Documents ranked below it are sampled at intervals defined by agg_sampler_step. |
| agg_sampler_step | INT | No | | | The sampling interval for documents ranked below agg_sampler_threshold. For sum() and count(), sampled statistics are multiplied by the step size and added to the sequential statistics to produce the final result. |
| max_group | INT | No | | 1000 | Maximum number of groups returned. |
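The arithmetic behind agg_sampler_threshold and agg_sampler_step can be made concrete with a small simulation. The Python sketch below is not the service's implementation: the function names sampled_sum and sampled_count are hypothetical, and it assumes that the first document of each interval is the one sampled, which the parameter descriptions do not specify.

```python
from typing import Sequence


def sampled_sum(values: Sequence[float], threshold: int, step: int) -> float:
    """Estimate sum() over ranked values with threshold/step sampling.

    `values` holds one number per matching document, in ranking order.
    Assumption: the first document of every `step`-sized interval is sampled.
    """
    # Documents ranked within the threshold are counted one by one.
    head = sum(values[:threshold])
    # Past the threshold, only every step-th document is read ...
    sampled = sum(values[threshold::step])
    # ... and the sampled statistic is multiplied by the step size.
    return head + sampled * step


def sampled_count(total_docs: int, threshold: int, step: int) -> int:
    """Estimate count() for `total_docs` matching documents."""
    head = min(total_docs, threshold)
    tail = max(total_docs - threshold, 0)
    # Number of documents actually read from the tail.
    sampled = len(range(0, tail, step))
    return head + sampled * step


if __name__ == "__main__":
    prices = [1] * 10 + [2] * 10
    print(sampled_sum(prices, threshold=10, step=5))   # 30
    print(sampled_count(23, threshold=10, step=5))     # 25, an approximation of 23
```

The second call shows why sampled results are estimates: when the tail is not a multiple of the step, the scaled count can overshoot the true value.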
Usage notes
- An aggregate clause is optional.
- All referenced fields must be configured as attribute fields in the application schema.
- Aggregate results are returned in the facet node. The agg_fun functions (such as sum() and count()) produce the statistics.
- To collect statistics for different fields, specify multiple group_key parameters and separate them with semicolons (;). Example:
  group_key:field1,agg_fun:func1;group_key:field2,agg_fun:func2
- To display statistics in the response, set the config clause format to full JSON.
- distinct_count is supported only in exclusive clusters. Add enable_accurate_statistics to the kvpairs clause and set it to true. Only facet-node statistics are returned when this feature is enabled.
- count(), max(), min(), and sum() in exclusive clusters also require enable_accurate_statistics set to true in the kvpairs clause.
- Accurate statistics are guaranteed for up to 100,000 matching documents. Beyond this limit, results may be approximate. In exclusive clusters, set enable_accurate_statistics to true in the kvpairs clause for improved accuracy.
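The separators described above (commas between parameters, number signs between functions, semicolons between groups) are easy to mix up when a clause is assembled by hand. The following Python sketch builds the clause text from structured input. The helper names build_aggregate_clause and build_aggregate are hypothetical and are not part of any SDK; the sketch only concatenates strings and does not validate field names or function syntax.

```python
from typing import Optional, Sequence, Tuple


def build_aggregate_clause(
    group_key: str,
    agg_fun: Sequence[str],
    range_: Optional[Tuple[int, int]] = None,
    agg_filter: Optional[str] = None,
    max_group: Optional[int] = None,
) -> str:
    """Return one group definition, e.g. 'group_key:f,agg_fun:count()'."""
    if not agg_fun:
        raise ValueError("agg_fun is required")
    # Multiple functions are separated by number signs (#).
    parts = [f"group_key:{group_key}", "agg_fun:" + "#".join(agg_fun)]
    if range_ is not None:
        low, high = range_
        parts.append(f"range:{low}~{high}")
    if agg_filter is not None:
        parts.append(f"agg_filter:{agg_filter}")
    if max_group is not None:
        parts.append(f"max_group:{max_group}")
    # Parameters within one group are separated by commas.
    return ",".join(parts)


def build_aggregate(clauses: Sequence[str]) -> str:
    """Join several group definitions with semicolons."""
    return "aggregate=" + ";".join(clauses)


if __name__ == "__main__":
    print(build_aggregate([
        build_aggregate_clause("group_id", ["sum(price)", "max(price)"]),
        build_aggregate_clause("company_id", ["count()"]),
    ]))
```

Run as a script, this prints aggregate=group_key:group_id,agg_fun:sum(price)#max(price);group_key:company_id,agg_fun:count().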
Examples
- Query documents containing "Zhejiang University" and group by group_id (sum and max of price) and company_id (count).
  query=default:'Zhejiang University'&&aggregate=group_key:group_id,agg_fun:sum(price)#max(price);group_key:company_id,agg_fun:count()
  Sample return result:
  {
    status: "OK",
    result: {
      searchtime: 0.015634,
      total: 5,
      num: 1,
      viewtotal: 5,
      items: [ // The return result.
        { ... }
      ],
      facet: [
        {
          key: "group_id",
          items: [
            { value: 43, sum: 81, max: 20, },
            { value: 63, sum: 91, max: 50, },
          ],
        },
        {
          key: "company_id",
          items: [
            { value: 13, count: 4, },
            { value: 10, count: 1, },
          ],
        },
      ],
    },
    errors: [ ],
    tracer: "",
  }
- Query documents containing "Zhejiang University", group by group_id, and calculate sum(price) with sampling (threshold: 10,000, step: 5).
  query=default:'Zhejiang University'&&aggregate=group_key:group_id,agg_fun:sum(price),agg_sampler_threshold:10000,agg_sampler_step:5
- Query documents containing "Zhejiang University", group by group_id, and count documents with group_id values in the range 10–50.
  query=default:'Zhejiang University'&&aggregate=group_key:group_id,agg_fun:count(),range:10~50
- Query documents containing "Zhejiang University", group by group_id, and calculate max(hits + replies) for documents with create_timestamp greater than 1423456781.
  query=default:'Zhejiang University'&&aggregate=group_key:group_id,agg_fun:max(hits+replies),agg_filter:create_timestamp>1423456781
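Once a response has been parsed, the facet node shown in the first example can be flattened into a lookup table. The Python sketch below assumes the response has already been decoded into a dictionary with the same shape as that sample; the function name facet_to_table is hypothetical, and the sample data simply repeats the facet values from the example.

```python
from typing import Any, Dict


def facet_to_table(response: Dict[str, Any]) -> Dict[str, Dict[Any, Dict[str, Any]]]:
    """Map each group key to {group value: {statistic name: statistic}}."""
    table: Dict[str, Dict[Any, Dict[str, Any]]] = {}
    for group in response.get("result", {}).get("facet", []):
        rows: Dict[Any, Dict[str, Any]] = {}
        for item in group.get("items", []):
            # Everything except "value" is a statistic produced by agg_fun.
            rows[item["value"]] = {k: v for k, v in item.items() if k != "value"}
        table[group["key"]] = rows
    return table


# Facet data copied from the sample return result above.
sample_response = {
    "status": "OK",
    "result": {
        "facet": [
            {"key": "group_id", "items": [
                {"value": 43, "sum": 81, "max": 20},
                {"value": 63, "sum": 91, "max": 50},
            ]},
            {"key": "company_id", "items": [
                {"value": 13, "count": 4},
                {"value": 10, "count": 1},
            ]},
        ],
    },
}

if __name__ == "__main__":
    stats = facet_to_table(sample_response)
    print(stats["group_id"][43]["sum"])       # 81
    print(stats["company_id"][13]["count"])   # 4
```

A response without a facet node yields an empty table rather than raising an error, which is convenient because the aggregate clause is optional.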