Use the aggs clause to compute statistics over your search results without inspecting individual documents. Common use cases include:
How many products fall into each price range?
What is the total sales amount per vendor?
What is the highest-rated item in each category?
Syntax
{
  "aggs": [
    {
      "group_key": "<field>",
      "agg_fun": ["<func1>", "<func2>"],
      "agg_filter": "<filter_expression>",
      "agg_range": [<number1>, <number2>],
      "max_group": <number>,
      "order_by": "count"
    }
  ]
}
Parameters
| Parameter | Required | Type | Description |
|---|---|---|---|
| group_key | Yes | STRING or INTEGER attribute field | The field to group results by. Must be an attribute field defined in schema.json. |
| agg_fun | Yes | Array of function strings | One or more aggregation functions to apply. See Aggregation functions. |
| agg_filter | No | Logical expression | Filters documents before aggregation. Uses the same syntax as the filter clause. |
| agg_range | No | [number1, number2] | Restricts aggregation to a numeric range. One range per aggs clause. Not supported for STRING fields. |
| max_group | No | Integer; default: 1000 | Maximum number of groups to return. Keep this at or below 10000 to avoid out of memory (OOM) errors on the Query Result Searcher (QRS) worker. |
| order_by | No | "count" | Sorts groups by document count. If omitted, groups are sorted in lexicographic order of the group_key values. |
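The parameter rules in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: `build_agg` and `MAX_SAFE_GROUPS` are hypothetical names, and treating a `max_group` above 10000 as a hard error is a client-side policy chosen here for illustration (the engine itself only warns of OOM risk).

```python
import json
from typing import Any, Dict, List, Optional

# Recommended ceiling for max_group taken from the parameter table.
MAX_SAFE_GROUPS = 10000


def build_agg(
    group_key: str,
    agg_fun: List[str],
    agg_filter: Optional[str] = None,
    agg_range: Optional[List[float]] = None,
    max_group: Optional[int] = None,
    order_by_count: bool = False,
) -> Dict[str, Any]:
    """Build one entry of the aggs array (hypothetical helper)."""
    if not group_key:
        raise ValueError("group_key is required")
    if not agg_fun:
        raise ValueError("agg_fun needs at least one function")
    if agg_range is not None and len(agg_range) != 2:
        raise ValueError("agg_range must be [number1, number2]")
    if max_group is not None and max_group > MAX_SAFE_GROUPS:
        # Assumption: this sketch rejects values the table warns against.
        raise ValueError("max_group above 10000 risks OOM on the QRS worker")

    entry: Dict[str, Any] = {"group_key": group_key, "agg_fun": list(agg_fun)}
    # Optional parameters are only emitted when the caller sets them.
    if agg_filter is not None:
        entry["agg_filter"] = agg_filter
    if agg_range is not None:
        entry["agg_range"] = list(agg_range)
    if max_group is not None:
        entry["max_group"] = max_group
    if order_by_count:
        entry["order_by"] = "count"
    return entry


query = {
    "aggs": [
        build_agg(
            "group_id",
            ["sum(price)"],
            agg_filter="price > 100",
            max_group=500,
            order_by_count=True,
        )
    ]
}
print(json.dumps(query, indent=2))
```

Omitting optional keys rather than sending null values keeps the generated clause identical in shape to the hand-written examples in this topic.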
Aggregation functions
All functions are specified in the agg_fun array. You can combine multiple functions in a single aggs clause.
| Function | Description |
|---|---|
| count() | Number of documents in each group |
| sum(<field>) | Sum of the field values in each group |
| max(<field>) | Maximum field value in each group |
| min(<field>) | Minimum field value in each group |
| distinct_count(<field>) | Enables the semi-exact statistics feature. Uses the HyperLogLog (HLL) algorithm to compute an approximate count of distinct field values. Accuracy exceeds 99% in most cases. |
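To make the meaning of each function concrete, the sketch below emulates the grouping locally over a small in-memory list. It only illustrates the semantics and is not how the engine executes an aggregation: the `aggregate` function and the sample documents are made up for this example, and `distinct_count` is computed exactly with a set here, whereas the engine returns an HLL-based approximation.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping


def aggregate(
    docs: Iterable[Mapping[str, Any]], group_key: str, field: str
) -> Dict[Any, Dict[str, Any]]:
    """Group docs by group_key and compute every listed function on field."""
    groups: Dict[Any, List[Any]] = defaultdict(list)
    for doc in docs:
        groups[doc[group_key]].append(doc[field])

    return {
        key: {
            "count": len(values),
            "sum": sum(values),
            "max": max(values),
            "min": min(values),
            # Exact count for illustration; the engine approximates with HLL.
            "distinct_count": len(set(values)),
        }
        for key, values in sorted(groups.items())
    }


# Illustrative documents, not real data.
docs = [
    {"group_id": 43, "price": 30},
    {"group_id": 43, "price": 51},
    {"group_id": 63, "price": 91},
]

stats = aggregate(docs, "group_id", "price")
print(stats[43]["sum"], stats[63]["sum"])  # prints: 81 91
```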
Examples
Simple aggregation
Sum the price field, grouped by group_id:
{
  "aggs": [
    {
      "group_key": "group_id",
      "agg_fun": ["sum(price)"]
    }
  ]
}
Aggregation results are returned in the facet node of the response. To access the facet node, set format to fulljson in the config clause.
{
  "result": {
    "facet": [
      {
        "key": "group_id",
        "items": [
          { "value": 43, "sum": 81 },
          { "value": 63, "sum": 91 }
        ]
      }
    ]
  }
}
Each item in items corresponds to one group: value is the group_key value, and sum is the result of the sum(price) function.
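On the client side, the facet node can be flattened into a lookup table keyed by group value. The sketch below parses the sample response shown above; `facet_to_dict` is a hypothetical helper, and the sketch assumes every item carries the requested statistic under the same name as in the sample (`sum`).

```python
import json
from typing import Any, Dict

# The sample response from this topic, reproduced as a JSON string.
RESPONSE_TEXT = """
{
  "result": {
    "facet": [
      {
        "key": "group_id",
        "items": [
          { "value": 43, "sum": 81 },
          { "value": 63, "sum": 91 }
        ]
      }
    ]
  }
}
"""


def facet_to_dict(response: Dict[str, Any], group_key: str, stat: str) -> Dict[Any, Any]:
    """Map each group value to one statistic for the facet matching group_key."""
    for facet in response.get("result", {}).get("facet", []):
        if facet.get("key") == group_key:
            return {item["value"]: item[stat] for item in facet.get("items", [])}
    # No facet for this group_key, for example when format is not fulljson.
    return {}


sums = facet_to_dict(json.loads(RESPONSE_TEXT), "group_id", "sum")
print(sums)  # prints: {43: 81, 63: 91}
```

Returning an empty dictionary when the facet is missing lets calling code treat a response without aggregation results the same way as one with zero groups.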
Multiple aggregation functions
Apply sum(), max(), and min() in one clause:
{
  "aggs": [
    {
      "group_key": "company_id",
      "agg_fun": ["sum(id)", "max(id)", "min(id)"]
    }
  ]
}
Aggregation across multiple fields
Use multiple aggs objects to aggregate different fields independently in one request:
{
  "aggs": [
    {
      "group_key": "group_id",
      "agg_fun": ["sum(price)"]
    },
    {
      "group_key": "company_id",
      "agg_fun": ["count()"]
    }
  ]
}
Filtered aggregation
Aggregate only documents where price > 100:
{
  "aggs": [
    {
      "group_key": "group_id",
      "agg_fun": ["sum(price)"],
      "agg_filter": "price > 100"
    }
  ]
}
Distinct count (approximate)
Count the number of distinct brand values per company_id:
{
  "aggs": [
    {
      "group_key": "company_id",
      "agg_fun": ["distinct_count(brand)"]
    }
  ]
}
Usage notes
Fields used in group_key and aggregation functions must be attribute fields declared in schema.json.
Aggregation results are returned to the facet node on the Searcher worker. Set format to fulljson in the config clause to include this node in the response.
Accurate statistics are guaranteed for up to 100,000 documents per partition. If a partition contains more than 100,000 matching documents, results may be approximate because the engine applies performance limits during distributed execution. To raise this limit, adjust the maximum document count in the cluster configuration.
Setting max_group above 10,000 increases memory consumption on the QRS worker and may cause an OOM error.