GROUPING SETS is an extension of the GROUP BY clause that lets you aggregate query results across multiple grouping dimensions in a single query. Without GROUPING SETS, you would need to execute multiple SELECT statements to achieve the same result.
OpenSearch High-performance Search Edition generates a simpler execution plan for GROUPING SETS queries, which improves query execution performance.
Syntax
SELECT column1, column2, aggregate_function(column3)
FROM table_name
GROUP BY GROUPING SETS (
(grouping_set_1),
(grouping_set_2),
...
)Parameters
| Parameter | Description |
|---|---|
grouping_set | A comma-separated list of columns enclosed in parentheses. An empty set () computes the grand total across all rows. When a grouping set has only one column, the parentheses are optional — GROUPING SETS ((brand)) is equivalent to GROUPING SETS (brand). |
A GROUPING SETS query with the sets (a), (b), and () is semantically equivalent to:
SELECT a, NULL AS b, SUM(c) FROM t GROUP BY a
UNION ALL
SELECT NULL AS a, b, SUM(c) FROM t GROUP BY b
UNION ALL
SELECT NULL AS a, NULL AS b, SUM(c) FROM tExample
Basic aggregation
The following query calculates phone prices aggregated by brand, by size, and as a grand total — all in one statement.
SELECT brand, size, SUM(price) AS sp
FROM phone
GROUP BY GROUPING SETS ((brand), (size), ())The query uses three grouping sets:
(brand)— aggregates prices by brand; thesizecolumn is set to the default value(size)— aggregates prices by size; thebrandcolumn is set to the default value()— computes the grand total; bothbrandandsizeare set to the default value
Result:
BRAND SIZE sp
"" 0.0 32936 <- grand total
"" 1.4 169 <- aggregated by size
"" 4.7 7897
"" 5.0 899
"" 5.5 14684
"" 5.6 5688
"" 5.9 3599
"Huawei" 0 7987 <- aggregated by brand
"Meizu" 0 1299
"Nokia" 0 169
"Xiaomi" 0 899
"OPPO" 0 2999
"Samsung" 0 5688
"Apple" 0 13895Columns not part of a given grouping set are replaced with the default value.
Distinguish aggregation-generated default values with GROUPING()
A problem arises when the original data contains empty strings or null values. For example, if a phone's brand is stored as an empty string "" in the source data, the result above cannot distinguish that legitimate empty string from the default values the engine generates for the brand column when aggregating by size.
Use the GROUPING function to tell them apart:
SELECT brand, size, SUM(price) AS sp, GROUPING(brand, size), GROUPING(brand) AS g1
FROM phone
GROUP BY GROUPING SETS ((brand), (size), ()) LIMIT 20Result:
brand size sp GROUPING(brand,size) g1
"" 0.0 32936 3 1
"" 1.4 169 2 1
"" 4.7 7897 2 1
"" 5.0 899 2 1
"" 5.5 14684 2 1
"" 5.6 5688 2 1
"" 5.9 3599 2 1
"Huawei" 0 7987 1 0
"Meizu" 0 1299 1 0
"Nokia" 0 169 1 0
"Xiaomi" 0 899 1 0
"OPPO" 0 2999 1 0
"Samsung" 0 5688 1 0
"Apple" 0 13895 1 0The GROUPING function
GROUPING(col1, ..., colN) returns an integer. Each bit in the binary representation corresponds to one input column, with the rightmost argument mapped to the least significant bit.
| Bit value | Meaning |
|---|---|
0 | The column is part of the current grouping set (value comes from the data) |
1 | The column is NOT part of the current grouping set (value is aggregation-generated default value) |
The integer result is the decimal equivalent of the binary bit string.
Reading the example results
GROUPING(brand, size) encodes two columns:
Grand total row (row 1, size = 0.0): both columns are excluded → binary
11= decimal3, soGROUPING(brand, size)=3Rows aggregated by size (rows 2–7, size = 1.4–5.9):
brandis excluded from the grouping set → bit =1(high bit)sizeis included in the grouping set → bit =0(low bit)Binary
10= decimal2, soGROUPING(brand, size)=2
Rows aggregated by brand (rows 8–14):
brandis included → high bit =0sizeis excluded → low bit =1Binary
01= decimal1, soGROUPING(brand, size)=1
GROUPING(brand) encodes a single column:
Rows 1–7 (grand total and size aggregations):
brandis excluded →GROUPING(brand)=1Rows 8–14 (aggregated by brand):
brandis included →GROUPING(brand)=0
Usage notes
The
GROUPINGfunction accepts up to 32 input parameters.A
GROUPINGresult of0for a column means the value in that column comes from the actual data — it may still be a null value or an empty string if the source data contains those values.A
GROUPINGresult of1(or any non-zero value for multi-column calls) confirms the default value was generated by the aggregation engine, not present in the source data.