GROUP BY

更新时间:2025-03-04 01:37:37

GROUP BY用于对查询结果进行分组聚合。您也可以在GROUP BY子句中使用GROUPING SETSWITH CUBEROLLUP,以不同的形式展示分组结果。

GROUP BY expression [, ...]    

注意事项

  • 查询中须使用标准聚合函数SUMAVGCOUNT等)声明非分组列,否则系统将默认使用ARBITRARY函数处理非分组列,随机返回非分组列中的任意一个值。

  • 不支持启用ONLY_FULL_GROUP_BY模式。

GROUPING SETS

当您需要对同一数据集进行多个维度的聚合统计,并将这些结果合并到一个结果集中,可以使用GROUPING SETSGROUPING SETS用于在同一结果集中指定多个GROUP BY选项,作用相当于多个GROUP BY查询的UNION组合形式。

SELECT origin_state, origin_zip, destination_state, sum(package_weight)
FROM shipping
GROUP BY GROUPING SETS (
    (origin_state),
    (origin_state, origin_zip),
    (destination_state));         

上述示例等同于:

SELECT origin_state, NULL, NULL, sum(package_weight)
FROM shipping GROUP BY origin_state
UNION ALL
SELECT origin_state, origin_zip, NULL, sum(package_weight)
FROM shipping GROUP BY origin_state, origin_zip
UNION ALL
SELECT NULL, NULL, destination_state, sum(package_weight)
FROM shipping GROUP BY destination_state;       

CUBE

CUBE作用类似于GROUPING SETS,但它会自动生成所有可能的分组组合,不需要手动列出分组。

SELECT origin_state, destination_state, sum(package_weight)
FROM shipping
GROUP BY origin_state, destination_state WITH CUBE;    

上述示例等同于:

SELECT origin_state, destination_state, sum(package_weight)
FROM shipping
GROUP BY GROUPING SETS (
    (origin_state, destination_state),
    (origin_state),
    (destination_state),
    ());        

ROLLUP

ROLLUP 会对每个分组列生成层次化的分组组合。它会按照指定的列顺序,逐步减少分组的维度,直到最终生成全局汇总。

SELECT origin_state, origin_zip, destination_state, SUM(package_weight)
FROM shipping
GROUP BY ROLLUP (origin_state, origin_zip, destination_state);          

上述示例等同于:

SELECT origin_state, origin_zip, destination_state, SUM(package_weight)
FROM shipping
GROUP BY GROUPING SETS (
    (origin_state, origin_zip, destination_state),
    (origin_state, origin_zip), 
    (origin_state), 
    ()
    );          

注意事项

AnalyticDB for MySQL不支持GROUP BY ROLLUP (...)后加列名,即以下SQL语句会报错。

SELECT origin_state, origin_zip, destination_state, SUM(package_weight)
FROM shipping
GROUP BY ROLLUP (origin_state, origin_zip), destination_state;

如果您想以层级方式实现GROUP BY查询的UNION组合,可通过GROUP BY GROUPING SETS ((origin_state, origin_zip), (origin_state), ()), destination_state实现。

示例

sales表为例,演示GROUP BYGROUPING SETSWITH CUBEROLLUP的效果。

sales表的建表语句与写入语句如下。

CREATE TABLE sales (
    year INT,
    region VARCHAR(50),
    amount DECIMAL(10, 2)
);

INSERT INTO sales (year, region, amount) VALUES
(2020, 'North', 1000.00),
(2020, 'South', 1500.00),
(2020, 'East', 1200.00),
(2020, 'West', 1300.00),
(2021, 'North', 2000.00),
(2021, 'South', 2500.00),
(2021, 'East', 2200.00),
(2021, 'West', 2300.00),
(2022, 'North', 3000.00),
(2022, 'South', 3500.00),
(2022, 'East', 3200.00),
(2022, 'West', 3300.00);

示例1:GROUP BY查询

以下查询声明yearregion为分组列,计算amount列的分组总和。

SELECT year, region, SUM(amount) as sum_amount
FROM sales
GROUP BY year, region
ORDER BY 3, 2, 1 --按第3列sum_amount,第2列region,第1列year排序。
LIMIT 5;

返回结果如下:

year | region | sum_amount 
-----+--------+------------
2020 |  North |   1000.00  
2020 |  East  |   1200.00  
2020 |  West  |   1300.00  
2020 |  South |   1500.00  
2021 |  North |   2000.00  
(5 rows)         

GROUP BY子句中的表达式也可以使用序号来引用所需的列。上述示例可改写为以下形式。

SELECT year, region, SUM(amount) as sum_amount
FROM sales
GROUP BY 1,2  --按year,region分组
ORDER BY 3, 2, 1
LIMIT 5;       

返回结果如下:

year | region | sum_amount 
-----+--------+------------
2020 |  North |   1000.00  
2020 |  East  |   1200.00  
2020 |  West  |   1300.00  
2020 |  South |   1500.00  
2021 |  North |   2000.00  
(5 rows)         

示例2:使用GROUPING SETS,将按yearregion分组汇总的结果,按year分组汇总的结果,按region分组汇总的结果,以及全局汇总的结果合并到一个结果集。

SELECT year, region, SUM(amount) as sum_amount
FROM sales
GROUP BY GROUPING SETS (
    (year,region), --按year和region分组汇总。
    (region), --按year分组汇总。
    (year), --按region分组汇总
    () --全局汇总
    );

返回结果如下:

year | region | sum_amount 
-----+--------+------------
2020 |  North |   1000.00  
2020 |  East  |   1200.00  
2020 |  West  |   1300.00  
2020 |  South |   1500.00  
2021 |  North |   2000.00
2021 |  East  |   2200.00  
2021 |  West  |   2300.00  
2021 |  South |   2500.00  
2022 |  North |   3000.00  
2022 |  East  |   3200.00  
2022 |  West  |   3300.00  
2022 |  South |   3500.00
NULL |  North |   6000.00  
NULL |  East  |   6600.00  
NULL |  West  |   6900.00  
NULL |  South |   7500.00
2020 |  NULL  |   5000.00  
2021 |  NULL  |   9000.00
2022 |  NULL  |  13000.00
NULL |  NULL  |  27000.00
(20 rows)         

示例3:使用GROUP BY ... WITH CUBE,自动生成所有可能的分组组合,实现和示例2 GROUPING SETS相同的效果。

SELECT year, region, SUM(amount) as sum_amount
FROM sales
GROUP BY year,region WITH CUBE;

返回效果如下:

year | region | sum_amount 
-----+--------+------------
2020 |  North |   1000.00  
2020 |  East  |   1200.00  
2020 |  West  |   1300.00  
2020 |  South |   1500.00  
2021 |  North |   2000.00
2021 |  East  |   2200.00  
2021 |  West  |   2300.00  
2021 |  South |   2500.00  
2022 |  North |   3000.00  
2022 |  East  |   3200.00  
2022 |  West  |   3300.00  
2022 |  South |   3500.00
NULL |  North |   6000.00  
NULL |  East  |   6600.00  
NULL |  West  |   6900.00  
NULL |  South |   7500.00
2020 |  NULL  |   5000.00  
2021 |  NULL  |   9000.00
2022 |  NULL  |  13000.00
NULL |  NULL  |  27000.00
(20 rows)         

示例3:使用GROUP BY ROLLUP(...),按指定列的顺序,先生成最细粒度的汇总结果(按yearregion聚合),再逐步减少分组维度(按year聚合),最终生成全局汇总结果。

SELECT year, region, SUM(amount) as sum_amount
FROM sales
GROUP BY ROLLUP(year,region);

返回结果如下:

year | region | sum_amount 
-----+--------+------------
2020 |  North |   1000.00  
2020 |  East  |   1200.00  
2020 |  West  |   1300.00  
2020 |  South |   1500.00  
2021 |  North |   2000.00
2021 |  East  |   2200.00  
2021 |  West  |   2300.00  
2021 |  South |   2500.00  
2022 |  North |   3000.00  
2022 |  East  |   3200.00  
2022 |  West  |   3300.00  
2022 |  South |   3500.00
2020 |  NULL  |   5000.00  
2021 |  NULL  |   9000.00
2022 |  NULL  |  13000.00
NULL |  NULL  |  27000.00
(16 rows)         
  • 本页导读 (1)
  • 注意事项
  • GROUPING SETS
  • CUBE
  • ROLLUP
  • 示例