在关系型数据库中,通常需要使用多个SELECT + UNION语句来实现按照多组维度的结果分组,PolarDB-X新增支持通过Grouping Sets、Rollup和Cube扩展来实现这一目的。此外,PolarDB-X还支持在SELECT命令或HAVING子句中使用GROUPING函数和GROUPING_ID函数,来帮助解释使用上述扩展的结果。本文将介绍相关语法和示例。

注意事项

  • 本文介绍的所有GROUP BY相关的扩展语法,均不支持查询下推至LogicalView算子中执行。关于查询下推,请参见查询改写与下推
  • 本文示例中所用测试数据信息如下:
    使用如下语句创建一张requests表:
    CREATE TABLE requests (
      `id` int(10) UNSIGNED NOT NULL,
      `os` varchar(20) DEFAULT NULL,
      `device` varchar(20) DEFAULT NULL,
      `city` varchar(20) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE = InnoDB DEFAULT CHARSET = utf8 dbpartition BY hash(`id`) tbpartition BY hash(`id`);
    requests表中使用如下语句插入测试所需的数据:
    INSERT INTO requests (id, os, device, city) VALUES
    (1, 'windows', 'PC', 'Beijing'),
    (2, 'windows', 'PC', 'Shijiazhuang'),
    (3, 'linux', 'Phone', 'Beijing'),
    (4, 'windows', 'PC', 'Beijing'),
    (5, 'ios', 'Phone', 'Shijiazhuang'),
    (6, 'linux', 'PC', 'Beijing'),
    (7, 'windows', 'Phone', 'Shijiazhuang');

GROUPING SETS扩展

  • 功能介绍

    GROUPING SETS是GROUP BY子句的扩展,可以生成一个结果集,该结果集实际上是基于不同分组的多个结果集的串联(与UNION ALL运算结果类似),但UNION ALL运算和GROUPING SETS扩展并不会消除或合并结果集中的重复行。

  • 语法
    GROUPING SETS (
      { expr_1 | ( expr_1a [, expr_1b ] ...) |
        ROLLUP ( expr_list ) | CUBE ( expr_list )
      } [, ...] )
    说明 GROUPING SETS扩展可包含一个或多个由半角逗号(,)分隔表达式(如expr_1(expr_1a [, expr_1b ] ...))的任意组合,以及带半角圆括号(())的表达式列表(如( expr_list )),其中:
    • 每个表达式都可用于确定结果集的分组方式。
    • GROUPING SETS内也支持嵌套使用ROLLUP或者CUBE。
  • 示例
    • 通过GROUPING SETS扩展对数据进行分组查询,语法如下:
      select os,device, city ,count(*)
      from requests
      group by grouping sets((os, device), (city), ());

      上述语句等效于如下语句:

      select os, device, NULL, count(*)
      from requests group by os, device
      union all
      select NULL, NULL, NULL, count(*)
      from requests
      union all
      select null, null, city, count(*)
      from requests group by city;
      返回结果如下:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | windows | PC     | NULL         |        3 |
      | linux   | PC     | NULL         |        1 |
      | linux   | Phone  | NULL         |        1 |
      | windows | Phone  | NULL         |        1 |
      | ios     | Phone  | NULL         |        1 |
      | NULL    | NULL   | Shijiazhuang |        3 |
      | NULL    | NULL   | Beijing      |        4 |
      | NULL    | NULL   | NULL         |        7 |
      +---------+--------+--------------+----------+
      说明 未在分组集中使用的表达式,会用NULL充当占位符,便于对这些未在分组集使用的结果集进行操作,例如结果city列中显示为NULL的行。
    • 通过在GROUPING SETS中嵌套ROLLUP来对数据进行分组,语法如下:
      select os,device, city ,count(*) from requests 
      group by grouping sets((city), ROLLUP(os, device));
      
      上述语句等效于如下语句:
      select os,device, city ,count(*) from requests 
      group by grouping sets((city), (os), (os, device), ());
      返回结果如下:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | NULL    | NULL   | Shijiazhuang |        3 |
      | NULL    | NULL   | Beijing      |        4 |
      | windows | PC     | NULL         |        3 |
      | linux   | PC     | NULL         |        1 |
      | ios     | Phone  | NULL         |        1 |
      | linux   | Phone  | NULL         |        1 |
      | windows | Phone  | NULL         |        1 |
      | windows | NULL   | NULL         |        4 |
      | linux   | NULL   | NULL         |        2 |
      | ios     | NULL   | NULL         |        1 |
      | NULL    | NULL   | NULL         |        7 |
      +---------+--------+--------------+----------+
    • 通过在GROUPING SETS中嵌套CUBE扩展来对数据进行分组,语法如下:
      select os,device, city ,count(*) from requests 
      group by grouping sets((city), CUBE(os, device));
      
      上述语句等效于如下语句:
      select os,device, city ,count(*) from requests 
      group by grouping sets((city), (os), (os, device), (), (device));
      返回结果如下:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | NULL    | NULL   | Beijing      |        4 |
      | NULL    | NULL   | Shijiazhuang |        3 |
      | windows | PC     | NULL         |        3 |
      | ios     | Phone  | NULL         |        1 |
      | linux   | Phone  | NULL         |        1 |
      | windows | Phone  | NULL         |        1 |
      | linux   | PC     | NULL         |        1 |
      | windows | NULL   | NULL         |        4 |
      | ios     | NULL   | NULL         |        1 |
      | linux   | NULL   | NULL         |        2 |
      | NULL    | PC     | NULL         |        4 |
      | NULL    | Phone  | NULL         |        3 |
      | NULL    | NULL   | NULL         |        7 |
      +---------+--------+--------------+----------+
    • 通过GROUP BY、CUBE和GROUPING SETS组合产生GROUPING SETS,示例如下:
      select os,device, city, count(*)
      from requests 
      group by os, cube(os,device), grouping sets(city);
      
      上述语句等效于如下语句:
      select os,device, city, count(*)
      from requests 
      group by grouping sets((os,device,city),(os,city),(os,device,city));
      返回结果如下:
      +---------+--------+--------------+----------+
      | os      | device | city         | count(*) |
      +---------+--------+--------------+----------+
      | linux   | Phone  | Beijing      |        1 |
      | windows | Phone  | Shijiazhuang |        1 |
      | windows | PC     | Shijiazhuang |        1 |
      | linux   | PC     | Beijing      |        1 |
      | windows | PC     | Beijing      |        2 |
      | ios     | Phone  | Shijiazhuang |        1 |
      | linux   | NULL   | Beijing      |        2 |
      | windows | NULL   | Shijiazhuang |        2 |
      | windows | NULL   | Beijing      |        2 |
      | ios     | NULL   | Shijiazhuang |        1 |
      +---------+--------+--------------+----------+

ROLLUP扩展

  • 功能介绍

    ROLLUP扩展生成一系列有总计的分层组,每个分层组都有小计。该层次结构的顺序由ROLLUP表达式列表中给定的表达式的顺序确定。该层次结构的顶部是列表中最左侧的项。每个连续项