Window Function的并行加速

Window Function是社区版MySQL 8.0为提升查询分析能力而引入的功能,在PolarDB MySQL版8.0版本中对Window Function进行了全面支持,并可以做并行执行。

前提条件

  • 集群版本需为PolarDB MySQL版8.0集群版,且Revision version为8.0.2.2.0或以上,您可以参见查询版本号确认集群版本。

  • 在Window Function中只有使用了Partition by子句,才可以并行执行查询。

使用方法

  • 语法:

    目前在PolarDB中Window Function只能通过EXPLAIN FORMAT=TREE语句来显示。

  • 示例:

    如下示例中创建了1张表employee_salaries,且在表里插入了一些数据:

    CREATE TABLE `employee_salaries` (
      `dept` varchar(20) DEFAULT NULL,
      `name` varchar(20) DEFAULT NULL,
      `salary` int DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
    INSERT INTO `employee_salaries` VALUES
    ('Engineering','Dharma',3500),
    ('Engineering','Bình',3000),
    ('Engineering','Adalynn',2800),
    ('Engineering','Samuel',2500),
    ('Engineering','Cveta',2200),
    ('Engineering','eve',2000),
    ('Engineering','Dharma',3500),
    ('Sales','Carbry',500),
    ('Sales','Clytemnestra',400),
    ('Sales','Juraj',300),
    ('Sales','Kalpana',300),
    ('Sales','Svantepolk',250),
    ('Sales','Angelo',200);

    如下是employee_salaries表的并行查询计划:

    explain format=tree select ROW_NUMBER() OVER(partition by dept order by salary desc) AS 'row_number' from employee_salaries\G
    *************************** 1. row ***************************
    EXPLAIN:
    -> Gather (slice: 1; workers: 4)  (cost=26.42 rows=12)
        -> Window aggregate  (cost=15.67 rows=3)
            -> Repartition (hash keys: employee_salaries.dept; slice: 2; workers: 4)  (cost=15.33 rows=3)
                -> Sort: employee_salaries.dept, employee_salaries.salary DESC  (cost=1.55 rows=13)
                    -> Parallel table scan on employee_salaries, with parallel partitions: 4

    如上所示,在employee_salaries表完成并行扫描后,会将数据按照Partition by子句中指定的key(employee_salaries.dept)分发到下一阶段的Worker中,从而保证Window Function可以完成并行计算且保证结果的正确性,最终在Leader上完成汇总即可。