CTE Reuse策略优化

CTE(Common Table Expression,公用表表达式)在SQL中用于简化复杂查询并提升可读性。当使用CTE时,优化器会自动将CTE物化处理,即将结果临时存储,以便在后续SQL查询中进行多次调用。CTE Reuse策略则是指优化器如何有效重用CTE,以避免重复计算,从而提高查询性能。

设置CTE Reuse策略

Hologres中,优化器可以通过SQL的执行计划自适应决定CTE复用,在Hologres V3.2之前版本中是通过GUC参数optimizer_cte_inlining来控制。自Hologres V3.2版本开始,通过GUC参数optimizer_cte_inlininghg_cte_strategy控制CTE是否复用,详情介绍如下。

  • Hologres V3.2及以上版本

    SET hg_cte_strategy ={AUTO|INLINING|REUSE};
    • 参数说明:

      • AUTO(默认值):优化器自适应决定CTE是否复用。

      • INLINING:强制Inline,即不复用CTE,每次CTE引用时重新计算子查询,适用于CTE计算简单,结果集较小的场景。

      • REUSE:强制Reuse,即仅计算一次,CTE结果缓存后复用,适用于CTE计算复杂,结果集较大且被多次引用的场景。

    • 使用说明

      • 当设置optimizer_cte_inlining = off时,CTE策略为Reuse,且不受hg_cte_strategy的参数值影响。

      • 当设置optimizer_cte_inlining = on(无论是默认值还是显式设置)时,不再强制Inline,CTE策略由hg_cte_strategy的取值决定。

    说明
    • hg_cte_strategyoptimizer_cte_inlining为两个独立参数,可以分别设置,与设置顺序无关。

    • optimizer_cte_inlining = offhg_cte_strategy = INLINING无法同时设置。

  • Hologres V3.2以下版本

    SET optimizer_cte_inlining ={on|off}

    参数说明:

    • 当设置optimizer_cte_inlining = on时(默认值),CTE强制Inline,默认值。

    • 当设置optimizer_cte_inlining = off 时,CTE策略为Reuse。

使用示例

通过如下示例展示hg_cte_strategy设置不同参数时,执行计划的区别。

  1. 准备示例数据。

    CREATE TABLE t1 (
        a INT,
        b INT,
        c INT
    );
  2. 使用hg_cte_strategy的不同策略查看执行计划。

    • hg_cte_strategy为默认值AUTO时,执行计划如下。从执行计划来看,只有cte1复用了,cte2没有复用。

      EXPLAIN WITH cte1 AS (SELECT DISTINCT a,b,c FROM t1),cte2 AS (SELECT a,b,c FROM t1)
      SELECT * FROM cte1 
      UNION ALL 
      SELECT * FROM cte1
      UNION ALL
      SELECT * FROM cte2
      UNION ALL
      SELECT * FROM cte2;

      返回结果如下。

      QUERY PLAN
      Gather  (cost=0.00..25.00 rows=4 width=12)
        CTE cte1  (cost=0.00..5.00 rows=1 width=12)
            ->  Forward  (cost=0.00..5.00 rows=1 width=12)
                  ->  HashAggregate  (cost=0.00..5.00 rows=1 width=12)
                        Group Key: t1_2.a, t1_2.b, t1_2.c
                        ->  Redistribution  (cost=0.00..5.00 rows=1 width=12)
                              Hash Key: t1_2.a, t1_2.b, t1_2.c
                              ->  Local Gather  (cost=0.00..5.00 rows=1 width=12)
                                    ->  Seq Scan on t1 t1_2  (cost=0.00..5.00 rows=1 width=12)
        ->  Append  (cost=0.00..20.00 rows=4 width=12)
              ->  CTE Scan on cte1  (cost=0.00..5.00 rows=1 width=12)
              ->  CTE Scan on cte1  (cost=0.00..5.00 rows=1 width=12)
              ->  Local Gather  (cost=0.00..5.00 rows=1 width=12)
                    ->  Seq Scan on t1  (cost=0.00..5.00 rows=1 width=12)
              ->  Local Gather  (cost=0.00..5.00 rows=1 width=12)
                    ->  Seq Scan on t1 t1_1  (cost=0.00..5.00 rows=1 width=12)
      Query Queue: init_warehouse.default_queue
      Optimizer: HQO version 3.2.0
    • hg_cte_strategyINLINING时,执行计划如下。可以看到执行计划中没有CTE复用。

      SET hg_cte_strategy = INLINING;
      EXPLAIN WITH cte1 AS (SELECT DISTINCT a,b,c FROM t1), 
      cte2 AS (SELECT a,b,c FROM t1) 
      SELECT * FROM cte1 
      UNION ALL 
      SELECT * FROM cte1 
      UNION ALL 
      SELECT * FROM cte2 
      UNION ALL 
      SELECT * FROM cte2;

      返回结果如下。

      QUERY PLAN
      Gather  (cost=0.00..20.00 rows=4 width=12)
        ->  Append  (cost=0.00..20.00 rows=4 width=12)
              ->  HashAggregate  (cost=0.00..5.00 rows=1 width=12)
                    Group Key: t1.a, t1.b, t1.c
                    ->  Redistribution  (cost=0.00..5.00 rows=1 width=12)
                          Hash Key: t1.a, t1.b, t1.c
                          ->  Local Gather  (cost=0.00..5.00 rows=1 width=12)
                                ->  Seq Scan on t1  (cost=0.00..5.00 rows=1 width=12)
              ->  HashAggregate  (cost=0.00..5.00 rows=1 width=12)
                    Group Key: t1_1.a, t1_1.b, t1_1.c
                    ->  Redistribution  (cost=0.00..5.00 rows=1 width=12)
                          Hash Key: t1_1.a, t1_1.b, t1_1.c
                          ->  Local Gather  (cost=0.00..5.00 rows=1 width=12)
                                ->  Seq Scan on t1 t1_1  (cost=0.00..5.00 rows=1 width=12)
              ->  Local Gather  (cost=0.00..5.00 rows=1 width=12)
                    ->  Seq Scan on t1 t1_2  (cost=0.00..5.00 rows=1 width=12)
              ->  Local Gather  (cost=0.00..5.00 rows=1 width=12)
                    ->  Seq Scan on t1 t1_3  (cost=0.00..5.00 rows=1 width=12)
      Query Queue: init_warehouse.default_queue
      Optimizer: HQO version 3.2.0
    • hg_cte_strategyREUSE时,执行计划如下,可以看到cte1cte2都有复用。

      SET hg_cte_strategy = REUSE;
      EXPLAIN WITH cte1 AS (SELECT DISTINCT a,b,c FROM t1), 
      cte2 AS (SELECT a,b,c FROM t1) 
      SELECT * FROM cte1 
      UNION ALL 
      SELECT * FROM cte1 
      UNION ALL
      SELECT * FROM cte2 
      UNION ALL
      SELECT * FROM cte2;

      返回结果如下。

      QUERY PLAN
      Gather  (cost=0.00..30.00 rows=4 width=12)
        CTE cte1  (cost=0.00..5.00 rows=1 width=12)
            ->  Forward  (cost=0.00..5.00 rows=1 width=12)
                  ->  HashAggregate  (cost=0.00..5.00 rows=1 width=12)
                        Group Key: t1_1.a, t1_1.b, t1_1.c
                        ->  Redistribution  (cost=0.00..5.00 rows=1 width=12)
                              Hash Key: t1_1.a, t1_1.b, t1_1.c
                              ->  Local Gather  (cost=0.00..5.00 rows=1 width=12)
                                    ->  Seq Scan on t1 t1_1  (cost=0.00..5.00 rows=1 width=12)
        CTE cte2  (cost=0.00..5.00 rows=1 width=12)
            ->  Forward  (cost=0.00..5.00 rows=1 width=12)
                  ->  Local Gather  (cost=0.00..5.00 rows=1 width=12)
                        ->  Seq Scan on t1  (cost=0.00..5.00 rows=1 width=12)
        ->  Append  (cost=0.00..20.00 rows=4 width=12)
              ->  CTE Scan on cte1  (cost=0.00..5.00 rows=1 width=12)
              ->  CTE Scan on cte1  (cost=0.00..5.00 rows=1 width=12)
              ->  CTE Scan on cte2  (cost=0.00..5.00 rows=1 width=12)
              ->  CTE Scan on cte2  (cost=0.00..5.00 rows=1 width=12)
      Query Queue: init_warehouse.default_queue
      Optimizer: HQO version 3.2.0