CTE(Common Table Expression,公用表表达式)在SQL中用于简化复杂查询并提升可读性。当使用CTE时,优化器会自动将CTE物化处理,即将结果临时存储,以便在后续SQL查询中进行多次调用。CTE Reuse策略则是指优化器如何有效重用CTE,以避免重复计算,从而提高查询性能。
设置CTE Reuse策略
在Hologres中,优化器可以通过SQL的执行计划自适应决定CTE复用,在Hologres V3.2之前版本中是通过GUC参数optimizer_cte_inlining
来控制。自Hologres V3.2版本开始,通过GUC参数optimizer_cte_inlining
和hg_cte_strategy
控制CTE是否复用,详情介绍如下。
Hologres V3.2及以上版本
SET hg_cte_strategy ={AUTO|INLINING|REUSE};
参数说明:
AUTO(默认值):优化器自适应决定CTE是否复用。
INLINING:强制Inline,即不复用CTE,每次CTE引用时重新计算子查询,适用于CTE计算简单,结果集较小的场景。
REUSE:强制Reuse,即仅计算一次,CTE结果缓存后复用,适用于CTE计算复杂,结果集较大且被多次引用的场景。
使用说明
当设置
optimizer_cte_inlining = off
时,CTE策略为Reuse,且不受hg_cte_strategy
的参数值影响。当设置
optimizer_cte_inlining = on
(无论是默认值还是显式设置)时,不再强制Inline,CTE策略由hg_cte_strategy
的取值决定。
说明hg_cte_strategy
和optimizer_cte_inlining
为两个独立参数,可以分别设置,与设置顺序无关。optimizer_cte_inlining = off
和hg_cte_strategy = INLINING
无法同时设置。
Hologres V3.2以下版本
SET optimizer_cte_inlining ={on|off}
参数说明:
当设置
optimizer_cte_inlining = on
时(默认值),CTE强制Inline,默认值。当设置
optimizer_cte_inlining = off
时,CTE策略为Reuse。
使用示例
通过如下示例展示hg_cte_strategy
设置不同参数时,执行计划的区别。
准备示例数据。
CREATE TABLE t1 ( a INT, b INT, c INT );
使用
hg_cte_strategy
的不同策略查看执行计划。当
hg_cte_strategy
为默认值AUTO时,执行计划如下。从执行计划来看,只有cte1复用了,cte2没有复用。EXPLAIN WITH cte1 AS (SELECT DISTINCT a,b,c FROM t1),cte2 AS (SELECT a,b,c FROM t1) SELECT * FROM cte1 UNION ALL SELECT * FROM cte1 UNION ALL SELECT * FROM cte2 UNION ALL SELECT * FROM cte2;
返回结果如下。
QUERY PLAN Gather (cost=0.00..25.00 rows=4 width=12) CTE cte1 (cost=0.00..5.00 rows=1 width=12) -> Forward (cost=0.00..5.00 rows=1 width=12) -> HashAggregate (cost=0.00..5.00 rows=1 width=12) Group Key: t1_2.a, t1_2.b, t1_2.c -> Redistribution (cost=0.00..5.00 rows=1 width=12) Hash Key: t1_2.a, t1_2.b, t1_2.c -> Local Gather (cost=0.00..5.00 rows=1 width=12) -> Seq Scan on t1 t1_2 (cost=0.00..5.00 rows=1 width=12) -> Append (cost=0.00..20.00 rows=4 width=12) -> CTE Scan on cte1 (cost=0.00..5.00 rows=1 width=12) -> CTE Scan on cte1 (cost=0.00..5.00 rows=1 width=12) -> Local Gather (cost=0.00..5.00 rows=1 width=12) -> Seq Scan on t1 (cost=0.00..5.00 rows=1 width=12) -> Local Gather (cost=0.00..5.00 rows=1 width=12) -> Seq Scan on t1 t1_1 (cost=0.00..5.00 rows=1 width=12) Query Queue: init_warehouse.default_queue Optimizer: HQO version 3.2.0
当
hg_cte_strategy
为INLINING时,执行计划如下。可以看到执行计划中没有CTE复用。SET hg_cte_strategy = INLINING; EXPLAIN WITH cte1 AS (SELECT DISTINCT a,b,c FROM t1), cte2 AS (SELECT a,b,c FROM t1) SELECT * FROM cte1 UNION ALL SELECT * FROM cte1 UNION ALL SELECT * FROM cte2 UNION ALL SELECT * FROM cte2;
返回结果如下。
QUERY PLAN Gather (cost=0.00..20.00 rows=4 width=12) -> Append (cost=0.00..20.00 rows=4 width=12) -> HashAggregate (cost=0.00..5.00 rows=1 width=12) Group Key: t1.a, t1.b, t1.c -> Redistribution (cost=0.00..5.00 rows=1 width=12) Hash Key: t1.a, t1.b, t1.c -> Local Gather (cost=0.00..5.00 rows=1 width=12) -> Seq Scan on t1 (cost=0.00..5.00 rows=1 width=12) -> HashAggregate (cost=0.00..5.00 rows=1 width=12) Group Key: t1_1.a, t1_1.b, t1_1.c -> Redistribution (cost=0.00..5.00 rows=1 width=12) Hash Key: t1_1.a, t1_1.b, t1_1.c -> Local Gather (cost=0.00..5.00 rows=1 width=12) -> Seq Scan on t1 t1_1 (cost=0.00..5.00 rows=1 width=12) -> Local Gather (cost=0.00..5.00 rows=1 width=12) -> Seq Scan on t1 t1_2 (cost=0.00..5.00 rows=1 width=12) -> Local Gather (cost=0.00..5.00 rows=1 width=12) -> Seq Scan on t1 t1_3 (cost=0.00..5.00 rows=1 width=12) Query Queue: init_warehouse.default_queue Optimizer: HQO version 3.2.0
当
hg_cte_strategy
为REUSE时,执行计划如下,可以看到cte1和cte2都有复用。SET hg_cte_strategy = REUSE; EXPLAIN WITH cte1 AS (SELECT DISTINCT a,b,c FROM t1), cte2 AS (SELECT a,b,c FROM t1) SELECT * FROM cte1 UNION ALL SELECT * FROM cte1 UNION ALL SELECT * FROM cte2 UNION ALL SELECT * FROM cte2;
返回结果如下。
QUERY PLAN Gather (cost=0.00..30.00 rows=4 width=12) CTE cte1 (cost=0.00..5.00 rows=1 width=12) -> Forward (cost=0.00..5.00 rows=1 width=12) -> HashAggregate (cost=0.00..5.00 rows=1 width=12) Group Key: t1_1.a, t1_1.b, t1_1.c -> Redistribution (cost=0.00..5.00 rows=1 width=12) Hash Key: t1_1.a, t1_1.b, t1_1.c -> Local Gather (cost=0.00..5.00 rows=1 width=12) -> Seq Scan on t1 t1_1 (cost=0.00..5.00 rows=1 width=12) CTE cte2 (cost=0.00..5.00 rows=1 width=12) -> Forward (cost=0.00..5.00 rows=1 width=12) -> Local Gather (cost=0.00..5.00 rows=1 width=12) -> Seq Scan on t1 (cost=0.00..5.00 rows=1 width=12) -> Append (cost=0.00..20.00 rows=4 width=12) -> CTE Scan on cte1 (cost=0.00..5.00 rows=1 width=12) -> CTE Scan on cte1 (cost=0.00..5.00 rows=1 width=12) -> CTE Scan on cte2 (cost=0.00..5.00 rows=1 width=12) -> CTE Scan on cte2 (cost=0.00..5.00 rows=1 width=12) Query Queue: init_warehouse.default_queue Optimizer: HQO version 3.2.0