通过Hint干预执行计划
云原生数据仓库AnalyticDB PostgreSQL版支持通过Hint干预执行计划,如指定SQL语句使用或不使用ORCA优化器,干预JOIN顺序和JOIN类型。
前提条件
指定SQL语句使用或不使用ORCA优化器
开启Hint干预计划并启用Hint注册规则,语句如下:
SET pg_hint_plan.enable_hint to on; SET pg_hint_plan.enable_hint_table to on;
创建用于测试的表t1和表t2,建表语句如下:
CREATE TABLE t1 (id int PRIMARY KEY, val int); CREATE TABLE t2 (id int PRIMARY KEY, val int);
分别往两张表中插入测试数据,语句如下:
INSERT INTO t1 SELECT i, i % 100 FROM (SELECT generate_series(1, 10000) i) t; INSERT INTO t2 SELECT i, i % 10 FROM (SELECT generate_series(1, 1000) i) t;
创建索引,语句如下:
CREATE INDEX t1_val ON t1 (val); CREATE INDEX t2_val ON t2 (val);
强制关闭ORCA优化器,语句如下:
SELECT * FROM hint_plan.insert_hint_table($$/*+ SET(optimizer off) */EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.val = t2.val WHERE t1.id = 1;$$);
返回信息如下:
-[ RECORD 1 ]---------+----------------------------------------------------- id | 1 norm_query_string | explain select * from t1 join t2 on t1.val = t2.val; application_name | hints | set(optimizer off) query_hash | -2169095602568752481 enable | t prepare_param_strings | {}
输出执行计划,语句如下:
EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.val = t2.val WHERE t1.id = 3;
强制关闭后,执行计划使用了Postgres query optimizer优化器,返回信息如下:
QUERY PLAN ------------------------------------------------------------------------------------------------- Gather Motion 3:1 (slice2; segments: 3) (cost=8.25..27.61 rows=11 width=16) -> Hash Join (cost=8.25..27.61 rows=4 width=16) Hash Cond: (t2.val = t1.val) -> Seq Scan on t2 (cost=0.00..13.00 rows=334 width=8) -> Hash (cost=8.22..8.22 rows=1 width=8) -> Broadcast Motion 1:3 (slice1; segments: 1) (cost=0.16..8.22 rows=3 width=8) -> Index Scan using t1_pkey on t1 (cost=0.16..8.18 rows=1 width=8) Index Cond: (id = 3) Optimizer: Postgres query optimizer (9 rows)
强制启用ORCA优化器,语句如下:
SELECT * FROM hint_plan.insert_hint_table($$/*+ SET(optimizer on) SET(rds_optimizer_options 0) */SELECT * FROM t1 WHERE t1.id = 1;$$);
返回信息如下:
-[ RECORD 1 ]---------+------------------------------------------------ id | 2 norm_query_string | select * from t1 where t1.id = $1; application_name | hints | set(optimizer on) set(rds_optimizer_options 0) query_hash | -8281826471521807124 enable | t prepare_param_strings | {}
输出执行计划,语句如下:
EXPLAIN SELECT * FROM t1 WHERE t1.id = 2;
强制启用后,执行计划使用了ORCA优化器,返回信息如下:
QUERY PLAN ---------------------------------------------------------------------------- Gather Motion 1:1 (slice1; segments: 1) (cost=0.00..6.00 rows=1 width=8) -> Index Scan using t1_pkey on t1 (cost=0.00..6.00 rows=1 width=8) Index Cond: (id = 1) Optimizer: Pivotal Optimizer (GPORCA) version 3.86.0 (4 rows)
干预JOIN顺序和JOIN类型
开启Hint干预计划并启用Hint注册功能,语句如下:
SET optimizer to off; SET pg_hint_plan.enable_hint to on; SET pg_hint_plan.enable_hint_table to on;
创建用于测试的表t1和表t2,建表语句如下:
CREATE TABLE t1 (id int PRIMARY KEY, val int); CREATE TABLE t2 (id int PRIMARY KEY, val int);
分别往两张表中插入测试数据,语句如下:
INSERT INTO t1 SELECT i, i % 100 FROM (SELECT generate_series(1, 10000) i) t; INSERT INTO t2 SELECT i, i % 10 FROM (SELECT generate_series(1, 1000) i) t;
创建索引,语句如下:
CREATE INDEX t1_val ON t1 (val); CREATE INDEX t2_val ON t2 (val);
输出执行计划,语句如下:
EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;
此处为默认执行计划,JOIN类型和JOIN顺序为
HashJoin(t1 t2) Leading((t1 t2))
,返回信息如下:QUERY PLAN ---------------------------------------------------------------------------------- Gather Motion 3:1 (slice1; segments: 3) (cost=25.50..208.00 rows=800 width=16) -> Hash Join (cost=25.50..208.00 rows=267 width=16) Hash Cond: (t1.id = t2.id) -> Seq Scan on t1 (cost=0.00..137.00 rows=3334 width=8) Filter: (val < 100) -> Hash (cost=15.50..15.50 rows=267 width=8) -> Seq Scan on t2 (cost=0.00..15.50 rows=267 width=8) Filter: (val > 1) Optimizer: Postgres query optimizer
以下几种方法可以干预JOIN顺序和JOIN类型:
在输出执行计划的语句前添加Hint,语句如下:
/*+ MergeJoin(t1 t2) Leading((t2 t1)) */ EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;
JOIN类型由HashJoin转变为MergeJoin,JOIN顺序由(t1,t2)转变为(t2,t1),返回信息如下:
QUERY PLAN ----------------------------------------------------------------------------------- Gather Motion 3:1 (slice1; segments: 3) (cost=0.31..93.54 rows=800 width=16) -> Merge Join (cost=0.31..93.54 rows=267 width=16) Merge Cond: (t2.id = t1.id) -> Index Scan using t2_pkey on t2 (cost=0.15..52.90 rows=267 width=8) Filter: (val > 1) -> Index Scan using t1_pkey on t1 (cost=0.16..279.91 rows=3334 width=8) Filter: (val < 100) Optimizer: Postgres query optimizer (8 rows)
先注册Hint再输出执行计划:
注册SQL语句模板,语句如下:
SELECT * FROM hint_plan.insert_hint_table($$/*+ MergeJoin(t1 t2) Leading((t2 t1)) */SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;$$);
返回信息如下:
-[ RECORD 1 ]---------+-------------------------------------------------------------------------- id | 1 norm_query_string | SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < $1 and t2.val > $2; application_name | hints | MergeJoin(t1 t2) Leading((t2 t1)) query_hash | -4733464863014584191 enable | t prepare_param_strings | {}
输出执行计划,语句如下:
EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;
Hint已生效,JOIN类型由HashJoin转变为MergeJoin,JOIN顺序由(t1,t2)转变为(t2,t1),返回信息如下:
QUERY PLAN ----------------------------------------------------------------------------------- Gather Motion 3:1 (slice1; segments: 3) (cost=0.31..93.54 rows=800 width=16) -> Merge Join (cost=0.31..93.54 rows=267 width=16) Merge Cond: (t2.id = t1.id) -> Index Scan using t2_pkey on t2 (cost=0.15..52.90 rows=267 width=8) Filter: (val > 1) -> Index Scan using t1_pkey on t1 (cost=0.16..279.91 rows=3334 width=8) Filter: (val < 100) Optimizer: Postgres query optimizer (8 rows)
执行相同SQL语句模板的查询计划,语句如下:
EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 80 and t2.val > 20;
返回信息如下,可以看到Hint也可以对同SQL语句模板的其它SQL语句生效:
QUERY PLAN ----------------------------------------------------------------------------------- Gather Motion 3:1 (slice1; segments: 3) (cost=8.25..38.24 rows=13 width=16) -> Merge Join (cost=8.25..38.24 rows=5 width=16) Merge Cond: (t1.id = t2.id) -> Index Scan using t1_pkey on t1 (cost=0.16..279.91 rows=2667 width=8) Filter: (val < 80) -> Sort (cost=8.09..8.09 rows=1 width=8) Sort Key: t2.id -> Index Scan using t2_val on t2 (cost=0.15..8.08 rows=1 width=8) Index Cond: (val > 20) Optimizer: Postgres query optimizer (10 rows)
可选:删除所有该参数化语句的Hint,语句如下:
SELECT * FROM hint_plan.delete_all_hint_table($$SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;$$);
语句中部分常数固定,注册Hint后执行计划:
使用PREPARE语句注册SQL语句模板,语句如下:
SELECT * FROM hint_plan.insert_hint_table($$/*+ MergeJoin(t1 t2) Leading((t2 t1)) */PREPARE a AS SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > $1;$$);
返回信息如下:
-[ RECORD 1 ]---------+-------------------------------------------------------------------------- id | 1 norm_query_string | PREPARE a AS SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < $1 and t2.val > $2; application_name | hints | MergeJoin(t1 t2) Leading((t2 t1)) query_hash | -4733464863014584191 enable | t prepare_param_strings | {100,$1}
执行变更了参数化常数的执行计划,语句如下:
EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 2;
参数化常数变化后,Hint依然可以生效,JOIN类型由HashJoin转变为MergeJoin,JOIN顺序由(t1,t2)转变为(t2,t1),返回信息如下:
QUERY PLAN ----------------------------------------------------------------------------------- Gather Motion 3:1 (slice1; segments: 3) (cost=0.31..93.54 rows=800 width=16) -> Merge Join (cost=0.31..93.54 rows=267 width=16) Merge Cond: (t1.id = t2.id) -> Index Scan using t1_pkey on t1 (cost=0.16..279.91 rows=3334 width=8) Filter: (val < 100) -> Index Scan using t2_pkey on t2 (cost=0.15..52.90 rows=267 width=8) Filter: (val > 2) Optimizer: Postgres query optimizer (8 rows)
执行变更了固定常数的执行计划,语句如下:
EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 80 and t2.val > 20;
固定常数变化后,Hint失效,不再干预执行计划,返回信息如下:
QUERY PLAN ----------------------------------------------------------------------------------- Gather Motion 3:1 (slice1; segments: 3) (cost=8.09..175.22 rows=13 width=16) -> Hash Join (cost=8.09..175.22 rows=5 width=16) Hash Cond: (t1.id = t2.id) -> Seq Scan on t1 (cost=0.00..137.00 rows=2667 width=8) Filter: (val < 80) -> Hash (cost=8.08..8.08 rows=1 width=8) -> Index Scan using t2_val on t2 (cost=0.15..8.08 rows=1 width=8) Index Cond: (val > 20) Optimizer: Postgres query optimizer (9 rows)
可选:删除所有该参数化语句的Hint,语句如下:
SELECT * FROM hint_plan.delete_all_hint_table($$SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;$$);