通过Hint干预执行计划

云原生数据仓库AnalyticDB PostgreSQL版支持通过Hint干预执行计划,如指定SQL语句使用或不使用ORCA优化器,干预JOIN顺序和JOIN类型。

前提条件

  • 实例内核小版本为V6.3.7.0及以上版本,如何查看和升级内核小版本,请参见版本升级

  • 已安装pg_hint_plan插件,具体信息,请参见使用Hint

指定SQL语句使用或不使用ORCA优化器

  1. 开启Hint干预计划并启用Hint注册规则,语句如下:

    SET pg_hint_plan.enable_hint to on;
    SET pg_hint_plan.enable_hint_table to on;
  2. 创建用于测试的表t1和表t2,建表语句如下:

    CREATE TABLE t1 (id int PRIMARY KEY, val int);
    CREATE TABLE t2 (id int PRIMARY KEY, val int);
  3. 分别往两张表中插入测试数据,语句如下:

    INSERT INTO t1 SELECT i, i % 100 FROM (SELECT generate_series(1, 10000) i) t;
    INSERT INTO t2 SELECT i, i % 10 FROM (SELECT generate_series(1, 1000) i) t;
  4. 创建索引,语句如下:

    CREATE INDEX t1_val ON t1 (val);
    CREATE INDEX t2_val ON t2 (val);
  5. 强制关闭ORCA优化器,语句如下:

    SELECT * FROM hint_plan.insert_hint_table($$/*+ SET(optimizer off) */EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.val = t2.val WHERE t1.id = 1;$$);

    返回信息如下:

    -[ RECORD 1 ]---------+-----------------------------------------------------
    id                    | 1
    norm_query_string     | explain select * from t1 join t2 on t1.val = t2.val;
    application_name      |
    hints                 | set(optimizer off)
    query_hash            | -2169095602568752481
    enable                | t
    prepare_param_strings | {}
  6. 输出执行计划,语句如下:

    EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.val = t2.val WHERE t1.id = 3;

    强制关闭后,执行计划使用了Postgres query optimizer优化器,返回信息如下:

                                               QUERY PLAN
    -------------------------------------------------------------------------------------------------
     Gather Motion 3:1  (slice2; segments: 3)  (cost=8.25..27.61 rows=11 width=16)
       ->  Hash Join  (cost=8.25..27.61 rows=4 width=16)
             Hash Cond: (t2.val = t1.val)
             ->  Seq Scan on t2  (cost=0.00..13.00 rows=334 width=8)
             ->  Hash  (cost=8.22..8.22 rows=1 width=8)
                   ->  Broadcast Motion 1:3  (slice1; segments: 1)  (cost=0.16..8.22 rows=3 width=8)
                         ->  Index Scan using t1_pkey on t1  (cost=0.16..8.18 rows=1 width=8)
                               Index Cond: (id = 3)
     Optimizer: Postgres query optimizer
    (9 rows)
  7. 强制启用ORCA优化器,语句如下:

    SELECT * FROM hint_plan.insert_hint_table($$/*+ SET(optimizer on) SET(rds_optimizer_options 0) */SELECT * FROM t1 WHERE t1.id = 1;$$);

    返回信息如下:

    -[ RECORD 1 ]---------+------------------------------------------------
    id                    | 2
    norm_query_string     | select * from t1 where t1.id = $1;
    application_name      |
    hints                 | set(optimizer on) set(rds_optimizer_options 0)
    query_hash            | -8281826471521807124
    enable                | t
    prepare_param_strings | {}
  8. 输出执行计划,语句如下:

    EXPLAIN SELECT * FROM t1 WHERE t1.id = 2;

    强制启用后,执行计划使用了ORCA优化器,返回信息如下:

                                     QUERY PLAN
    ----------------------------------------------------------------------------
     Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..6.00 rows=1 width=8)
       ->  Index Scan using t1_pkey on t1  (cost=0.00..6.00 rows=1 width=8)
             Index Cond: (id = 1)
     Optimizer: Pivotal Optimizer (GPORCA) version 3.86.0
    (4 rows)

干预JOIN顺序和JOIN类型

  1. 开启Hint干预计划并启用Hint注册功能,语句如下:

    SET optimizer to off;
    SET pg_hint_plan.enable_hint to on;
    SET pg_hint_plan.enable_hint_table to on;
  2. 创建用于测试的表t1和表t2,建表语句如下:

    CREATE TABLE t1 (id int PRIMARY KEY, val int);
    CREATE TABLE t2 (id int PRIMARY KEY, val int);
  3. 分别往两张表中插入测试数据,语句如下:

    INSERT INTO t1 SELECT i, i % 100 FROM (SELECT generate_series(1, 10000) i) t;
    INSERT INTO t2 SELECT i, i % 10 FROM (SELECT generate_series(1, 1000) i) t;
  4. 创建索引,语句如下:

    CREATE INDEX t1_val ON t1 (val);
    CREATE INDEX t2_val ON t2 (val);
  5. 输出执行计划,语句如下:

    EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;

    此处为默认执行计划,JOIN类型和JOIN顺序为HashJoin(t1 t2) Leading((t1 t2)),返回信息如下:

                                        QUERY PLAN
    ----------------------------------------------------------------------------------
     Gather Motion 3:1  (slice1; segments: 3)  (cost=25.50..208.00 rows=800 width=16)
       ->  Hash Join  (cost=25.50..208.00 rows=267 width=16)
             Hash Cond: (t1.id = t2.id)
             ->  Seq Scan on t1  (cost=0.00..137.00 rows=3334 width=8)
                   Filter: (val < 100)
             ->  Hash  (cost=15.50..15.50 rows=267 width=8)
                   ->  Seq Scan on t2  (cost=0.00..15.50 rows=267 width=8)
                         Filter: (val > 1)
     Optimizer: Postgres query optimizer
  6. 以下几种方法可以干预JOIN顺序和JOIN类型:

    • 在输出执行计划的语句前添加Hint,语句如下:

      /*+ MergeJoin(t1 t2) Leading((t2 t1)) */ EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;

      JOIN类型由HashJoin转变为MergeJoin,JOIN顺序由(t1,t2)转变为(t2,t1),返回信息如下:

                                          QUERY PLAN
      -----------------------------------------------------------------------------------
       Gather Motion 3:1  (slice1; segments: 3)  (cost=0.31..93.54 rows=800 width=16)
         ->  Merge Join  (cost=0.31..93.54 rows=267 width=16)
               Merge Cond: (t2.id = t1.id)
               ->  Index Scan using t2_pkey on t2  (cost=0.15..52.90 rows=267 width=8)
                     Filter: (val > 1)
               ->  Index Scan using t1_pkey on t1  (cost=0.16..279.91 rows=3334 width=8)
                     Filter: (val < 100)
       Optimizer: Postgres query optimizer
      (8 rows)
    • 先注册Hint再输出执行计划:

      1. 注册SQL语句模板,语句如下:

        SELECT * FROM hint_plan.insert_hint_table($$/*+ MergeJoin(t1 t2) Leading((t2 t1)) */SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;$$);

        返回信息如下:

        -[ RECORD 1 ]---------+--------------------------------------------------------------------------
        id                    | 1
        norm_query_string     | SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < $1 and t2.val > $2;
        application_name      |
        hints                 | MergeJoin(t1 t2) Leading((t2 t1))
        query_hash            | -4733464863014584191
        enable                | t
        prepare_param_strings | {}
      2. 输出执行计划,语句如下:

        EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;

        Hint已生效,JOIN类型由HashJoin转变为MergeJoin,JOIN顺序由(t1,t2)转变为(t2,t1),返回信息如下:

                                            QUERY PLAN
        -----------------------------------------------------------------------------------
         Gather Motion 3:1  (slice1; segments: 3)  (cost=0.31..93.54 rows=800 width=16)
           ->  Merge Join  (cost=0.31..93.54 rows=267 width=16)
                 Merge Cond: (t2.id = t1.id)
                 ->  Index Scan using t2_pkey on t2  (cost=0.15..52.90 rows=267 width=8)
                       Filter: (val > 1)
                 ->  Index Scan using t1_pkey on t1  (cost=0.16..279.91 rows=3334 width=8)
                       Filter: (val < 100)
         Optimizer: Postgres query optimizer
        (8 rows)
      3. 执行相同SQL语句模板的查询计划,语句如下:

        EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 80 and t2.val > 20;

        返回信息如下,可以看到Hint也可以对同SQL语句模板的其它SQL语句生效:

                                            QUERY PLAN
        -----------------------------------------------------------------------------------
         Gather Motion 3:1  (slice1; segments: 3)  (cost=8.25..38.24 rows=13 width=16)
           ->  Merge Join  (cost=8.25..38.24 rows=5 width=16)
                 Merge Cond: (t1.id = t2.id)
                 ->  Index Scan using t1_pkey on t1  (cost=0.16..279.91 rows=2667 width=8)
                       Filter: (val < 80)
                 ->  Sort  (cost=8.09..8.09 rows=1 width=8)
                       Sort Key: t2.id
                       ->  Index Scan using t2_val on t2  (cost=0.15..8.08 rows=1 width=8)
                             Index Cond: (val > 20)
         Optimizer: Postgres query optimizer
        (10 rows)
      4. 可选:删除所有该参数化语句的Hint,语句如下:

        SELECT * FROM hint_plan.delete_all_hint_table($$SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;$$);
    • 语句中部分常数固定,注册Hint后执行计划:

      1. 使用PREPARE语句注册SQL语句模板,语句如下:

        SELECT * FROM hint_plan.insert_hint_table($$/*+ MergeJoin(t1 t2) Leading((t2 t1)) */PREPARE a AS SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > $1;$$);

        返回信息如下:

        -[ RECORD 1 ]---------+--------------------------------------------------------------------------
        id                    | 1
        norm_query_string     | PREPARE a AS SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < $1 and t2.val > $2;
        application_name      |
        hints                 | MergeJoin(t1 t2) Leading((t2 t1))
        query_hash            | -4733464863014584191
        enable                | t
        prepare_param_strings | {100,$1}
      2. 执行变更了参数化常数的执行计划,语句如下:

        EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 2;

        参数化常数变化后,Hint依然可以生效,JOIN类型由HashJoin转变为MergeJoin,JOIN顺序由(t1,t2)转变为(t2,t1),返回信息如下:

                                            QUERY PLAN
        -----------------------------------------------------------------------------------
         Gather Motion 3:1  (slice1; segments: 3)  (cost=0.31..93.54 rows=800 width=16)
           ->  Merge Join  (cost=0.31..93.54 rows=267 width=16)
                 Merge Cond: (t1.id = t2.id)
                 ->  Index Scan using t1_pkey on t1  (cost=0.16..279.91 rows=3334 width=8)
                       Filter: (val < 100)
                 ->  Index Scan using t2_pkey on t2  (cost=0.15..52.90 rows=267 width=8)
                       Filter: (val > 2)
         Optimizer: Postgres query optimizer
        (8 rows)
      3. 执行变更了固定常数的执行计划,语句如下:

        EXPLAIN SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 80 and t2.val > 20;

        固定常数变化后,Hint失效,不再干预执行计划,返回信息如下:

                                            QUERY PLAN
        -----------------------------------------------------------------------------------
         Gather Motion 3:1  (slice1; segments: 3)  (cost=8.09..175.22 rows=13 width=16)
           ->  Hash Join  (cost=8.09..175.22 rows=5 width=16)
                 Hash Cond: (t1.id = t2.id)
                 ->  Seq Scan on t1  (cost=0.00..137.00 rows=2667 width=8)
                       Filter: (val < 80)
                 ->  Hash  (cost=8.08..8.08 rows=1 width=8)
                       ->  Index Scan using t2_val on t2  (cost=0.15..8.08 rows=1 width=8)
                             Index Cond: (val > 20)
         Optimizer: Postgres query optimizer
        (9 rows)
      4. 可选:删除所有该参数化语句的Hint,语句如下:

        SELECT * FROM hint_plan.delete_all_hint_table($$SELECT * FROM t1, t2 WHERE t1.id = t2.id and t1.val < 100 and t2.val > 1;$$);