文档

IN谓词转JOIN

更新时间:

PolarDB支持IN谓词转JOIN功能。对于满足前提条件的复杂查询,通过该功能优化器可以将某些大的IN谓词转换为JOIN,从而提升复杂查询的执行性能。

前提条件

  • 集群版本需为PolarDB MySQL版8.0版本且修订版本需为8.0.2.2.10或以上。如何查看集群版本,请参见查询版本号

  • IN列表中的元素个数超过loose_in_predicate_conversion_threshold参数设置的个数。

  • [NOT]IN条件位于WHEREON子句的顶层。

使用方法

您可以通过loose_in_predicate_conversion_threshold参数设置IN谓词转JOIN功能。具体操作请参见设置集群参数和节点参数

参数名称

级别

描述

loose_in_predicate_conversion_threshold

Global

IN谓词转JOIN功能控制开关。

当SQL语句的IN列表中的元素个数大于或等于该参数值时,SQL语句进行转换,将IN谓词转换为JOIN。取值范围:0~18446744073709551615。默认值为5000。

说明

当该参数设置为0时,表示关闭该功能。

示例

原查询:

mysql> EXPLAIN  SELECT * FROM t WHERE a IN (1,2,3,5,5);
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |  160 |    50.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN format=tree SELECT * FROM t WHERE a IN (1,2,3,5,5);
+------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                              |
+------------------------------------------------------------------------------------------------------+
| -> Filter: (t.a IN (1,2,3,5,5))  (cost=16.25 rows=80)
    -> TABLE scan ON t  (cost=16.25 rows=160)
 |
+------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

转换后的查询:

mysql> SET in_predicate_conversion_threshold=5;
mysql> EXPLAIN  SELECT * FROM t WHERE a IN (1,2,3,5,5);
+----+-------------+------------+------------+--------+---------------------+---------------------+---------+----------+------+----------+--------------------------+
| id | select_type | table      | partitions | type   | possible_keys       | key                 | key_len | ref      | rows | filtered | Extra                    |
+----+-------------+------------+------------+--------+---------------------+---------------------+---------+----------+------+----------+--------------------------+
|  1 | PRIMARY     | t          | NULL       | ALL    | NULL                | NULL                | NULL    | NULL     |  160 |   100.00 | Using where              |
|  1 | PRIMARY     | <derived3> | NULL       | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 8       | test.t.a |    1 |   100.00 | Using where; Using index |
|  3 | DERIVED     | NULL       | NULL       | NULL   | NULL                | NULL                | NULL    | NULL     | NULL |     NULL | IN-list Converted        |
+----+-------------+------------+------------+--------+---------------------+---------------------+---------+----------+------+----------+--------------------------+
mysql> EXPLAIN format=tree SELECT * FROM t1 WHERE a IN (1,2,3,5,5);
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                    |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Nested loop semijoin
    -> Filter: (t1.a IS NOT NULL)  (cost=0.55 rows=3)
        -> TABLE scan ON t1  (cost=0.55 rows=3)
    -> Filter: (t1.a = tvc_0._col_1)
        -> Index lookup ON tvc_0 using <auto_key0> (_col_1=t1.a)
            -> Materialize
                -> scan ON in-list: 5 rows

性能测试

在TPC-H 100 GB测试数据集下,分别开启和关闭IN谓词转JOIN功能并执行以下SQL:

SELECT sum(l_extendedprice) / 7.0 AS avg_yearly
FROM lineitem
WHERE l_partkey IN (9628136, 19958441, 10528766, ......); #IN表达式含有10w常量值

性能对比如下图所示:

性能对比

通过上图可以发现,开启IN谓词转JOIN功能,进行IN谓词转换后,查询性能提升了18.9倍。