全部产品

子查询

更新时间:2019-01-17 19:45:15

与 MySQL 子查询的差异

在 MySQL 支持的子查询基础上,增加了以下限制,使用时需要注意。

  • 支持 WHERE 条件中的子查询以及列子查询,不支持HAVING 子句中的子查询JOIN ON 条件中的子查询
  • 等号操作行符的标量子查询(The Subquery as Scalar Operand)不支持 ROW 语法。例如:
    1. 支持:select * from tb1 where id in (select id from tb2)
    2. 支持:select * from tb1 where (id, name) in (select id, name from tb2)
    3. 支持:select * from tb1 where row(id, name) in (select id, name from tb2)
    4. 支持:select * from tb1 where row(id, name) not in (select id, name from tb2)
    5. 不支持:select * from tb1 where row(id, name) = (select id, name from tb2)
  • 目前仅支持 SELECT 语句中的子查询。 D​ELETE 语句中的子查询,目前不支持。例如:
    1. DELETE FROM t1 WHERE ROW(c1,c2) IN (
    2. SELECT c1, c2 FROM t2
    3. );

其它 INSERT, UPDATE, SET 中的子查询皆不支持。此处不再举出示例。

效率

DRDS 子查询部分只能转化为比较低效的 APPLY 执行器执行,在实际使用中请尽量避免以下类型的子查询。

  • WHERE 条件中 OR 与子查询共存时,执行效率会依外表数据情况大幅降低。例如:

    1. 高效:select * from tb1 where id in (select id from tb2)
    2. 高效:select * from tb1 where id in (select id from tb2) and id>3
    3. 低效:select * from tb1 where id in (select id from tb2) or id>3
  • 关联子查询(Correlated Subqueries)的关联项中带函数非等号运算符。例如:

    1. 高效:select * from tb1 a where id in
    2. (select id from tb2 b where a.name=b.name)
    3. 低效:select * from tb1 a where id in
    4. (select id from tb2 b where UPPER(a.name)=b.name)
    5. 低效:select * from tb1 a where id in
    6. (select id from tb2 b where a.decimal_test=abs(b.decimal_test))
    7. 低效:select * from tb1 a where id in
    8. (select id from tb2 b where a.name!=b.name)
    9. 低效:select * from tb1 a where id in
    10. (select id from tb2 b where a.name>=b.name)
  • 关联子查询(Correlated Subqueries)关联项与其它条件的逻辑运算符为 OR。例如:

    1. 高效:select * from tb1 a where id in
    2. (select id from tb2 b where a.name=b.name
    3. and b.date_test<'2015-12-02')
    4. 低效:select * from tb1 a where id in
    5. (select id from tb2 b where a.name=b.name
    6. or b.date_test<'2015-12-02')
    7. 低效:select * from tb1 a where id in
    8. (select id from tb2 b where a.name=b.name
    9. or b.date_test=a.date_test)
  • 标量子查询(The Subquery as Scalar Operand)带关联项。例如:

    1. 高效:select * from tb1 a where id >
    2. (select id from tb2 b where b.date_test<'2015-12-02')
    3. 低效:select * from tb1 a where id >
    4. (select id from tb2 b where a.name=b.name
    5. and b.date_test<'2015-12-02')
  • 跨嵌套层的子查询关联项。例如:

    • SQL 多层嵌套,但每层子查询关联项仅与次级上层关联,此类高效。

      1. 高效:select * from tb1 a where id in(select id from tb2 b
      2. where a.name=b.name and
      3. exists (select name from tb3 c where b.address=c.address))
    • SQL 多层嵌套,但是表 c 的子查询关联项中与表 a 的列进行了关联,此类低效。

      1. 低效:select * from tb1 a where id in(select id from tb2 b
      2. where a.name=b.name and
      3. exists (select name from tb3 c where a.address=c.address))
  • 子查询中包含 GROUP BY,请确保 GROUP BY 的分组列包含关联项。例如:

    • SQL 子查询中包含聚合函数和关联项,关联项 b.pk 被包含中分组列(pk)中,此类 SQL 高效。

      1. 高效:select * from tb1 a where exists
      2. (select pk from tb2 b
      3. where a.pk=b.pk and b.date_test='2003-04-05'
      4. group by pk);
    • SQL 子查询中包含聚合函数和关联项,关联项 b.date_test 不被包含中分组列(pk)中,此类 SQL 低效。

      1. 低效:select * from tb1 a where exists
      2. (select pk from tb2 b
      3. where a.date_test=b.date_test and b.date_test='2003-04-05'
      4. group by pk);

目前支持的子查询类别

Comparisons Using Subqueries

最常见的子查询类别,语法为:

  1. non_subquery_operand comparison_operator (subquery)
  2. comparison_operator = > < >= <= <> != <=> like

例如:

  1. ... WHERE 'a' = (SELECT column1 FROM t1)

注意:目前仅支持子查询在比较运算符的右边。

Subqueries with ANY, ALL, IN/NOT IN, EXISTS/NOT EXISTS

语法:

  1. operand comparison_operator ANY (subquery)
  2. operand comparison_operator ALL (subquery)
  3. operand IN (subquery)
  4. operand NOT IN (subquery)
  5. operand EXISTS (subquery)
  6. operand NOT EXISTS (subquery)
  7. comparison_operator:= > < >= <= <> !=

ANY : 如果子查询返回的任意一行满足 ANY 前的表达式,返回 TRUE,否则返回 FALSE。ALL:如果子查询返回所有行都满足 ALL 前的表达式,返回 TRUE,否则返回 FALSE。IN : IN 在子查询前使用时,与 =ANY 是等价的。例如:

  1. SELECT s1 FROM t1 WHERE s1 = ANY (SELECT s1 FROM t2);
  2. SELECT s1 FROM t1 WHERE s1 IN (SELECT s1 FROM t2);

NOT IN:NOT IN 在子查询前使用时,与 <>ALL 是等价的。例如:

  1. SELECT s1 FROM t1 WHERE s1 <> ALL (SELECT s1 FROM t2);
  2. SELECT s1 FROM t1 WHERE s1 NOT IN (SELECT s1 FROM t2);

EXISTS/NOT EXISTS:如果子查询返回任意行,EXISTS 子查询结果为 TRUE,NOT EXISTS 子查询结果为 FALSE。如果返回为空,则结果反过来。例如:

  1. SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);

如果子查询中包含任意行,即使只包含 NULL 的行值,WHERE 条件也会返回 TRUE。

Row Subqueries

Row Subqueries 支持以下表达式:

  1. = > < >= <= <> != <=>

例如:

  1. SELECT * FROM t1
  2. WHERE (col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);
  3. SELECT * FROM t1
  4. WHERE ROW(col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);

以上两个 SQL 是等价的。只有满足以下条件时,t1 表的数据行才会返回:

  • 子查询 (SELECT col3, col4 FROM t2 WHERE id = 10 ) 返回的仅有一行记录。返回多行会报错。
  • 子查询返回的 col3, col4 与主表的 col1, col2 相等。

Correlated Subqueries

Correlated Subqueries 是指子查询中包含对外层查询表的引用。例如:

  1. SELECT * FROM t1
  2. WHERE column1 = ANY (SELECT column1 FROM t2
  3. WHERE t2.column2 = t1.column2);

示例 SQL 子查询中并没有包含 t1 表及其列名 column2,此时会向上一层寻找 t1 表的引用。

注意:将 Correlated Subqueries 改写为 JOIN 有可能会提高其性能。

Derived Tables (Subqueries in the FROM Clause)

Derived Tables 是指在 FROM 子句中的子查询:

  1. SELECT ... FROM (subquery) [AS] tbl_name ...

例如:

  1. CREATE TABLE t1 (s1 INT, s2 CHAR(5), s3 FLOAT);
  2. INSERT INTO t1 VALUES (1,'1',1.0);
  3. INSERT INTO t1 VALUES (2,'2',2.0);
  4. SELECT sb1,sb2,sb3
  5. FROM (SELECT s1 AS sb1, s2 AS sb2, s3*2 AS sb3 FROM t1) AS sb
  6. WHERE sb1 > 1;

查询的结果为:2, '2', 4

从场景举例:假如现在需要分组数据 SUM 后的平均值,直接使用以下 SQL 无法得到想要的结果。

  1. SELECT AVG(SUM(column1)) FROM t1 GROUP BY column1;

此时可使用 Derived Tables 拿到需要的信息:

  1. SELECT AVG(sum_column1)
  2. FROM (SELECT SUM(column1) AS sum_column1
  3. FROM t1 GROUP BY column1) AS t1;

注意:

  • Derived Tables 必须拥有一个别名
  • Derived Tables 可以返回一个标量,列,行或表
  • Derived Tables 不可以成为 Correlated Subqueries,即不能包含子查询外部表的引用。