并行查询使用示例

本文以TPC-H为例,为您介绍并行查询使用示例。

说明

本文的TPC-H的实现基于TPC-H的基准测试,并不能与已发布的TPC-H基准测试结果相比较,本文中的测试并不符合TPC-H基准测试的所有要求。

测试设计

  • 测试用数据量:100 GB(Scalar Factor = 100)

  • 测试用PolarDB MySQL版8.0版本的集群:节点规格为88核710 GB,在主节点上进行测试。

GROUP BY和ORDER BY支持

原始SQL语句如下所示:

SELECT   l_returnflag, 
         l_linestatus, 
         Sum(l_quantity)                                       AS sum_qty, 
         Sum(l_extendedprice)                                  AS sum_base_price, 
         Sum(l_extendedprice * (1 - l_discount))               AS sum_disc_price, 
         Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge, 
         Avg(l_quantity)                                       AS avg_qty, 
         Avg(l_extendedprice)                                  AS avg_price, 
         Avg(l_discount)                                       AS avg_disc, 
         Count(*)                                              AS count_order 
FROM     lineitem
WHERE    l_shipdate <= date '1998-12-01' - INTERVAL '93' day 
GROUP BY l_returnflag, 
         l_linestatus 
ORDER BY l_returnflag, 
         l_linestatus ;
  • 未开启并行查询,耗时1563.32秒。不打开

  • 开启并行查询后,耗时只用49.65秒,提升31.48 倍。打开并行查询

AGGREGATE函数支持(SUM/AVG/COUNT)

原始SQL语句,如下所示:

SELECT   l_returnflag, 
         l_linestatus, 
         Sum(l_quantity)                                       AS sum_qty, 
         Sum(l_extendedprice)                                  AS sum_base_price, 
         Sum(l_extendedprice * (1 - l_discount))               AS sum_disc_price, 
         Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge, 
         Avg(l_quantity)                                       AS avg_qty, 
         Avg(l_extendedprice)                                  AS avg_price, 
         Avg(l_discount)                                       AS avg_disc, 
         Count(*)                                              AS count_order 
FROM     lineitem
WHERE    l_shipdate <= date '1998-12-01' - INTERVAL '93' day 
GROUP BY l_returnflag, 
         l_linestatus 
ORDER BY l_returnflag, 
         l_linestatus ;
  • 未开启并行查询,耗时1563.32秒。off

  • 开启并行查询后,耗时只用49.65秒,提升31.48 倍。on

JOIN支持

原始SQL语句,如下所示:

select sum(l_extendedprice* (1 - l_discount)) as revenue 
from   lineitem,   part 
where ( p_partkey = l_partkey and p_brand = 'Brand#12'
        and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') 
        and l_quantity >= 6 and l_quantity <= 6 + 10 
        and p_size between 1 and 5 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ) 
    or ( p_partkey = l_partkey and p_brand = 'Brand#13'
        and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') 
        and l_quantity >= 10 and l_quantity <= 10 + 10 
        and p_size between 1 and 10 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ) 
    or ( p_partkey = l_partkey and p_brand = 'Brand#24'
        and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') 
        and l_quantity >= 21 and l_quantity <= 21 + 10 
        and p_size between 1 and 15 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ); 
  • 未开启并行查询,耗时21.73秒。off

  • 开启并行查询后,耗时1.37秒,提升15.86倍。on

BETWEEN函数和IN函数支持

原始SQL语句,如下所示:

select sum(l_extendedprice* (1 - l_discount)) as revenue 
from   lineitem,   part 
where ( p_partkey = l_partkey and p_brand = 'Brand#12'
        and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') 
        and l_quantity >= 6 and l_quantity <= 6 + 10 
        and p_size between 1 and 5 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ) 
    or ( p_partkey = l_partkey and p_brand = 'Brand#13'
        and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') 
        and l_quantity >= 10 and l_quantity <= 10 + 10 
        and p_size between 1 and 10 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ) 
    or ( p_partkey = l_partkey and p_brand = 'Brand#24'
        and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') 
        and l_quantity >= 21 and l_quantity <= 21 + 10 
        and p_size between 1 and 15 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ); 
  • 未开启并行查询,耗时21.73秒。off

  • 开启并行查询后,耗时1.37秒,提升15.86倍。on

LIMIT支持

原始SQL语句,如下所示:

select l_shipmode, sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 
    else 0 
end) as high_line_count, sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 
else 0 
end) as low_line_count 
from   orders,   lineitem 
where o_orderkey = l_orderkey 
and l_shipmode in ('MAIL', 'TRUCK') 
and l_commitdate < l_receiptdate 
and l_shipdate < l_commitdate 
and l_receiptdate >= date '1996-01-01' 
and l_receiptdate < date '1996-01-01' + interval '1' year 
group by l_shipmode 
order by l_shipmode limit 10; 
  • 未开启并行查询,耗时339.22 秒。off

  • 开启并行查询后,耗时29.31秒,提升11.57倍。on

INTERVAL函数支持

原始SQL语句,如下所示:

select 
    100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount) 
    else 0 
end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue 
from   lineitem,   part 
where l_partkey = p_partkey
and l_shipdate >= date '1996-01-01' 
and l_shipdate < date '1996-01-01' + interval '1' month limit 10; 
  • 未开启并行查询,耗时220.87秒。off

  • 开启并行查询后,耗时7.75秒,提升28.5倍。on

CASE WHEN支持

原始SQL语句,如下所示:

select 
    100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount) 
    else 0 
end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue 
from   lineitem,   part 
where l_partkey = p_partkey
and l_shipdate >= date '1996-01-01' 
and l_shipdate < date '1996-01-01' + interval '1' month limit 10; 
  • 未开启并行查询,耗时220.87秒。off

  • 开启并行查询后,耗时7.75秒,提升28.5倍。on

LIKE支持

原始SQL语句,如下所示:

select s_name, s_address from
 supplier,  nation where
s_suppkey in 
    ( select ps_suppkey from  partsupp where
             ps_partkey in ( select p_partkey from  part where p_name like 'dark%')
            and ps_availqty>(select 0.0005 * sum(l_quantity) as col1
     from   lineitem,   partsupp
     where l_partkey = ps_partkey and l_suppkey = ps_suppkey
     and l_shipdate >= date '1993-01-01' and l_shipdate < date '1993-01-01' + interval '1' year)
    )
and s_nationkey = n_nationkey and n_name = 'JORDAN'
order by s_name limit 10; 
  • 未开启并行查询,耗时427.46秒。1

  • 开启并行查询后,耗时33.72秒,提升12.68倍。2

SUBQUERY支持

原始SQL语句,如下所示:

select
    s_acctbal,
    s_name,
    n_name,
    p_partkey,
    p_mfgr,
    s_address,
    s_phone,
    s_comment
from
    part,
    supplier,
    partsupp,
    nation,
    region
where
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = 35
    and p_type like '%STEEL'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'AMERICA'
    and ps_supplycost = (
        select
            min(ps_supplycost)
        from
            partsupp,
            supplier,
            nation,
            region
        where
            p_partkey = ps_partkey
            and s_suppkey = ps_suppkey
            and s_nationkey = n_nationkey
            and n_regionkey = r_regionkey
            and r_name = 'AMERICA'
    )
order by
    s_acctbal desc,
    n_name,
    s_name,
    p_partkey;
limit 1;
  • 未开启并行查询,耗时9.27秒。SUBQUERY支持-前

  • 开启并行查询后,耗时1.12秒,提升8.28倍。SUBQUERY支持-后

GROUP BY WITH ROLLUP支持

GROUP BY WITH ROLLUP详细介绍请参见MySQL-with rollup函数运用MySQL ROLLUP

原始SQL语句,如下所示:

select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    lineitem
where
    l_shipdate <= date_sub('1998-12-01', interval ':1' day)
group by
    l_returnflag,
    l_linestatus
with rollup
order by
    l_returnflag,
    l_linestatus;
  • 未开启并行查询,耗时318.73秒。GROUP BY WITH ROLLUP-前

  • 开启并行查询后,耗时22.30秒,提升14.29倍。GROUP BY WITH ROLLUP-后

INSERT ... SELECT和REPLACE ... SELECT支持

原始SQL语句,如下所示:

insert into line_item_ap
SELECT   l_returnflag, 
         l_linestatus, 
         Sum(l_quantity)                                       AS sum_qty, 
         Sum(l_extendedprice)                                  AS sum_base_price, 
         Sum(l_extendedprice * (1 - l_discount))               AS sum_disc_price, 
         Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge, 
         Avg(l_quantity)                                       AS avg_qty, 
         Avg(l_extendedprice)                                  AS avg_price, 
         Avg(l_discount)                                       AS avg_disc, 
         Count(*)                                              AS count_order 
FROM     lineitem
WHERE    l_shipdate <= date '1998-12-01' - INTERVAL '93' day 
GROUP BY l_returnflag, 
         l_linestatus 
ORDER BY l_returnflag, 
         l_linestatus ;
  • 未开启并行查询,耗时182.82秒。INSERT ... SELECT/REPLACE ... SELECT-前

  • 开启并行查询后,耗时23.25秒,提升7.86倍。INSERT ... SELECT/REPLACE ... SELECT-后