本文以TPC-H为例,为您介绍并行查询使用示例。
本文的TPC-H的实现基于TPC-H的基准测试,并不能与已发布的TPC-H基准测试结果相比较,本文中的测试并不符合TPC-H基准测试的所有要求。
测试设计
测试用数据量:100 GB(Scalar Factor = 100)
测试用PolarDB MySQL版8.0版本的集群:节点规格为88核710 GB,在主节点上进行测试。
GROUP BY和ORDER BY支持
原始SQL语句如下所示:
SELECT l_returnflag,
l_linestatus,
Sum(l_quantity) AS sum_qty,
Sum(l_extendedprice) AS sum_base_price,
Sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
Avg(l_quantity) AS avg_qty,
Avg(l_extendedprice) AS avg_price,
Avg(l_discount) AS avg_disc,
Count(*) AS count_order
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - INTERVAL '93' day
GROUP BY l_returnflag,
l_linestatus
ORDER BY l_returnflag,
l_linestatus ;
未开启并行查询,耗时1563.32秒。
开启并行查询后,耗时只用49.65秒,提升31.48 倍。
AGGREGATE函数支持(SUM/AVG/COUNT)
原始SQL语句,如下所示:
SELECT l_returnflag,
l_linestatus,
Sum(l_quantity) AS sum_qty,
Sum(l_extendedprice) AS sum_base_price,
Sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
Avg(l_quantity) AS avg_qty,
Avg(l_extendedprice) AS avg_price,
Avg(l_discount) AS avg_disc,
Count(*) AS count_order
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - INTERVAL '93' day
GROUP BY l_returnflag,
l_linestatus
ORDER BY l_returnflag,
l_linestatus ;
未开启并行查询,耗时1563.32秒。
开启并行查询后,耗时只用49.65秒,提升31.48 倍。
JOIN支持
原始SQL语句,如下所示:
select sum(l_extendedprice* (1 - l_discount)) as revenue
from lineitem, part
where ( p_partkey = l_partkey and p_brand = 'Brand#12'
and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
and l_quantity >= 6 and l_quantity <= 6 + 10
and p_size between 1 and 5
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON' )
or ( p_partkey = l_partkey and p_brand = 'Brand#13'
and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
and l_quantity >= 10 and l_quantity <= 10 + 10
and p_size between 1 and 10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON' )
or ( p_partkey = l_partkey and p_brand = 'Brand#24'
and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
and l_quantity >= 21 and l_quantity <= 21 + 10
and p_size between 1 and 15
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON' );
未开启并行查询,耗时21.73秒。
开启并行查询后,耗时1.37秒,提升15.86倍。
BETWEEN函数和IN函数支持
原始SQL语句,如下所示:
select sum(l_extendedprice* (1 - l_discount)) as revenue
from lineitem, part
where ( p_partkey = l_partkey and p_brand = 'Brand#12'
and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
and l_quantity >= 6 and l_quantity <= 6 + 10
and p_size between 1 and 5
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON' )
or ( p_partkey = l_partkey and p_brand = 'Brand#13'
and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
and l_quantity >= 10 and l_quantity <= 10 + 10
and p_size between 1 and 10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON' )
or ( p_partkey = l_partkey and p_brand = 'Brand#24'
and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
and l_quantity >= 21 and l_quantity <= 21 + 10
and p_size between 1 and 15
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON' );
未开启并行查询,耗时21.73秒。
开启并行查询后,耗时1.37秒,提升15.86倍。
LIMIT支持
原始SQL语句,如下所示:
select l_shipmode, sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1
else 0
end) as high_line_count, sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1
else 0
end) as low_line_count
from orders, lineitem
where o_orderkey = l_orderkey
and l_shipmode in ('MAIL', 'TRUCK')
and l_commitdate < l_receiptdate
and l_shipdate < l_commitdate
and l_receiptdate >= date '1996-01-01'
and l_receiptdate < date '1996-01-01' + interval '1' year
group by l_shipmode
order by l_shipmode limit 10;
未开启并行查询,耗时339.22 秒。
开启并行查询后,耗时29.31秒,提升11.57倍。
INTERVAL函数支持
原始SQL语句,如下所示:
select
100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount)
else 0
end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from lineitem, part
where l_partkey = p_partkey
and l_shipdate >= date '1996-01-01'
and l_shipdate < date '1996-01-01' + interval '1' month limit 10;
未开启并行查询,耗时220.87秒。
开启并行查询后,耗时7.75秒,提升28.5倍。
CASE WHEN支持
原始SQL语句,如下所示:
select
100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount)
else 0
end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from lineitem, part
where l_partkey = p_partkey
and l_shipdate >= date '1996-01-01'
and l_shipdate < date '1996-01-01' + interval '1' month limit 10;
未开启并行查询,耗时220.87秒。
开启并行查询后,耗时7.75秒,提升28.5倍。
LIKE支持
原始SQL语句,如下所示:
select s_name, s_address from
supplier, nation where
s_suppkey in
( select ps_suppkey from partsupp where
ps_partkey in ( select p_partkey from part where p_name like 'dark%')
and ps_availqty>(select 0.0005 * sum(l_quantity) as col1
from lineitem, partsupp
where l_partkey = ps_partkey and l_suppkey = ps_suppkey
and l_shipdate >= date '1993-01-01' and l_shipdate < date '1993-01-01' + interval '1' year)
)
and s_nationkey = n_nationkey and n_name = 'JORDAN'
order by s_name limit 10;
未开启并行查询,耗时427.46秒。
开启并行查询后,耗时33.72秒,提升12.68倍。
SUBQUERY支持
原始SQL语句,如下所示:
select
s_acctbal,
s_name,
n_name,
p_partkey,
p_mfgr,
s_address,
s_phone,
s_comment
from
part,
supplier,
partsupp,
nation,
region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and p_size = 35
and p_type like '%STEEL'
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
and ps_supplycost = (
select
min(ps_supplycost)
from
partsupp,
supplier,
nation,
region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'AMERICA'
)
order by
s_acctbal desc,
n_name,
s_name,
p_partkey;
limit 1;
未开启并行查询,耗时9.27秒。
开启并行查询后,耗时1.12秒,提升8.28倍。
GROUP BY WITH ROLLUP支持
GROUP BY WITH ROLLUP详细介绍请参见MySQL-with rollup函数运用和MySQL ROLLUP。
原始SQL语句,如下所示:
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date_sub('1998-12-01', interval ':1' day)
group by
l_returnflag,
l_linestatus
with rollup
order by
l_returnflag,
l_linestatus;
未开启并行查询,耗时318.73秒。
开启并行查询后,耗时22.30秒,提升14.29倍。
INSERT ... SELECT和REPLACE ... SELECT支持
原始SQL语句,如下所示:
insert into line_item_ap
SELECT l_returnflag,
l_linestatus,
Sum(l_quantity) AS sum_qty,
Sum(l_extendedprice) AS sum_base_price,
Sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
Avg(l_quantity) AS avg_qty,
Avg(l_extendedprice) AS avg_price,
Avg(l_discount) AS avg_disc,
Count(*) AS count_order
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - INTERVAL '93' day
GROUP BY l_returnflag,
l_linestatus
ORDER BY l_returnflag,
l_linestatus ;
未开启并行查询,耗时182.82秒。
开启并行查询后,耗时23.25秒,提升7.86倍。