本文将对AnalyticDB PostgreSQL版Serverless模式数据共享源端和目标端的查询性能进行测试。
说明 本文的TPC-H的实现基于TPC-H的基准测试,并不能与已发布的TPC-H基准测试结果相比较,本文中的测试并不符合TPC-H基准测试的所有要求。
测试说明
数据共享源端和目标端实例规格一致,实例具体信息如下:
- Segment节点规格:4C16G
- Segment节点数量:4个
- 地域及可用区:新加坡(可用区A)
- 内核版本:V1.0.1.0
创建实例的具体操作,请参见创建实例。
本次测试将使用dbgen工具生成100 GB原始数据,如何安装dbgen并导入数据,请参见生成测试数据。
测试步骤
- 将源端实例和目标端实例加入数据共享,具体操作,请参见开启实例的数据共享。
- 连接源端实例,进行以下操作:
- 连接实例,具体操作,请参见客户端连接。
说明 本次性能测试使用的客户端为psql。
- 创建一个名为db01的数据库,并切换到db01数据库,语句如下:
CREATE DATABASE db01; \c db01
- 查询db01的UUID,语句如下:
SELECT current_database_uuid();
- 创建一个名为tpch的Schema,并将其设置为默认Schema,语句如下:
CREATE SCHEMA IF NOT EXISTS tpch; SET search_path = tpch;
- 创建TPC-H的八张测试表,语句如下:
CREATE TABLE customer ( c_custkey integer NOT NULL, c_name character varying(25) NOT NULL, c_address character varying(40) NOT NULL, c_nationkey integer NOT NULL, c_phone character(15) NOT NULL, c_acctbal numeric(15,2) NOT NULL, c_mktsegment character(10) NOT NULL, c_comment character varying(117) NOT NULL ) distributed by (c_custkey); CREATE TABLE lineitem ( l_orderkey bigint NOT NULL, l_partkey integer NOT NULL, l_suppkey integer NOT NULL, l_linenumber integer NOT NULL, l_quantity numeric(15,2) NOT NULL, l_extendedprice numeric(15,2) NOT NULL, l_discount numeric(15,2) NOT NULL, l_tax numeric(15,2) NOT NULL, l_returnflag character(1) NOT NULL, l_linestatus character(1) NOT NULL, l_shipdate date NOT NULL, l_commitdate date NOT NULL, l_receiptdate date NOT NULL, l_shipinstruct character(25) NOT NULL, l_shipmode char(10) NOT NULL, l_comment varchar(44) NOT NULL ) distributed by (l_orderkey); CREATE TABLE nation ( n_nationkey integer NOT NULL, n_name character(25) NOT NULL, n_regionkey integer NOT NULL, n_comment character varying(152) ) distributed by (n_nationkey); CREATE TABLE orders ( o_orderkey bigint NOT NULL, o_custkey integer NOT NULL, o_orderstatus character(1) NOT NULL, o_totalprice numeric(15,2) NOT NULL, o_orderdate date NOT NULL, o_orderpriority character(15) NOT NULL, o_clerk character(15) NOT NULL, o_shippriority integer NOT NULL, o_comment character varying(79) NOT NULL ) distributed by (o_orderkey); CREATE TABLE part ( p_partkey integer NOT NULL, p_name character varying(55) NOT NULL, p_mfgr character(25) NOT NULL, p_brand character(10) NOT NULL, p_type character varying(25) NOT NULL, p_size integer NOT NULL, p_container character(10) NOT NULL, p_retailprice numeric(15,2) NOT NULL, p_comment character varying(23) NOT NULL ) distributed by (p_partkey); CREATE TABLE partsupp ( ps_partkey integer NOT NULL, ps_suppkey integer NOT NULL, ps_availqty integer NOT NULL, ps_supplycost numeric(15,2) NOT NULL, ps_comment character varying(199) NOT NULL ) distributed by (ps_partkey); CREATE TABLE region ( r_regionkey integer NOT NULL, r_name character(25) NOT NULL, r_comment character varying(152) ) distributed by (r_regionkey); CREATE TABLE supplier ( s_suppkey integer NOT NULL, s_name character(25) NOT NULL, s_address character varying(40) NOT NULL, s_nationkey integer NOT NULL, s_phone character(15) NOT NULL, s_acctbal numeric(15,2) NOT NULL, s_comment character varying(101) NOT NULL ) distributed by (s_suppkey);
- 导入测试数据。您可以通过OSS外表或\COPY命令将数据导入到AnalyticDB PostgreSQL版Serverless模式,具体操作,请参见使用OSS外表高速导入OSS数据或使用\COPY命令导入本地数据。
以下示例语句为\COPY方式导入数据,请将
'/path/to/localfile'
替换为您测试数据所在的真实路径:\COPY customer FROM '/path/to/localfile'; \COPY lineitem FROM '/path/to/localfile'; \COPY nation FROM '/path/to/localfile'; \COPY orders FROM '/path/to/localfile'; \COPY part FROM '/path/to/localfile'; \COPY partsupp FROM '/path/to/localfile'; \COPY region FROM '/path/to/localfile'; \COPY supplier FROM '/path/to/localfile';
- 连接实例,具体操作,请参见客户端连接。
- 连接目标端实例,进行以下操作:
- 连接实例,具体操作,请参见客户端连接。
- 创建一个名为db02的数据库,并切换到db02数据库,语句如下:
CREATE DATABASE db02; \c db02
- 查询db02的UUID,语句如下:
SELECT current_database_uuid();
- 在源端实例上创建datashare,将测试表加入数据共享,并授权给db02,具体操作如下:
- 创建datashare,语句如下:
CREATE DATASHARE s01;
- 将八张测试表加入数据共享,语句如下:
ALTER DATASHARE s01 ADD TABLE tpch_col.supplier; ALTER DATASHARE s01 ADD TABLE tpch_col.region; ALTER DATASHARE s01 ADD TABLE tpch_col.partsupp; ALTER DATASHARE s01 ADD TABLE tpch_col.part; ALTER DATASHARE s01 ADD TABLE tpch_col.orders; ALTER DATASHARE s01 ADD TABLE tpch_col.nation; ALTER DATASHARE s01 ADD TABLE tpch_col.lineitem; ALTER DATASHARE s01 ADD TABLE tpch_col.customer;
- 将datashare授权给目标端的db02,语句如下:
GRANT USAGE ON DATASHARE s01 TO DATABASE "db02-uuid";
说明 请将"db02-uuid"
替换为步骤二中实际获取到db02的UUID。
- 创建datashare,语句如下:
- 在目标端实例中导入数据共享s01,并执行ANALYZE收集统计信息,具体步骤如下:
- 导入数据共享s01,语句如下:
IMPORT DATASHARE s01 AS s01a FROM DATABASE "db01-uuid";
说明 请将"db01-uuid"
替换为步骤一中实际获取到db01的UUID。 - 对数据共享的八张表进行ANALYZE,语句如下:
ANALYZE customer; ANALYZE lineitem; ANALYZE nation; ANALYZE orders; ANALYZE part; ANALYZE partsupp; ANALYZE region; ANALYZE supplier;
- 导入数据共享s01,语句如下:
- 执行TPC-H的22条查询。
说明 进行测试前请将optimizer参数设置为off,如何修改配置参数,请参见参数配置。
-- Q1 select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order from lineitem where l_shipdate <= date '1998-12-01' - interval '93 day' group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus; -- Q2 select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment from part, supplier, partsupp, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and p_size = 23 and p_type like '%STEEL' and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'EUROPE' and ps_supplycost = ( select min(ps_supplycost) from partsupp, supplier, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'EUROPE' ) order by s_acctbal desc, n_name, s_name, p_partkey limit 100; -- Q3 select l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, o_orderdate, o_shippriority from customer, orders, lineitem where c_mktsegment = 'MACHINERY' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < date '1995-03-24' and l_shipdate > date '1995-03-24' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10; -- Q4 select o_orderpriority, count(*) as order_count from orders where o_orderdate >= date '1996-08-01' and o_orderdate < date '1996-08-01' + interval '3' month and exists ( select * from lineitem where l_orderkey = o_orderkey and l_commitdate < l_receiptdate ) group by o_orderpriority order by o_orderpriority; -- Q6 select sum(l_extendedprice * l_discount) as revenue from lineitem where l_shipdate >= date '1994-01-01' and l_shipdate < date '1994-01-01' + interval '1' year and l_discount between 0.06 - 0.01 and 0.06 + 0.01 and l_quantity < 24; -- Q7 select supp_nation, cust_nation, l_year, sum(volume) as revenue from ( select n1.n_name as supp_nation, n2.n_name as cust_nation, extract(year from l_shipdate) as l_year, l_extendedprice * (1 - l_discount) as volume from supplier, lineitem, orders, customer, nation n1, nation n2 where s_suppkey = l_suppkey and o_orderkey = l_orderkey and c_custkey = o_custkey and s_nationkey = n1.n_nationkey and c_nationkey = n2.n_nationkey and ( (n1.n_name = 'JORDAN' and n2.n_name = 'INDONESIA') or (n1.n_name = 'INDONESIA' and n2.n_name = 'JORDAN') ) and l_shipdate between date '1995-01-01' and date '1996-12-31' ) as shipping group by supp_nation, cust_nation, l_year order by supp_nation, cust_nation, l_year; -- Q8 select o_year, sum(case when nation = 'INDONESIA' then volume else 0 end) / sum(volume) as mkt_share from ( select extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) as volume, n2.n_name as nation from part, supplier, lineitem, orders, customer, nation n1, nation n2, region where p_partkey = l_partkey and s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between date '1995-01-01' and date '1996-12-31' and p_type = 'STANDARD BRUSHED BRASS' ) as all_nations group by o_year order by o_year; -- Q9 select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and p_partkey = l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%chartreuse%' ) as profit group by nation, o_year order by nation, o_year desc; -- Q10 select c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue, c_acctbal, n_name, c_address, c_phone, c_comment from customer, orders, lineitem, nation where c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate >= date '1994-08-01' and o_orderdate < date '1994-08-01' + interval '3' month and l_returnflag = 'R' and c_nationkey = n_nationkey group by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment order by revenue desc limit 20; -- Q11 select ps_partkey, sum(ps_supplycost * ps_availqty) as value from partsupp, supplier, nation where ps_suppkey = s_suppkey and s_nationkey = n_nationkey and n_name = 'INDONESIA' group by ps_partkey having sum(ps_supplycost * ps_availqty) > ( select sum(ps_supplycost * ps_availqty) * 0.0001000000 from partsupp, supplier, nation where ps_suppkey = s_suppkey and s_nationkey = n_nationkey and n_name = 'INDONESIA' ) order by value desc; -- Q12 select l_shipmode, sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 else 0 end) as high_line_count, sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 else 0 end) as low_line_count from orders, lineitem where o_orderkey = l_orderkey and l_shipmode in ('REG AIR', 'TRUCK') and l_commitdate < l_receiptdate and l_shipdate < l_commitdate and l_receiptdate >= date '1994-01-01' and l_receiptdate < date '1994-01-01' + interval '1' year group by l_shipmode order by l_shipmode; -- Q13 select c_count, count(*) as custdist from ( select c_custkey, count(o_orderkey) from customer left outer join orders on c_custkey = o_custkey and o_comment not like '%pending%requests%' group by c_custkey ) as c_orders (c_custkey, c_count) group by c_count order by custdist desc, c_count desc; -- Q14 select 100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount) else 0 end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue from lineitem, part where l_partkey = p_partkey and l_shipdate >= date '1994-11-01' and l_shipdate < date '1994-11-01' + interval '1' month; -- Q15 create view revenue0 (supplier_no, total_revenue) as select l_suppkey, sum(l_extendedprice * (1 - l_discount)) from lineitem where l_shipdate >= date '1997-10-01' and l_shipdate < date '1997-10-01' + interval '3' month group by l_suppkey; select s_suppkey, s_name, s_address, s_phone, total_revenue from supplier, revenue0 where s_suppkey = supplier_no and total_revenue = ( select max(total_revenue) from revenue0 ) order by s_suppkey; drop view revenue0; -- Q16 select p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt from partsupp, part where p_partkey = ps_partkey and p_brand <> 'Brand#44' and p_type not like 'SMALL BURNISHED%' and p_size in (36, 27, 34, 45, 11, 6, 25, 16) and ps_suppkey not in ( select s_suppkey from supplier where s_comment like '%Customer%Complaints%' ) group by p_brand, p_type, p_size order by supplier_cnt desc, p_brand, p_type, p_size; -- Q17 select sum(l_extendedprice) / 7.0 as avg_yearly from lineitem, part where p_partkey = l_partkey and p_brand = 'Brand#42' and p_container = 'JUMBO PACK' and l_quantity < ( select 0.2 * avg(l_quantity) from lineitem where l_partkey = p_partkey ); -- Q18 select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_orderkey in ( select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 312 ) and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate limit 100; -- Q19 select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where ( p_partkey = l_partkey and p_brand = 'Brand#43' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 5 and l_quantity <= 5 + 10 and p_size between 1 and 5 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#45' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 12 and l_quantity <= 12 + 10 and p_size between 1 and 10 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#11' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 24 and l_quantity <= 24 + 10 and p_size between 1 and 15 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ); -- Q20 select s_name, s_address from supplier, nation where s_suppkey in ( select ps_suppkey from partsupp where ps_partkey in ( select p_partkey from part where p_name like 'magenta%' ) and ps_availqty > ( select 0.5 * sum(l_quantity) from lineitem where l_partkey = ps_partkey and l_suppkey = ps_suppkey and l_shipdate >= date '1996-01-01' and l_shipdate < date '1996-01-01' + interval '1' year ) ) and s_nationkey = n_nationkey and n_name = 'RUSSIA' order by s_name; -- Q21 select s_name, count(*) as numwait from supplier, lineitem l1, orders, nation where s_suppkey = l1.l_suppkey and o_orderkey = l1.l_orderkey and o_orderstatus = 'F' and l1.l_receiptdate > l1.l_commitdate and exists ( select * from lineitem l2 where l2.l_orderkey = l1.l_orderkey and l2.l_suppkey <> l1.l_suppkey ) and not exists ( select * from lineitem l3 where l3.l_orderkey = l1.l_orderkey and l3.l_suppkey <> l1.l_suppkey and l3.l_receiptdate > l3.l_commitdate ) and s_nationkey = n_nationkey and n_name = 'MOZAMBIQUE' group by s_name order by numwait desc, s_name limit 100; -- Q22 select cntrycode, count(*) as numcust, sum(c_acctbal) as totacctbal from ( select substring(c_phone from 1 for 2) as cntrycode, c_acctbal from customer where substring(c_phone from 1 for 2) in ('13', '31', '23', '29', '30', '18', '17') and c_acctbal > ( select avg(c_acctbal) from customer where c_acctbal > 0.00 and substring(c_phone from 1 for 2) in ('13', '31', '23', '29', '30', '18', '17') ) and not exists ( select * from orders where o_custkey = c_custkey ) ) as custsale group by cntrycode order by cntrycode;
测试结果
查询 | 源端查询耗时(单位:秒) | 目标端查询耗时(单位:秒) |
---|---|---|
Q1 | 287.04 | 291.46 |
Q2 | 18.49 | 20.14 |
Q3 | 143.08 | 169.46 |
Q4 | 61.54 | 72.78 |
Q5 | 105.46 | 152.77 |
Q6 | 23.78 | 32.56 |
Q7 | 84.42 | 96.63 |
Q8 | 77.01 | 87.33 |
Q9 | 329.42 | 340.8 |
Q10 | 81.72 | 89.85 |
Q11 | 18.18 | 18.24 |
Q12 | 62.93 | 70.79 |
Q13 | 141.13 | 146.47 |
Q14 | 29.35 | 38.33 |
Q15 | 56.76 | 74.08 |
Q16 | 20.47 | 20.27 |
Q17 | 944.35 | 960.16 |
Q18 | 228.83 | 256.7 |
Q19 | 57.03 | 65.63 |
Q20 | 192.67 | 199.1 |
Q21 | 289.82 | 303.06 |
Q22 | 48.86 | 57.07 |
总时间 | 3302.32 | 3563.69 |
结论
在TPC-H 100 GB测试场景中,数据共享的目标端查询性能能够达到源端的90%以上。