本文将对AnalyticDB PostgreSQL版Serverless模式数据共享源端和目标端的查询性能进行测试。

说明 本文的TPC-H的实现基于TPC-H的基准测试,并不能与已发布的TPC-H基准测试结果相比较,本文中的测试并不符合TPC-H基准测试的所有要求。

测试说明

数据共享源端和目标端实例规格一致,实例具体信息如下:

  • Segment节点规格:4C16G
  • Segment节点数量:4个
  • 地域及可用区:新加坡(可用区A)
  • 内核版本:V1.0.1.0

创建实例的具体操作,请参见创建实例

本次测试将使用dbgen工具生成100 GB原始数据,如何安装dbgen并导入数据,请参见生成测试数据

测试步骤

  1. 将源端实例和目标端实例加入数据共享,具体操作,请参见开启实例的数据共享
  2. 连接源端实例,进行以下操作:
    1. 连接实例,具体操作,请参见客户端连接
      说明 本次性能测试使用的客户端为psql。
    2. 创建一个名为db01的数据库,并切换到db01数据库,语句如下:
      CREATE DATABASE db01;
      \c db01
    3. 查询db01的UUID,语句如下:
      SELECT current_database_uuid();
    4. 创建一个名为tpch的Schema,并将其设置为默认Schema,语句如下:
      CREATE SCHEMA IF NOT EXISTS tpch;
      SET search_path = tpch;
    5. 创建TPC-H的八张测试表,语句如下:
      CREATE TABLE customer (
          c_custkey integer NOT NULL,
          c_name character varying(25) NOT NULL,
          c_address character varying(40) NOT NULL,
          c_nationkey integer NOT NULL,
          c_phone character(15) NOT NULL,
          c_acctbal numeric(15,2) NOT NULL,
          c_mktsegment character(10) NOT NULL,
          c_comment character varying(117) NOT NULL
      )
      distributed by (c_custkey);
      
      CREATE TABLE lineitem (
          l_orderkey bigint NOT NULL,
          l_partkey integer NOT NULL,
          l_suppkey integer NOT NULL,
          l_linenumber integer NOT NULL,
          l_quantity numeric(15,2) NOT NULL,
          l_extendedprice numeric(15,2) NOT NULL,
          l_discount numeric(15,2) NOT NULL,
          l_tax numeric(15,2) NOT NULL,
          l_returnflag character(1) NOT NULL,
          l_linestatus character(1) NOT NULL,
          l_shipdate date NOT NULL,
          l_commitdate date NOT NULL,
          l_receiptdate date NOT NULL,
          l_shipinstruct character(25) NOT NULL,
          l_shipmode char(10) NOT NULL,
          l_comment varchar(44) NOT NULL
      )
      distributed by (l_orderkey);
      
      CREATE TABLE nation (
          n_nationkey integer NOT NULL,
          n_name character(25) NOT NULL,
          n_regionkey integer NOT NULL,
          n_comment character varying(152)
      )
      distributed by (n_nationkey);
      
      CREATE TABLE orders (
          o_orderkey bigint NOT NULL,
          o_custkey integer NOT NULL,
          o_orderstatus character(1) NOT NULL,
          o_totalprice numeric(15,2) NOT NULL,
          o_orderdate date NOT NULL,
          o_orderpriority character(15) NOT NULL,
          o_clerk character(15) NOT NULL,
          o_shippriority integer NOT NULL,
          o_comment character varying(79) NOT NULL
      )
      distributed by (o_orderkey);
      
      CREATE TABLE part (
          p_partkey integer NOT NULL,
          p_name character varying(55) NOT NULL,
          p_mfgr character(25) NOT NULL,
          p_brand character(10) NOT NULL,
          p_type character varying(25) NOT NULL,
          p_size integer NOT NULL,
          p_container character(10) NOT NULL,
          p_retailprice numeric(15,2) NOT NULL,
          p_comment character varying(23) NOT NULL
      )
      distributed by (p_partkey);
      
      CREATE TABLE partsupp (
          ps_partkey integer NOT NULL,
          ps_suppkey integer NOT NULL,
          ps_availqty integer NOT NULL,
          ps_supplycost numeric(15,2) NOT NULL,
          ps_comment character varying(199) NOT NULL
      )
      distributed by (ps_partkey);
      
      CREATE TABLE region (
          r_regionkey integer NOT NULL,
          r_name character(25) NOT NULL,
          r_comment character varying(152)
      )
      distributed by (r_regionkey);
      
      CREATE TABLE supplier (
          s_suppkey integer NOT NULL,
          s_name character(25) NOT NULL,
          s_address character varying(40) NOT NULL,
          s_nationkey integer NOT NULL,
          s_phone character(15) NOT NULL,
          s_acctbal numeric(15,2) NOT NULL,
          s_comment character varying(101) NOT NULL
      )
      distributed by (s_suppkey);
    6. 导入测试数据。您可以通过OSS外表或\COPY命令将数据导入到AnalyticDB PostgreSQL版Serverless模式,具体操作,请参见使用OSS外表高速导入OSS数据使用\COPY命令导入本地数据

      以下示例语句为\COPY方式导入数据,请将'/path/to/localfile'替换为您测试数据所在的真实路径:

      \COPY customer FROM '/path/to/localfile';
      \COPY lineitem FROM '/path/to/localfile';
      \COPY nation   FROM '/path/to/localfile';
      \COPY orders   FROM '/path/to/localfile';
      \COPY part     FROM '/path/to/localfile';
      \COPY partsupp FROM '/path/to/localfile';
      \COPY region   FROM '/path/to/localfile';
      \COPY supplier FROM '/path/to/localfile';
  3. 连接目标端实例,进行以下操作:
    1. 连接实例,具体操作,请参见客户端连接
    2. 创建一个名为db02的数据库,并切换到db02数据库,语句如下:
      CREATE DATABASE db02;
      \c db02
    3. 查询db02的UUID,语句如下:
      SELECT current_database_uuid();
  4. 在源端实例上创建datashare,将测试表加入数据共享,并授权给db02,具体操作如下:
    1. 创建datashare,语句如下:
      CREATE DATASHARE s01;
    2. 将八张测试表加入数据共享,语句如下:
      ALTER DATASHARE s01 ADD TABLE tpch_col.supplier;
      ALTER DATASHARE s01 ADD TABLE tpch_col.region;
      ALTER DATASHARE s01 ADD TABLE tpch_col.partsupp;
      ALTER DATASHARE s01 ADD TABLE tpch_col.part;
      ALTER DATASHARE s01 ADD TABLE tpch_col.orders;
      ALTER DATASHARE s01 ADD TABLE tpch_col.nation;
      ALTER DATASHARE s01 ADD TABLE tpch_col.lineitem;
      ALTER DATASHARE s01 ADD TABLE tpch_col.customer;
    3. 将datashare授权给目标端的db02,语句如下:
      GRANT USAGE ON DATASHARE s01 TO DATABASE "db02-uuid";
      说明 请将"db02-uuid"替换为步骤二中实际获取到db02的UUID。
  5. 在目标端实例中导入数据共享s01,并执行ANALYZE收集统计信息,具体步骤如下:
    1. 导入数据共享s01,语句如下:
      IMPORT DATASHARE s01 AS s01a FROM DATABASE "db01-uuid";
      说明 请将"db01-uuid"替换为步骤一中实际获取到db01的UUID。
    2. 对数据共享的八张表进行ANALYZE,语句如下:
      ANALYZE customer;
      ANALYZE lineitem;
      ANALYZE nation;
      ANALYZE orders;
      ANALYZE part;
      ANALYZE partsupp;
      ANALYZE region;
      ANALYZE supplier;
  6. 执行TPC-H的22条查询。
    说明 进行测试前请将optimizer参数设置为off,如何修改配置参数,请参见参数配置
    -- Q1
    select
        l_returnflag,
        l_linestatus,
        sum(l_quantity) as sum_qty,
        sum(l_extendedprice) as sum_base_price,
        sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
        sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
        avg(l_quantity) as avg_qty,
        avg(l_extendedprice) as avg_price,
        avg(l_discount) as avg_disc,
        count(*) as count_order
    from
        lineitem
    where
        l_shipdate <= date '1998-12-01' - interval '93 day'
    group by
        l_returnflag,
        l_linestatus
    order by
        l_returnflag,
        l_linestatus;
        
    -- Q2
    select
        s_acctbal,
        s_name,
        n_name,
        p_partkey,
        p_mfgr,
        s_address,
        s_phone,
        s_comment
    from
        part,
        supplier,
        partsupp,
        nation,
        region
    where
        p_partkey = ps_partkey
        and s_suppkey = ps_suppkey
        and p_size = 23
        and p_type like '%STEEL'
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and r_name = 'EUROPE'
        and ps_supplycost = (
            select
                min(ps_supplycost)
            from
                partsupp,
                supplier,
                nation,
                region
            where
                p_partkey = ps_partkey
                and s_suppkey = ps_suppkey
                and s_nationkey = n_nationkey
                and n_regionkey = r_regionkey
                and r_name = 'EUROPE'
        )
    order by
        s_acctbal desc,
        n_name,
        s_name,
        p_partkey
    limit 100;
    
    -- Q3
    select
        l_orderkey,
        sum(l_extendedprice * (1 - l_discount)) as revenue,
        o_orderdate,
        o_shippriority
    from
        customer,
        orders,
        lineitem
    where
        c_mktsegment = 'MACHINERY'
        and c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate < date '1995-03-24'
        and l_shipdate > date '1995-03-24'
    group by
        l_orderkey,
        o_orderdate,
        o_shippriority
    order by
        revenue desc,
        o_orderdate
    limit 10;
    
    -- Q4
    select
        o_orderpriority,
        count(*) as order_count
    from
        orders
    where
        o_orderdate >= date '1996-08-01'
        and o_orderdate < date '1996-08-01' + interval '3' month
        and exists (
            select
                *
            from
                lineitem
            where
                l_orderkey = o_orderkey
                and l_commitdate < l_receiptdate
        )
    group by
        o_orderpriority
    order by
        o_orderpriority;
        
    -- Q6
    select
        sum(l_extendedprice * l_discount) as revenue
    from
        lineitem
    where
        l_shipdate >= date '1994-01-01'
        and l_shipdate < date '1994-01-01' + interval '1' year
        and l_discount between 0.06 - 0.01 and 0.06 + 0.01
        and l_quantity < 24;
        
    -- Q7
    select
        supp_nation,
        cust_nation,
        l_year,
        sum(volume) as revenue
    from
        (
            select
                n1.n_name as supp_nation,
                n2.n_name as cust_nation,
                extract(year from l_shipdate) as l_year,
                l_extendedprice * (1 - l_discount) as volume
            from
                supplier,
                lineitem,
                orders,
                customer,
                nation n1,
                nation n2
            where
                s_suppkey = l_suppkey
                and o_orderkey = l_orderkey
                and c_custkey = o_custkey
                and s_nationkey = n1.n_nationkey
                and c_nationkey = n2.n_nationkey
                and (
                    (n1.n_name = 'JORDAN' and n2.n_name = 'INDONESIA')
                    or (n1.n_name = 'INDONESIA' and n2.n_name = 'JORDAN')
                )
                and l_shipdate between date '1995-01-01' and date '1996-12-31'
        ) as shipping
    group by
        supp_nation,
        cust_nation,
        l_year
    order by
        supp_nation,
        cust_nation,
        l_year;
    
    -- Q8
    select
        o_year,
        sum(case
            when nation = 'INDONESIA' then volume
            else 0
        end) / sum(volume) as mkt_share
    from
        (
            select
                extract(year from o_orderdate) as o_year,
                l_extendedprice * (1 - l_discount) as volume,
                n2.n_name as nation
            from
                part,
                supplier,
                lineitem,
                orders,
                customer,
                nation n1,
                nation n2,
                region
            where
                p_partkey = l_partkey
                and s_suppkey = l_suppkey
                and l_orderkey = o_orderkey
                and o_custkey = c_custkey
                and c_nationkey = n1.n_nationkey
                and n1.n_regionkey = r_regionkey
                and r_name = 'ASIA'
                and s_nationkey = n2.n_nationkey
                and o_orderdate between date '1995-01-01' and date '1996-12-31'
                and p_type = 'STANDARD BRUSHED BRASS'
        ) as all_nations
    group by
        o_year
    order by
        o_year;
        
    -- Q9
    select
        nation,
        o_year,
        sum(amount) as sum_profit
    from
        (
            select
                n_name as nation,
                extract(year from o_orderdate) as o_year,
                l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
            from
                part,
                supplier,
                lineitem,
                partsupp,
                orders,
                nation
            where
                s_suppkey = l_suppkey
                and ps_suppkey = l_suppkey
                and ps_partkey = l_partkey
                and p_partkey = l_partkey
                and o_orderkey = l_orderkey
                and s_nationkey = n_nationkey
                and p_name like '%chartreuse%'
        ) as profit
    group by
        nation,
        o_year
    order by
        nation,
        o_year desc;
        
    -- Q10
    select
        c_custkey,
        c_name,
        sum(l_extendedprice * (1 - l_discount)) as revenue,
        c_acctbal,
        n_name,
        c_address,
        c_phone,
        c_comment
    from
        customer,
        orders,
        lineitem,
        nation
    where
        c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate >= date '1994-08-01'
        and o_orderdate < date '1994-08-01' + interval '3' month
        and l_returnflag = 'R'
        and c_nationkey = n_nationkey
    group by
        c_custkey,
        c_name,
        c_acctbal,
        c_phone,
        n_name,
        c_address,
        c_comment
    order by
        revenue desc
    limit 20;
    
    -- Q11
    select
        ps_partkey,
        sum(ps_supplycost * ps_availqty) as value
    from
        partsupp,
        supplier,
        nation
    where
        ps_suppkey = s_suppkey
        and s_nationkey = n_nationkey
        and n_name = 'INDONESIA'
    group by
        ps_partkey having
            sum(ps_supplycost * ps_availqty) > (
                select
                    sum(ps_supplycost * ps_availqty) * 0.0001000000
                from
                    partsupp,
                    supplier,
                    nation
                where
                    ps_suppkey = s_suppkey
                    and s_nationkey = n_nationkey
                    and n_name = 'INDONESIA'
            )
    order by
        value desc;
    
    -- Q12
    select
        l_shipmode,
        sum(case
            when o_orderpriority = '1-URGENT'
                or o_orderpriority = '2-HIGH'
                then 1
            else 0
        end) as high_line_count,
        sum(case
            when o_orderpriority <> '1-URGENT'
                and o_orderpriority <> '2-HIGH'
                then 1
            else 0
        end) as low_line_count
    from
        orders,
        lineitem
    where
        o_orderkey = l_orderkey
        and l_shipmode in ('REG AIR', 'TRUCK')
        and l_commitdate < l_receiptdate
        and l_shipdate < l_commitdate
        and l_receiptdate >= date '1994-01-01'
        and l_receiptdate < date '1994-01-01' + interval '1' year
    group by
        l_shipmode
    order by
        l_shipmode;
        
    -- Q13
    select
        c_count,
        count(*) as custdist
    from
        (
            select
                c_custkey,
                count(o_orderkey)
            from
                customer left outer join orders on
                    c_custkey = o_custkey
                    and o_comment not like '%pending%requests%'
            group by
                c_custkey
        ) as c_orders (c_custkey, c_count)
    group by
        c_count
    order by
        custdist desc,
        c_count desc;
        
    -- Q14
    select
        100.00 * sum(case
            when p_type like 'PROMO%'
                then l_extendedprice * (1 - l_discount)
            else 0
        end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
    from
        lineitem,
        part
    where
        l_partkey = p_partkey
        and l_shipdate >= date '1994-11-01'
        and l_shipdate < date '1994-11-01' + interval '1' month;
        
    -- Q15
    create view revenue0 (supplier_no, total_revenue) as
        select
            l_suppkey,
            sum(l_extendedprice * (1 - l_discount))
        from
            lineitem
        where
            l_shipdate >= date '1997-10-01'
            and l_shipdate < date '1997-10-01' + interval '3' month
        group by
            l_suppkey;
    select
        s_suppkey,
        s_name,
        s_address,
        s_phone,
        total_revenue
    from
        supplier,
        revenue0
    where
        s_suppkey = supplier_no
        and total_revenue = (
            select
                max(total_revenue)
            from
                revenue0
        )
    order by
        s_suppkey;
    drop view revenue0;
    
    -- Q16
    select
        p_brand,
        p_type,
        p_size,
        count(distinct ps_suppkey) as supplier_cnt
    from
        partsupp,
        part
    where
        p_partkey = ps_partkey
        and p_brand <> 'Brand#44'
        and p_type not like 'SMALL BURNISHED%'
        and p_size in (36, 27, 34, 45, 11, 6, 25, 16)
        and ps_suppkey not in (
            select
                s_suppkey
            from
                supplier
            where
                s_comment like '%Customer%Complaints%'
        )
    group by
        p_brand,
        p_type,
        p_size
    order by
        supplier_cnt desc,
        p_brand,
        p_type,
        p_size;
    
    -- Q17
    select
        sum(l_extendedprice) / 7.0 as avg_yearly
    from
        lineitem,
        part
    where
        p_partkey = l_partkey
        and p_brand = 'Brand#42'
        and p_container = 'JUMBO PACK'
        and l_quantity < (
            select
                0.2 * avg(l_quantity)
            from
                lineitem
            where
                l_partkey = p_partkey
        );
        
    -- Q18
    select
        c_name,
        c_custkey,
        o_orderkey,
        o_orderdate,
        o_totalprice,
        sum(l_quantity)
    from
        customer,
        orders,
        lineitem
    where
        o_orderkey in (
            select
                l_orderkey
            from
                lineitem
            group by
                l_orderkey having
                    sum(l_quantity) > 312
        )
        and c_custkey = o_custkey
        and o_orderkey = l_orderkey
    group by
        c_name,
        c_custkey,
        o_orderkey,
        o_orderdate,
        o_totalprice
    order by
        o_totalprice desc,
        o_orderdate
    limit 100;
    
    -- Q19
    select
        sum(l_extendedprice* (1 - l_discount)) as revenue
    from
        lineitem,
        part
    where
        (
            p_partkey = l_partkey
            and p_brand = 'Brand#43'
            and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
            and l_quantity >= 5 and l_quantity <= 5 + 10
            and p_size between 1 and 5
            and l_shipmode in ('AIR', 'AIR REG')
            and l_shipinstruct = 'DELIVER IN PERSON'
        )
        or
        (
            p_partkey = l_partkey
            and p_brand = 'Brand#45'
            and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
            and l_quantity >= 12 and l_quantity <= 12 + 10
            and p_size between 1 and 10
            and l_shipmode in ('AIR', 'AIR REG')
            and l_shipinstruct = 'DELIVER IN PERSON'
        )
        or
        (
            p_partkey = l_partkey
            and p_brand = 'Brand#11'
            and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
            and l_quantity >= 24 and l_quantity <= 24 + 10
            and p_size between 1 and 15
            and l_shipmode in ('AIR', 'AIR REG')
            and l_shipinstruct = 'DELIVER IN PERSON'
        );
    
    -- Q20
    select
        s_name,
        s_address
    from
        supplier,
        nation
    where
        s_suppkey in (
            select
                ps_suppkey
            from
                partsupp
            where
                ps_partkey in (
                    select
                        p_partkey
                    from
                        part
                    where
                        p_name like 'magenta%'
                )
                and ps_availqty > (
                    select
                        0.5 * sum(l_quantity)
                    from
                        lineitem
                    where
                        l_partkey = ps_partkey
                        and l_suppkey = ps_suppkey
                        and l_shipdate >= date '1996-01-01'
                        and l_shipdate < date '1996-01-01' + interval '1' year
                )
        )
        and s_nationkey = n_nationkey
        and n_name = 'RUSSIA'
    order by
        s_name;
    
    -- Q21
    select
        s_name,
        count(*) as numwait
    from
        supplier,
        lineitem l1,
        orders,
        nation
    where
        s_suppkey = l1.l_suppkey
        and o_orderkey = l1.l_orderkey
        and o_orderstatus = 'F'
        and l1.l_receiptdate > l1.l_commitdate
        and exists (
            select
                *
            from
                lineitem l2
            where
                l2.l_orderkey = l1.l_orderkey
                and l2.l_suppkey <> l1.l_suppkey
        )
        and not exists (
            select
                *
            from
                lineitem l3
            where
                l3.l_orderkey = l1.l_orderkey
                and l3.l_suppkey <> l1.l_suppkey
                and l3.l_receiptdate > l3.l_commitdate
        )
        and s_nationkey = n_nationkey
        and n_name = 'MOZAMBIQUE'
    group by
        s_name
    order by
        numwait desc,
        s_name
    limit 100;
    
    -- Q22
    select
            cntrycode,
            count(*) as numcust,
            sum(c_acctbal) as totacctbal
    from
            (
                    select
                            substring(c_phone from 1 for 2) as cntrycode,
                            c_acctbal
                    from
                            customer
                    where
                            substring(c_phone from 1 for 2) in
                                    ('13', '31', '23', '29', '30', '18', '17')
                            and c_acctbal > (
                                    select
                                            avg(c_acctbal)
                                    from
                                            customer
                                    where
                                            c_acctbal > 0.00
                                            and substring(c_phone from 1 for 2) in
                                                    ('13', '31', '23', '29', '30', '18', '17')
                            )
                            and not exists (
                                    select
                                            *
                                    from
                                            orders
                                    where
                                            o_custkey = c_custkey
                            )
            ) as custsale
    group by
            cntrycode
    order by
            cntrycode;

测试结果

查询 源端查询耗时(单位:秒) 目标端查询耗时(单位:秒)
Q1 287.04 291.46
Q2 18.49 20.14
Q3 143.08 169.46
Q4 61.54 72.78
Q5 105.46 152.77
Q6 23.78 32.56
Q7 84.42 96.63
Q8 77.01 87.33
Q9 329.42 340.8
Q10 81.72 89.85
Q11 18.18 18.24
Q12 62.93 70.79
Q13 141.13 146.47
Q14 29.35 38.33
Q15 56.76 74.08
Q16 20.47 20.27
Q17 944.35 960.16
Q18 228.83 256.7
Q19 57.03 65.63
Q20 192.67 199.1
Q21 289.82 303.06
Q22 48.86 57.07
总时间 3302.32 3563.69
数据共享性能测试

结论

在TPC-H 100 GB测试场景中,数据共享的目标端查询性能能够达到源端的90%以上。