7.0版TPC-H性能测试

更新时间:
复制为 MD 格式

本文介绍如何通过TPC-H基准测试对云原生数据仓库 AnalyticDB PostgreSQL 版7.0版进行性能测试,并提供测试结果供参考。

TPC-H简介

TPC-H是由TPC(Transaction Processing Performance Council)组织发布的决策支持基准测试,由一套面向业务的临时查询和并发数据修改组成。该基准测试用于评估决策支持系统在大数据量下执行复杂查询的能力。更多信息,请参见TPC Benchmark H(TPC-H)

测试说明

  • 本文以100 GB数据规模为例进行测试,规格选型为8segment节点(每节点2C16G)。

  • 如需测试1 TB数据规模,推荐使用48segment节点(每节点2C16G),通过更多的计算节点发挥MPP架构优势。

测试结果

基于TPC-H 100 GB数据集,运行22条标准测试SQL,在以下两种场景下进行测试:

  • 开启自研计算引擎Laser。

  • 采用Greenplum原生计算引擎。

Laser计算引擎对比原生计算引擎带来了显著的性能提升,总执行时间约提升12倍。

image.png

详细的测试数据如下表所示。

Query

AnalyticDB PostgreSQL 7.0(秒)

Greenplum原生引擎(秒)

Q1

4.8

26.34

Q2

0.26

6.07

Q3

1.6

17.05

Q4

1.58

39.1

Q5

2.79

24.49

Q6

0.19

1.56

Q7

1.31

20.66

Q8

2.27

23.16

Q9

5.6

63.56

Q10

2.15

13.92

Q11

0.41

3.82

Q12

1.13

13.56

Q13

3.52

21.55

Q14

0.31

1.7

Q15

0.59

3.86

Q16

0.95

4.25

Q17

0.86

107.32

Q18

9.83

85.78

Q19

1.61

16.08

Q20

0.61

25.75

Q21

5.04

49.78

Q22

1.06

14.49

合计

48.47

583.85

测试步骤

步骤一:创建实例

创建用于测试的AnalyticDB PostgreSQL实例。具体操作,请参见创建实例

本文中100 GB测试所使用的实例规格如下:

配置项

配置值

实例资源类型

存储弹性模式

实例系列

高可用

引擎版本

7.0标准版

节点规格(Segment)

2C16G

节点数量

8

存储磁盘类型

ESSD云盘PL1

节点存储容量

200 GB

数据库内核版本

v7.4.2.1及以上

说明

如测试100 GB,推荐8个节点(每节点2C16G);如测试1 TB,推荐48个节点(每节点2C16G)。

步骤二:生成测试数据

  1. 下载并解压TPC-H官方提供的数据生成工具dbqgen

  2. 修改dbgen_data.sh中的参数,设置需要生成的数据量和文件块数量。

    重要

    -s参数为数据量,以GB为单位;-C参数为文件块的数量,强烈建议设置为集群Segment(计算节点)数量的整数倍。例如100 GB测试推荐设置为32,1 TB测试推荐设置为96。

    ./dbgen -s 100 -S $i -C 32 -f &
  3. 运行脚本生成测试数据。

    sh dbgen_data.sh
  4. 将生成的数据上传至OSS对象存储。

    如果您是第一次使用OSS,请参见命令行工具ossutil 1.0完成ossutil命令行工具的安装与配置,并开通OSS存储空间。然后执行以下命令将数据上传至OSS:

    ossutil64 -e <EndPoint> -i <AccessKeyID> -k <AccessKeySecret> cp $data_dir oss://<oss bucket>/<dir>/

步骤三:创建表并导入数据

  1. 创建fixnumeric Extension。

    CREATE EXTENSION fixnumeric WITH SCHEMA pg_catalog;
  2. 创建TPC-H测试表。

    DROP TABLE IF EXISTS NATION;
    CREATE TABLE NATION (
        N_NATIONKEY  INTEGER NOT NULL,
        N_NAME       CHAR(25) NOT NULL,
        N_REGIONKEY  INTEGER NOT NULL,
        N_COMMENT    VARCHAR(152)
    )
    USING beam
    WITH(compresstype=auto,compresslevel=1)
    DISTRIBUTED REPLICATED
    ;
    
    DROP TABLE IF EXISTS REGION;
    CREATE TABLE REGION (
        R_REGIONKEY  INTEGER NOT NULL,
        R_NAME       CHAR(25) NOT NULL,
        R_COMMENT    VARCHAR(152)
    )
    USING beam
    WITH(compresstype=auto,compresslevel=1)
    DISTRIBUTED REPLICATED
    ;
    
    DROP TABLE IF EXISTS PART;
    CREATE TABLE PART (
        P_PARTKEY     INTEGER NOT NULL,
        P_NAME        VARCHAR(55) NOT NULL,
        P_MFGR        CHAR(25) NOT NULL,
        P_BRAND       CHAR(10) NOT NULL,
        P_TYPE        VARCHAR(25) NOT NULL,
        P_SIZE        INTEGER NOT NULL,
        P_CONTAINER   CHAR(10) NOT NULL,
        P_RETAILPRICE FIXNUMERIC(19) NOT NULL,
        P_COMMENT     VARCHAR(23) NOT NULL
    )
    USING beam
    WITH(compresstype=auto,compresslevel=1)
    DISTRIBUTED BY (P_PARTKEY)
    ;
    
    DROP TABLE IF EXISTS SUPPLIER;
    CREATE TABLE SUPPLIER (
        S_SUPPKEY     INTEGER NOT NULL,
        S_NAME        CHAR(25) NOT NULL,
        S_ADDRESS     VARCHAR(40) NOT NULL,
        S_NATIONKEY   INTEGER NOT NULL,
        S_PHONE       CHAR(15) NOT NULL,
        S_ACCTBAL     FIXNUMERIC(19) NOT NULL,
        S_COMMENT     VARCHAR(101) NOT NULL
    )
    USING beam
    WITH(compresstype=auto,compresslevel=1)
    DISTRIBUTED BY (S_SUPPKEY)
    ;
    
    DROP TABLE IF EXISTS PARTSUPP;
    CREATE TABLE PARTSUPP (
        PS_PARTKEY     INTEGER NOT NULL,
        PS_SUPPKEY     INTEGER NOT NULL,
        PS_AVAILQTY    INTEGER NOT NULL,
        PS_SUPPLYCOST  FIXNUMERIC(19)  NOT NULL,
        PS_COMMENT     VARCHAR(199) NOT NULL
    )
    USING beam
    WITH(compresstype=auto,compresslevel=1)
    DISTRIBUTED BY (PS_PARTKEY)
    ;
    
    DROP TABLE IF EXISTS CUSTOMER;
    CREATE TABLE CUSTOMER (
        C_CUSTKEY     INTEGER NOT NULL,
        C_NAME        VARCHAR(25) NOT NULL,
        C_ADDRESS     VARCHAR(40) NOT NULL,
        C_NATIONKEY   INTEGER NOT NULL,
        C_PHONE       VARCHAR(15) NOT NULL,
        C_ACCTBAL     FIXNUMERIC(19)  NOT NULL,
        C_MKTSEGMENT  CHAR(10) NOT NULL,
        C_COMMENT     VARCHAR(117) NOT NULL
    )
    USING beam
    WITH(compresstype=auto,compresslevel=1)
    DISTRIBUTED BY (C_CUSTKEY)
    ;
    
    DROP TABLE IF EXISTS ORDERS;
    CREATE TABLE ORDERS (
        O_ORDERKEY       BIGINT NOT NULL,
        O_CUSTKEY        INTEGER NOT NULL,
        O_ORDERSTATUS    "CHAR" NOT NULL,
        O_TOTALPRICE     FIXNUMERIC(19) NOT NULL,
        O_ORDERDATE      DATE NOT NULL,
        O_ORDERPRIORITY  CHAR(15) NOT NULL,
        O_CLERK          CHAR(15) NOT NULL,
        O_SHIPPRIORITY   INTEGER NOT NULL,
        O_COMMENT        VARCHAR(79) NOT NULL
    )
    USING beam
    WITH(compresstype=auto,compresslevel=1)
    DISTRIBUTED BY (O_ORDERKEY)
    ORDER BY(O_ORDERDATE);
    
    CREATE TABLE LINEITEM(
        L_ORDERKEY       BIGINT NOT NULL,
        L_PARTKEY        INTEGER NOT NULL,
        L_SUPPKEY        INTEGER NOT NULL,
        L_LINENUMBER     INTEGER NOT NULL,
        L_QUANTITY       FIXNUMERIC(19) NOT NULL,
        L_EXTENDEDPRICE  FIXNUMERIC(19) NOT NULL,
        L_DISCOUNT       FIXNUMERIC(19) NOT NULL,
        L_TAX            FIXNUMERIC(19) NOT NULL,
        L_RETURNFLAG     "CHAR" NOT NULL,
        L_LINESTATUS     "CHAR" NOT NULL,
        L_SHIPDATE       DATE NOT NULL,
        L_COMMITDATE     DATE NOT NULL,
        L_RECEIPTDATE    DATE NOT NULL,
        L_SHIPINSTRUCT   CHAR(25) NOT NULL,
        L_SHIPMODE       CHAR(10) NOT NULL,
        L_COMMENT        VARCHAR(44) NOT NULL
    )
    USING beam
    WITH(compresstype=auto,compresslevel=1)
    DISTRIBUTED BY (L_ORDERKEY)
    ORDER BY(L_SHIPDATE);
  3. 创建OSS外表。

    -- 创建OSS Server
    CREATE SERVER oss_serv
    FOREIGN DATA WRAPPER oss_fdw 
    OPTIONS (
      endpoint '<EndPoint>',
      bucket '<Bucket>');
    
    -- 创建User Mapping
    CREATE USER MAPPING
    FOR PUBLIC SERVER oss_serv 
    OPTIONS (
      id '<AccessKeyID>',
      KEY '<AccessKeySecret>');
    
    -- 创建OSS外表
    CREATE FOREIGN TABLE ext_nation(
        N_NATIONKEY  INT, 
        N_NAME       VARCHAR(25), 
        N_REGIONKEY  INTEGER, 
        N_COMMENT    VARCHAR(152)
    ) 
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/nation.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_region(
        R_REGIONKEY INT, 
        R_NAME      CHAR(25),
        R_COMMENT   VARCHAR(152)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/region.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_lineitem ( 
        L_ORDERKEY      BIGINT, 
        L_PARTKEY       BIGINT, 
        L_SUPPKEY       BIGINT, 
        L_LINENUMBER    BIGINT, 
        L_QUANTITY      NUMERIC(15,2), 
        L_EXTENDEDPRICE NUMERIC(15,2), 
        L_DISCOUNT      NUMERIC(15,2), 
        L_TAX           NUMERIC(15,2), 
        L_RETURNFLAG    CHAR(1), 
        L_LINESTATUS    CHAR(1), 
        L_SHIPDATE      DATE, 
        L_COMMITDATE    DATE, 
        L_RECEIPTDATE   DATE, 
        L_SHIPINSTRUCT  CHAR(25), 
        L_SHIPMODE      CHAR(10), 
        L_COMMENT       VARCHAR(44)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/lineitem.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_orders ( 
        O_ORDERKEY      BIGINT, 
        O_CUSTKEY       BIGINT, 
        O_ORDERSTATUS   CHAR(1) , 
        O_TOTALPRICE    NUMERIC(15,2), 
        O_ORDERDATE     DATE, 
        O_ORDERPRIORITY CHAR(15), 
        O_CLERK         CHAR(15), 
        O_SHIPPRIORITY  BIGINT, 
        O_COMMENT       VARCHAR(79)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/orders.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_part ( 
        P_PARTKEY     BIGINT, 
        P_NAME        VARCHAR(55), 
        P_MFGR        CHAR(25), 
        P_BRAND       CHAR(10), 
        P_TYPE        VARCHAR(25), 
        P_SIZE        BIGINT, 
        P_CONTAINER   CHAR(10), 
        P_RETAILPRICE NUMERIC(15,2), 
        P_COMMENT     VARCHAR(23)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/part.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_partsupp ( 
        PS_PARTKEY    BIGINT, 
        PS_SUPPKEY    BIGINT, 
        PS_AVAILQTY   BIGINT, 
        PS_SUPPLYCOST NUMERIC(15,2), 
        PS_COMMENT    VARCHAR(199)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/partsupp.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_supplier (
        S_SUPPKEY   BIGINT, 
        S_NAME      CHAR(25), 
        S_ADDRESS   VARCHAR(40), 
        S_NATIONKEY BIGINT, 
        S_PHONE     CHAR(15), 
        S_ACCTBAL   DECIMAL(15,2), 
        S_COMMENT   VARCHAR(101)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/supplier.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_customer ( 
        C_CUSTKEY    BIGINT, 
        C_NAME       VARCHAR(25), 
        C_ADDRESS    VARCHAR(40), 
        C_NATIONKEY  BIGINT, 
        C_PHONE      VARCHAR(15), 
        C_ACCTBAL    NUMERIC(15,2), 
        C_MKTSEGMENT CHAR(10), 
        C_COMMENT    VARCHAR(117)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/customer.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
  4. 通过OSS外表导入数据至本地表。

    INSERT INTO nation SELECT * FROM ext_nation;
    INSERT INTO region SELECT * FROM ext_region;
    INSERT INTO orders SELECT * FROM ext_orders;
    INSERT INTO customer SELECT * FROM ext_customer;
    INSERT INTO partsupp SELECT * FROM ext_partsupp;
    INSERT INTO part SELECT * FROM ext_part;
    INSERT INTO supplier SELECT * FROM ext_supplier;
    INSERT INTO lineitem SELECT * FROM ext_lineitem;
  5. 优化表并收集统计信息。

    optimize lineitem ;
    optimize orders   ;
    optimize customer ;
    optimize supplier ;
    optimize part     ;
    optimize partsupp ;
    optimize region   ;
    optimize nation   ;
    
    ANALYZE fullscan lineitem ;
    ANALYZE fullscan orders   ;
    ANALYZE fullscan customer ;
    ANALYZE fullscan supplier ;
    ANALYZE fullscan part     ;
    ANALYZE fullscan partsupp ;
    ANALYZE fullscan region   ;
    ANALYZE fullscan nation   ;
  6. 删除OSS外表。

    DROP FOREIGN TABLE ext_lineitem;
    DROP FOREIGN TABLE ext_orders;
    DROP FOREIGN TABLE ext_customer;
    DROP FOREIGN TABLE ext_part;
    DROP FOREIGN TABLE ext_partsupp;
    DROP FOREIGN TABLE ext_supplier;
    DROP FOREIGN TABLE ext_nation;
    DROP FOREIGN TABLE ext_region;

数据量参考

100 GB数据集各表数据量如下:

表名

行数

lineitem

600,037,902

orders

150,000,000

customer

15,000,000

supplier

1,000,000

part

20,000,000

partsupp

80,000,000

region

5

nation

25

1 TB数据集各表数据量如下:

表名

行数

lineitem

5,999,989,709

orders

1,500,000,000

customer

150,000,000

supplier

10,000,000

part

200,000,000

partsupp

800,000,000

region

5

nation

25

步骤四:执行开箱配置

执行以下SQL开启Laser向量化计算引擎并设置内存参数。请将your_db替换为实际的数据库名。

ALTER DATABASE your_db SET laser.enable_vectorized_engine TO on;
ALTER DATABASE your_db SET statement_mem = '16GB';

步骤五:运行测试

TPC-H基准测试包含22条标准SQL查询(附录:TPC-H测试SQL)。以下提供自动化测试脚本,需要使用psql

  1. 创建测试目录结构。

    benchmark/
    ├── benchmark.sh
    └── query/
        ├── q1.sql
        ├── q2.sql
        ├── ...
        └── q22.sql
    • 22条测试SQL(附录:TPC-H测试SQL)分别保存为q1.sql~q22.sql,存放在benchmark/query/目录下。

    • 将以下测试脚本保存为benchmark/benchmark.sh

      #!/bin/bash
      
      usage() {
        cat <<EOF
      Usage:
        $0 <database> <round> <host>
      
      Params:
        database  必填,数据库名
        round     必填,执行轮数(正整数)
        host      必填,数据库主机地址,如 127.0.0.1 或 hostname
      
      Examples:
        # 在本机执行 adbpg,执行 3 轮
        $0 tpcds_db 3 127.0.0.1
      
        # 在远程主机执行 1 轮
        $0 tpcds_db 1 10.0.0.5
      EOF
      }
      
      # 参数: $1=database, $2=round, $3=host
      DATABASE="$1"
      ROUND="${2:-1}"
      HOST="$3"
      
      # 参数检查
      if [ -z "$DATABASE" ] || [ -z "$ROUND" ] || [ -z "$HOST" ]; then
        usage
        exit 1
      fi
      
      echo "Database: $DATABASE"
      echo "Round: $ROUND"
      echo "Host: $HOST"
      
      HOME_PATH=$(pwd)
      QUERY_PATH="$HOME_PATH/query/"
      LOG_PATH="$HOME_PATH/log/"
      mkdir -p "$LOG_PATH"
      
      timestamp=$(date +%Y%m%d_%H%M%S)
      log_file="$HOME_PATH/tpcds_query_rt_${timestamp}.csv"
      echo "query,cost,total_cost" > "$log_file"
      
      total_cost=0
      for ((i=1; i<=99; i++)); do
          last_cost=0
          for ((r=1; r<=ROUND; r++)); do
              echo "begin run TPC-DS Q${i} round ${r}/$ROUND on adbpg (db: $DATABASE, host: $HOST), $(date)"
              begin_time=$(date +%s.%N)
      
              psql -h "$HOST" -d "$DATABASE" -f "$QUERY_PATH/q$i.sql" > "$LOG_PATH/log${i}_r${r}.out"
              rc=$?
      
              end_time=$(date +%s.%N)
              last_cost=$(awk "BEGIN {print $end_time - $begin_time}")
              if [ $rc -ne 0 ] ; then
                  printf "run TPC-DS Q%s round %s on adbpg fail, cost: %.2f, %s\n" "$i" "$r" "$last_cost" "$(date)"
              else
                  printf "run TPC-DS Q%s round %s on adbpg succ, cost: %.2f, %s\n" "$i" "$r" "$last_cost" "$(date)"
              fi
          done
          cost=$last_cost
          total_cost=$(awk "BEGIN {print $total_cost + $cost}")
          printf "TPC-DS Q%s final (round %d): cost: %.2f, totalCost: %.2f, %s\n" "$i" "$ROUND" "$cost" "$total_cost" "$(date)"
          printf "Q%s,%.2f,%.2f\n" "$i" "$cost" "$total_cost" >> "$log_file"
      done
  2. 进入benchmark目录,执行以下命令运行测试。

    sh benchmark.sh <database> 2 <host>
    • <database>:数据库名称。

    • 2:执行轮数,取最后一轮的结果。

    • <host>:数据库主机地址。

附录:TPC-H测试SQL

  • Q1

    select
        l_returnflag,
        l_linestatus,
        sum(l_quantity) as sum_qty,
        sum(l_extendedprice) as sum_base_price,
        sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
        sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
        avg(l_quantity) as avg_qty,
        avg(l_extendedprice) as avg_price,
        avg(l_discount) as avg_disc,
        count(*) as count_order
    from
        lineitem
    where
        l_shipdate <= date '1998-12-01' - interval '93 day'
    group by
        l_returnflag,
        l_linestatus
    order by
        l_returnflag,
        l_linestatus;
  • Q2

    select
        s_acctbal,
        s_name,
        n_name,
        p_partkey,
        p_mfgr,
        s_address,
        s_phone,
        s_comment
    from
        part,
        supplier,
        partsupp,
        nation,
        region
    where
        p_partkey = ps_partkey
        and s_suppkey = ps_suppkey
        and p_size = 23
        and p_type like '%STEEL'
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and r_name = 'EUROPE'
        and ps_supplycost = (
            select
                min(ps_supplycost)
            from
                partsupp,
                supplier,
                nation,
                region
            where
                p_partkey = ps_partkey
                and s_suppkey = ps_suppkey
                and s_nationkey = n_nationkey
                and n_regionkey = r_regionkey
                and r_name = 'EUROPE'
        )
    order by
        s_acctbal desc,
        n_name,
        s_name,
        p_partkey
    limit 100;
  • Q3

    select
        l_orderkey,
        sum(l_extendedprice * (1 - l_discount)) as revenue,
        o_orderdate,
        o_shippriority
    from
        customer,
        orders,
        lineitem
    where
        c_mktsegment = 'MACHINERY'
        and c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate < date '1995-03-24'
        and l_shipdate > date '1995-03-24'
    group by
        l_orderkey,
        o_orderdate,
        o_shippriority
    order by
        revenue desc,
        o_orderdate
    limit 10;
  • Q4

    select
        o_orderpriority,
        count(*) as order_count
    from
        orders
    where
        o_orderdate >= date '1996-08-01'
        and o_orderdate < date '1996-08-01' + interval '3' month
        and exists (
            select
                *
            from
                lineitem
            where
                l_orderkey = o_orderkey
                and l_commitdate < l_receiptdate
        )
    group by
        o_orderpriority
    order by
        o_orderpriority;
  • Q5

    select
        n_name,
        sum(l_extendedprice * (1 - l_discount)) as revenue
    from
        customer,
        orders,
        lineitem,
        supplier,
        nation,
        region
    where
        c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and l_suppkey = s_suppkey
        and c_nationkey = s_nationkey
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and r_name = 'MIDDLE EAST'
        and o_orderdate >= date '1994-01-01'
        and o_orderdate < date '1994-01-01' + interval '1' year
    group by
        n_name
    order by
        revenue desc;
  • Q6

    select
        sum(l_extendedprice * l_discount) as revenue
    from
        lineitem
    where
        l_shipdate >= date '1994-01-01'
        and l_shipdate < date '1994-01-01' + interval '1' year
        and l_discount between 0.06 - 0.01 and 0.06 + 0.01
        and l_quantity < 24;
  • Q7

    select
        supp_nation,
        cust_nation,
        l_year,
        sum(volume) as revenue
    from
        (
            select
                n1.n_name as supp_nation,
                n2.n_name as cust_nation,
                extract(year from l_shipdate) as l_year,
                l_extendedprice * (1 - l_discount) as volume
            from
                supplier,
                lineitem,
                orders,
                customer,
                nation n1,
                nation n2
            where
                s_suppkey = l_suppkey
                and o_orderkey = l_orderkey
                and c_custkey = o_custkey
                and s_nationkey = n1.n_nationkey
                and c_nationkey = n2.n_nationkey
                and (
                    (n1.n_name = 'JORDAN' and n2.n_name = 'INDONESIA')
                    or (n1.n_name = 'INDONESIA' and n2.n_name = 'JORDAN')
                )
                and l_shipdate between date '1995-01-01' and date '1996-12-31'
        ) as shipping
    group by
        supp_nation,
        cust_nation,
        l_year
    order by
        supp_nation,
        cust_nation,
        l_year;
  • Q8

    select
        o_year,
        sum(case
            when nation = 'INDONESIA' then volume
            else 0
        end) / sum(volume) as mkt_share
    from
        (
            select
                extract(year from o_orderdate) as o_year,
                l_extendedprice * (1 - l_discount) as volume,
                n2.n_name as nation
            from
                part,
                supplier,
                lineitem,
                orders,
                customer,
                nation n1,
                nation n2,
                region
            where
                p_partkey = l_partkey
                and s_suppkey = l_suppkey
                and l_orderkey = o_orderkey
                and o_custkey = c_custkey
                and c_nationkey = n1.n_nationkey
                and n1.n_regionkey = r_regionkey
                and r_name = 'ASIA'
                and s_nationkey = n2.n_nationkey
                and o_orderdate between date '1995-01-01' and date '1996-12-31'
                and p_type = 'STANDARD BRUSHED BRASS'
        ) as all_nations
    group by
        o_year
    order by
        o_year;
  • Q9

    select
        nation,
        o_year,
        sum(amount) as sum_profit
    from
        (
            select
                n_name as nation,
                extract(year from o_orderdate) as o_year,
                l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
            from
                part,
                supplier,
                lineitem,
                partsupp,
                orders,
                nation
            where
                s_suppkey = l_suppkey
                and ps_suppkey = l_suppkey
                and ps_partkey = l_partkey
                and p_partkey = l_partkey
                and o_orderkey = l_orderkey
                and s_nationkey = n_nationkey
                and p_name like '%chartreuse%'
        ) as profit
    group by
        nation,
        o_year
    order by
        nation,
        o_year desc;
  • Q10

    select
        c_custkey,
        c_name,
        sum(l_extendedprice * (1 - l_discount)) as revenue,
        c_acctbal,
        n_name,
        c_address,
        c_phone,
        c_comment
    from
        customer,
        orders,
        lineitem,
        nation
    where
        c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate >= date '1994-08-01'
        and o_orderdate < date '1994-08-01' + interval '3' month
        and l_returnflag = 'R'
        and c_nationkey = n_nationkey
    group by
        c_custkey,
        c_name,
        c_acctbal,
        c_phone,
        n_name,
        c_address,
        c_comment
    order by
        revenue desc
    limit 20;
  • Q11

    select
        ps_partkey,
        sum(ps_supplycost * ps_availqty) as value
    from
        partsupp,
        supplier,
        nation
    where
        ps_suppkey = s_suppkey
        and s_nationkey = n_nationkey
        and n_name = 'INDONESIA'
    group by
        ps_partkey having
            sum(ps_supplycost * ps_availqty) > (
                select
                    sum(ps_supplycost * ps_availqty) * 0.0001000000
                from
                    partsupp,
                    supplier,
                    nation
                where
                    ps_suppkey = s_suppkey
                    and s_nationkey = n_nationkey
                    and n_name = 'INDONESIA'
            )
    order by
        value desc;
  • Q12

    select
        l_shipmode,
        sum(case
            when o_orderpriority = '1-URGENT'
                or o_orderpriority = '2-HIGH'
                then 1
            else 0
        end) as high_line_count,
        sum(case
            when o_orderpriority <> '1-URGENT'
                and o_orderpriority <> '2-HIGH'
                then 1
            else 0
        end) as low_line_count
    from
        orders,
        lineitem
    where
        o_orderkey = l_orderkey
        and l_shipmode in ('REG AIR', 'TRUCK')
        and l_commitdate < l_receiptdate
        and l_shipdate < l_commitdate
        and l_receiptdate >= date '1994-01-01'
        and l_receiptdate < date '1994-01-01' + interval '1' year
    group by
        l_shipmode
    order by
        l_shipmode;
  • Q13

    select
        c_count,
        count(*) as custdist
    from
        (
            select
                c_custkey,
                count(o_orderkey)
            from
                customer left outer join orders on
                    c_custkey = o_custkey
                    and o_comment not like '%pending%requests%'
            group by
                c_custkey
        ) as c_orders (c_custkey, c_count)
    group by
        c_count
    order by
        custdist desc,
        c_count desc;
  • Q14

    select
        100.00 * sum(case
            when p_type like 'PROMO%'
                then l_extendedprice * (1 - l_discount)
            else 0
        end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
    from
        lineitem,
        part
    where
        l_partkey = p_partkey
        and l_shipdate >= date '1994-11-01'
        and l_shipdate < date '1994-11-01' + interval '1' month;
  • Q15

    create view revenue0 (supplier_no, total_revenue) as
        select
            l_suppkey,
            sum(l_extendedprice * (1 - l_discount))
        from
            lineitem
        where
            l_shipdate >= date '1997-10-01'
            and l_shipdate < date '1997-10-01' + interval '3' month
        group by
            l_suppkey;
    select
        s_suppkey,
        s_name,
        s_address,
        s_phone,
        total_revenue
    from
        supplier,
        revenue0
    where
        s_suppkey = supplier_no
        and total_revenue = (
            select
                max(total_revenue)
            from
                revenue0
        )
    order by
        s_suppkey;
    drop view revenue0;
  • Q16

    select
        p_brand,
        p_type,
        p_size,
        count(distinct ps_suppkey) as supplier_cnt
    from
        partsupp,
        part
    where
        p_partkey = ps_partkey
        and p_brand <> 'Brand#44'
        and p_type not like 'SMALL BURNISHED%'
        and p_size in (36, 27, 34, 45, 11, 6, 25, 16)
        and ps_suppkey not in (
            select
                s_suppkey
            from
                supplier
            where
                s_comment like '%Customer%Complaints%'
        )
    group by
        p_brand,
        p_type,
        p_size
    order by
        supplier_cnt desc,
        p_brand,
        p_type,
        p_size;
  • Q17

    select
        sum(l_extendedprice) / 7.0 as avg_yearly
    from
        lineitem,
        part
    where
        p_partkey = l_partkey
        and p_brand = 'Brand#42'
        and p_container = 'JUMBO PACK'
        and l_quantity < (
            select
                0.2 * avg(l_quantity)
            from
                lineitem
            where
                l_partkey = p_partkey
        );
  • Q18

    select
        c_name,
        c_custkey,
        o_orderkey,
        o_orderdate,
        o_totalprice,
        sum(l_quantity)
    from
        customer,
        orders,
        lineitem
    where
        o_orderkey in (
            select
                l_orderkey
            from
                lineitem
            group by
                l_orderkey having
                    sum(l_quantity) > 312
        )
        and c_custkey = o_custkey
        and o_orderkey = l_orderkey
    group by
        c_name,
        c_custkey,
        o_orderkey,
        o_orderdate,
        o_totalprice
    order by
        o_totalprice desc,
        o_orderdate
    limit 100;
  • Q19

    select
        sum(l_extendedprice* (1 - l_discount)) as revenue
    from
        lineitem,
        part
    where
        (
            p_partkey = l_partkey
            and p_brand = 'Brand#43'
            and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
            and l_quantity >= 5 and l_quantity <= 5 + 10
            and p_size between 1 and 5
            and l_shipmode in ('AIR', 'AIR REG')
            and l_shipinstruct = 'DELIVER IN PERSON'
        )
        or
        (
            p_partkey = l_partkey
            and p_brand = 'Brand#45'
            and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
            and l_quantity >= 12 and l_quantity <= 12 + 10
            and p_size between 1 and 10
            and l_shipmode in ('AIR', 'AIR REG')
            and l_shipinstruct = 'DELIVER IN PERSON'
        )
        or
        (
            p_partkey = l_partkey
            and p_brand = 'Brand#11'
            and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
            and l_quantity >= 24 and l_quantity <= 24 + 10
            and p_size between 1 and 15
            and l_shipmode in ('AIR', 'AIR REG')
            and l_shipinstruct = 'DELIVER IN PERSON'
        );
  • Q20

    select
        s_name,
        s_address
    from
        supplier,
        nation
    where
        s_suppkey in (
            select
                ps_suppkey
            from
                partsupp
            where
                ps_partkey in (
                    select
                        p_partkey
                    from
                        part
                    where
                        p_name like 'magenta%'
                )
                and ps_availqty > (
                    select
                        0.5 * sum(l_quantity)
                    from
                        lineitem
                    where
                        l_partkey = ps_partkey
                        and l_suppkey = ps_suppkey
                        and l_shipdate >= date '1996-01-01'
                        and l_shipdate < date '1996-01-01' + interval '1' year
                )
        )
        and s_nationkey = n_nationkey
        and n_name = 'RUSSIA'
    order by
        s_name;
  • Q21

    select
        s_name,
        count(*) as numwait
    from
        supplier,
        lineitem l1,
        orders,
        nation
    where
        s_suppkey = l1.l_suppkey
        and o_orderkey = l1.l_orderkey
        and o_orderstatus = 'F'
        and l1.l_receiptdate > l1.l_commitdate
        and exists (
            select
                *
            from
                lineitem l2
            where
                l2.l_orderkey = l1.l_orderkey
                and l2.l_suppkey <> l1.l_suppkey
        )
        and not exists (
            select
                *
            from
                lineitem l3
            where
                l3.l_orderkey = l1.l_orderkey
                and l3.l_suppkey <> l1.l_suppkey
                and l3.l_receiptdate > l3.l_commitdate
        )
        and s_nationkey = n_nationkey
        and n_name = 'MOZAMBIQUE'
    group by
        s_name
    order by
        numwait desc,
        s_name
    limit 100;
  • Q22

    select
        cntrycode,
        count(*) as numcust,
        sum(c_acctbal) as totacctbal
    from
        (
            select
                substring(c_phone from 1 for 2) as cntrycode,
                c_acctbal
            from
                customer
            where
                substring(c_phone from 1 for 2) in ('13', '31', '23', '29', '30', '18', '17')
                and c_acctbal > (
                    select
                        avg(c_acctbal)
                    from
                        customer
                    where
                        c_acctbal > 0.00
                        and substring(c_phone from 1 for 2) in ('13', '31', '23', '29', '30', '18', '17')
                )
                and not exists (
                    select
                        *
                    from
                        orders
                    where
                        o_custkey = c_custkey
                )
        ) as custsale
    group by
        cntrycode
    order by
        cntrycode;