文档

7.0版TPC-H性能测试

更新时间:

本文介绍如何通过TPC-H测试云原生数据仓库 AnalyticDB PostgreSQL 版7.0版的向量分析性能。

TPC-H简介

以下文字描述引用自TPC Benchmark™ H (TPC-H)规范:

“TPC-H是一个决策支持基准,由一套面向业务的临时查询和并发数据修改组成。选择的查询和填充数据库的数据具有广泛的行业相关性。该基准测试说明了决策支持系统可以检查大量数据,执行高度复杂的查询,并解答关键的业务问题。”

详情请参见TPCH Specification

说明

本文的TPC-H的实现基于TPC-H的基准测试,并不能与已发布的TPC-H基准测试结果相比较,本文中的测试并不符合TPC-H基准测试的所有要求。

性能测试结果

本次测试基于TPC-H 100GB的数据集,运行该Benchmark下的22条标准测试SQL。本文在开启自研计算引擎Laser和Greenplum原生计算引擎两种场景下进行了测试,两种场景的测试性能结合下图。通过测试结果可以看到,Laser计算引擎对比原生计算引擎带来了显著的性能提升,总执行时间仅原生计算引擎的五分之一左右。image.png

Query

AnalyticDB PostgreSQL 7.0

(单位为s)

Greenplum

(单位为s)

Q1

11.59

26.34

Q2

2.52

6.07

Q3

2.88

17.05

Q4

3.16

39.1

Q5

3.97

24.49

Q6

0.61

1.56

Q7

2.64

20.66

Q8

3.23

23.16

Q9

13.91

63.56

Q10

4.81

13.92

Q11

1.66

3.82

Q12

6.42

13.56

Q13

8.82

21.55

Q14

0.78

1.7

Q15

1.9

3.86

Q16

1.65

4.25

Q17

3.69

107.32

Q18

14.51

85.78

Q19

8.66

16.08

Q20

2.23

25.75

Q21

23.68

49.78

Q22

4.82

14.49

SUM

128.14

583.85

准备工作

  • 已注册阿里云账号。

  • 已创建用于测试的AnalyticDB PostgreSQL版实例。创建实例,请参见创建实例

    用于测试的AnalyticDB PostgreSQL版实例规格,请参见《AnalyticDB PostgreSQL版实例规格》章节。

  • 已创建用于测试的ECS实例。具体操作,请参见创建实例

    用于测试的ECS实例规格,请参见《ECS实例规格》章节

  • 已开通OSS并创建存储空间

  • 已将ECS实例的IP地址添加到云原生数据仓库 AnalyticDB PostgreSQL 版实例的白名单中,操作方式,请参见设置白名单

  • 已在ECS上安装psql,安装方式,请参见客户端连接

测试环境

AnalyticDB PostgreSQL版实例规格

本文用于测试的AnalyticDB PostgreSQL版实例规格如下:

  • 实例资源类型:存储弹性模式

  • 实例系列:高性能(基础版)

  • 引擎版本:7.0标准版

  • 节点规格(segment):2C8G

  • 节点数量(segment):8

  • 磁盘存储类型:ESSD云盘 PL1

  • 节点存储容量(segment):200 GB

  • 数据库内核小版本:V7.0.2.0及以上

ECS实例规格

本文用于测试的ECS实例规格如下:

  • 实例规格:ecs.g6e.4xlarge

  • 操作系统:CentOS 7.x

  • 系统盘:磁盘类型为ESSD云盘、容量为40 GiB、性能级别为PL1。

  • 数据盘:磁盘类型为ESSD云盘、容量为200 GiB、性能级别为PL3。

    说明

    ECS实例创建完成后需要挂载数据库,具体操作,请参见分区格式化数据盘(Linux)

生成测试数据

以下内容将指导您生成100 GB测试数据,并将测试数据上传至OSS。

  1. 登录ECS实例,登录方式,请参见连接ECS实例

  2. 在ECS上下载并解压dbqgen.tar

    本文中数据盘的路径为/mnt

    # 下载dbqgen文件。
    wget https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20230821/bclo/dbqgen.tar
    
    # 解压dbqgen文件
    tar -xvf dbqgen.tar
  3. 打开dbqgen文件,修改dbgen_data.sh文件。

    1. 进入dbqgen目录。

      cd dbqgen
    2. 编辑dbgen_data.sh文件。

      vim dbgen_data.sh
    3. 按i键进行编辑,将原文件中的./dbgen -s 450 -S $i -C 192 -f &替换为./dbgen -s 100 -S $i -C 32 -f &

      说明

      其中-s参数为数据量,单位为GB;-C为文件块的数量,建议设置为Segment节点数量的整数倍。

    4. 按Esc键退出编辑,输入:wq进行保存。

  4. 生成测试数据,生成的测试数据会在dbqgen目录下。

    sh dbgen_data.sh
  5. 将测试数据上传至OSS。

    如果您是首次使用OSS,建议您查看OSS文档了解ossutil工具的安装与使用方法,具体信息,请参见ossutil概述

    上传数据至OSS的命令如下:

    ./ossutil64 cp -r <测试数据所在的路径> <OSS Bucket路径> --exclude "*.tbl*"

    示例如下:

    ./ossutil64 cp -r /mnt/dbqgen/ oss://adbpg --exclude "*.tbl*"

创建表并导入数据

以下内容将指导您在AnalyticDB PostgreSQL版中创建测试表并将OSS上的测试数据导入测试表中。

  1. 使用psql工具连接AnalyticDB PostgreSQL版实例。安装工具和连接数据库的方法,请参见客户端连接

  2. 安装Fastdecimal插件并启用Laser计算引擎。

    说明

    如需测试Greenplum原生计算引擎的性能,则无需开启向量化引擎Laser。

    --创建Fastdecimal Extension。
    CREATE EXTENSION fastdecimal;
    
    --开启向量化引擎Laser。
    ALTER DATABASE <数据库名称> SET laser.enable = on;
    ALTER DATABASE <数据库名称> SET adbpg_enable_dynamic_join_filter = on;
    
    --调整查询可用内存。
    ALTER DATABASE <数据库名称> SET statement_mem = '16GB';
  3. 创建TPC-H测试表。

    DROP TABLE IF EXISTS NATION;
    CREATE TABLE NATION (
        N_NATIONKEY  INTEGER NOT NULL,
        N_NAME       CHAR(25) NOT NULL,
        N_REGIONKEY  INTEGER NOT NULL,
        N_COMMENT    VARCHAR(152)
    )
    using beam
    DISTRIBUTED Replicated
    ;
    
    DROP TABLE IF EXISTS REGION;
    CREATE TABLE REGION (
        R_REGIONKEY  INTEGER NOT NULL,
        R_NAME       CHAR(25) NOT NULL,
        R_COMMENT    VARCHAR(152)
    )
    using beam
    DISTRIBUTED Replicated
    ;
    
    DROP TABLE IF EXISTS PART;
    CREATE TABLE PART (
        P_PARTKEY     INTEGER NOT NULL,
        P_NAME        VARCHAR(55) NOT NULL,
        P_MFGR        CHAR(25) NOT NULL,
        P_BRAND       CHAR(10) NOT NULL,
        P_TYPE        VARCHAR(25) NOT NULL,
        P_SIZE        INTEGER NOT NULL,
        P_CONTAINER   CHAR(10) NOT NULL,
        P_RETAILPRICE decimal64 NOT NULL,
        P_COMMENT     VARCHAR(23) NOT NULL
    )
    using beam
    DISTRIBUTED BY (P_PARTKEY)
    ;
    
    DROP TABLE IF EXISTS SUPPLIER;
    CREATE TABLE SUPPLIER (
        S_SUPPKEY     INTEGER NOT NULL,
        S_NAME        CHAR(25) NOT NULL,
        S_ADDRESS     VARCHAR(40) NOT NULL,
        S_NATIONKEY   INTEGER NOT NULL,
        S_PHONE       CHAR(15) NOT NULL,
        S_ACCTBAL     decimal64 NOT NULL,
        S_COMMENT     VARCHAR(101) NOT NULL
    )
    using beam
    DISTRIBUTED BY (S_SUPPKEY)
    ;
    
    DROP TABLE IF EXISTS PARTSUPP;
    CREATE TABLE PARTSUPP (
        PS_PARTKEY     INTEGER NOT NULL,
        PS_SUPPKEY     INTEGER NOT NULL,
        PS_AVAILQTY    INTEGER NOT NULL,
        PS_SUPPLYCOST  decimal64  NOT NULL,
        PS_COMMENT     VARCHAR(199) NOT NULL
    )
    using beam
    DISTRIBUTED BY (PS_PARTKEY)
    ;
    
    DROP TABLE IF EXISTS CUSTOMER;
    CREATE TABLE CUSTOMER (
        C_CUSTKEY     INTEGER NOT NULL,
        C_NAME        VARCHAR(25) NOT NULL,
        C_ADDRESS     VARCHAR(40) NOT NULL,
        C_NATIONKEY   INTEGER NOT NULL,
        C_PHONE       CHAR(15) NOT NULL,
        C_ACCTBAL     decimal64   NOT NULL,
        C_MKTSEGMENT  CHAR(10) NOT NULL,
        C_COMMENT     VARCHAR(117) NOT NULL
    )
    using beam
    DISTRIBUTED BY (C_CUSTKEY)
    ;
    
    DROP TABLE IF EXISTS ORDERS;
    CREATE TABLE ORDERS (
        O_ORDERKEY       BIGINT NOT NULL,
        O_CUSTKEY        INTEGER NOT NULL,
        O_ORDERSTATUS    "char" NOT NULL,
        O_TOTALPRICE     decimal64 NOT NULL,
        O_ORDERDATE      DATE NOT NULL,
        O_ORDERPRIORITY  CHAR(15) NOT NULL,
        O_CLERK          CHAR(15) NOT NULL,
        O_SHIPPRIORITY   INTEGER NOT NULL,
        O_COMMENT        VARCHAR(79) NOT NULL
    )
    using beam
    DISTRIBUTED BY (O_ORDERKEY)
    ORDER BY(O_ORDERDATE)
    ;
    
    DROP TABLE IF EXISTS LINEITEM;
    CREATE TABLE LINEITEM (
        L_ORDERKEY    BIGINT NOT NULL,
        L_PARTKEY     INTEGER NOT NULL,
        L_SUPPKEY     INTEGER NOT NULL,
        L_LINENUMBER  INTEGER NOT NULL,
        L_QUANTITY    decimal64 NOT NULL,
        L_EXTENDEDPRICE  decimal64 NOT NULL,
        L_DISCOUNT    decimal64 NOT NULL,
        L_TAX         decimal64 NOT NULL,
        L_RETURNFLAG  "char" NOT NULL,
        L_LINESTATUS  "char" NOT NULL,
        L_SHIPDATE    DATE NOT NULL,
        L_COMMITDATE  DATE NOT NULL,
        L_RECEIPTDATE DATE NOT NULL,
        L_SHIPINSTRUCT CHAR(25) NOT NULL,
        L_SHIPMODE     CHAR(10) NOT NULL,
        L_COMMENT      VARCHAR(44) NOT NULL
    )
    using beam
    DISTRIBUTED BY (L_ORDERKEY)
    ORDER BY(L_SHIPDATE)
    ;
  4. 创建OSS Foreign Table。OSS Foreign Table用于导入OSS上的测试数据,更多关于OSS Foreign Table的使用方法,请参见OSS Foreign Table功能概览

    --创建OSS Server。
    CREATE SERVER oss_serv
    FOREIGN DATA WRAPPER oss_fdw 
    OPTIONS (
      endpoint '<EndPoint>',
      bucket 'Bucket');
    
    --创建User Mapping。
    CREATE USER MAPPING
    FOR PUBLIC SERVER oss_serv 
    OPTIONS (
      id '<AccessKeyID>',
      KEY 'AccessKeySecret');
    
    --创建OSS Foreign Table。
    CREATE FOREIGN TABLE ext_nation(
        N_NATIONKEY  INT, 
        N_NAME       VARCHAR(25), 
        N_REGIONKEY  INTEGER, 
        N_COMMENT    VARCHAR(152)
    ) 
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/nation.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_region(
        R_REGIONKEY INT, 
        R_NAME      CHAR(25),
        R_COMMENT   VARCHAR(152)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/region.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_lineitem ( 
        L_ORDERKEY      BIGINT, 
        L_PARTKEY       BIGINT, 
        L_SUPPKEY       BIGINT, 
        L_LINENUMBER    BIGINT, 
        L_QUANTITY      NUMERIC(15,2), 
        L_EXTENDEDPRICE NUMERIC(15,2), 
        L_DISCOUNT      NUMERIC(15,2), 
        L_TAX           NUMERIC(15,2), 
        L_RETURNFLAG    CHAR(1), 
        L_LINESTATUS    CHAR(1), 
        L_SHIPDATE      DATE, 
        L_COMMITDATE    DATE, 
        L_RECEIPTDATE   DATE, 
        L_SHIPINSTRUCT  CHAR(25), 
        L_SHIPMODE      CHAR(10), 
        L_COMMENT       VARCHAR(44)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/lineitem.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_orders ( 
        O_ORDERKEY      BIGINT, 
        O_CUSTKEY       BIGINT, 
        O_ORDERSTATUS   CHAR(1) , 
        O_TOTALPRICE    NUMERIC(15,2), 
        O_ORDERDATE     DATE, 
        O_ORDERPRIORITY CHAR(15), 
        O_CLERK         CHAR(15), 
        O_SHIPPRIORITY  BIGINT, 
        O_COMMENT       VARCHAR(79)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/orders.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_part ( 
        P_PARTKEY     BIGINT, 
        P_NAME        VARCHAR(55), 
        P_MFGR        CHAR(25), 
        P_BRAND       CHAR(10), 
        P_TYPE        VARCHAR(25), 
        P_SIZE        BIGINT, 
        P_CONTAINER   CHAR(10), 
        P_RETAILPRICE NUMERIC(15,2), 
        P_COMMENT     VARCHAR(23)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/part.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_partsupp ( 
        PS_PARTKEY    BIGINT, 
        PS_SUPPKEY    BIGINT, 
        PS_AVAILQTY   BIGINT, 
        PS_SUPPLYCOST NUMERIC(15,2), 
        PS_COMMENT    VARCHAR(199)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/partsupp.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
    
    CREATE FOREIGN TABLE ext_supplier (
        S_SUPPKEY   BIGINT, 
        S_NAME      CHAR(25), 
        S_ADDRESS   VARCHAR(40), 
        S_NATIONKEY BIGINT, 
        S_PHONE     CHAR(15), 
        S_ACCTBAL   DECIMAL(15,2), 
        S_COMMENT   VARCHAR(101)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/supplier.tbl',
        FORMAT 'text',
        DELIMITER '|'
        );
    
    CREATE FOREIGN TABLE ext_customer ( 
        C_CUSTKEY    BIGINT, 
        C_NAME       VARCHAR(25), 
        C_ADDRESS    VARCHAR(40), 
        C_NATIONKEY  BIGINT, 
        C_PHONE      CHAR(15), 
        C_ACCTBAL    NUMERIC(15,2), 
        C_MKTSEGMENT CHAR(10), 
        C_COMMENT    VARCHAR(117)
    )
    SERVER oss_serv OPTIONS (
        PREFIX '<Dir>/customer.tbl',
        FORMAT 'text',
        DELIMITER '|'
    );
  5. 将OSS上的数据导入至TPC-H测试表中。

    --从OSS Foreign Table导入数据至测试表。
    INSERT INTO nation SELECT * FROM ext_nation;
    INSERT INTO region SELECT * FROM ext_region;
    INSERT INTO orders SELECT * FROM ext_orders;
    INSERT INTO customer SELECT * FROM ext_customer;
    INSERT INTO partsupp SELECT * FROM ext_partsupp;
    INSERT INTO part SELECT * FROM ext_part;
    INSERT INTO supplier SELECT * FROM ext_supplier;
    INSERT INTO lineitem SELECT * FROM ext_lineitem;
    
    --删除OSS Foreign Table。
    DROP FOREIGN TABLE ext_lineitem;
    DROP FOREIGN TABLE ext_orders;
    DROP FOREIGN TABLE ext_customer;
    DROP FOREIGN TABLE ext_part;
    DROP FOREIGN TABLE ext_partsupp;
    DROP FOREIGN TABLE ext_supplier;
    DROP FOREIGN TABLE ext_nation;
    DROP FOREIGN TABLE ext_region;

统计信息收集

执行以下命令,收集统计信息。

ANALYZE (fullscan) lineitem ;
ANALYZE (fullscan) orders   ;
ANALYZE (fullscan) customer ;
ANALYZE (fullscan) supplier ;
ANALYZE (fullscan) part     ;
ANALYZE (fullscan) partsupp ;
ANALYZE (fullscan) region   ;
ANALYZE (fullscan) nation   ;

运行测试查询

-- Q1
SELECT
    l_returnflag,
    l_linestatus,
    sum(l_quantity) AS sum_qty,
    sum(l_extendedprice) AS sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
    avg(l_quantity) AS avg_qty,
    avg(l_extendedprice) AS avg_price,
    avg(l_discount) AS avg_disc,
    COUNT(*) AS count_order
FROM
    lineitem
WHERE
    l_shipdate <= date '1998-12-01' - interval '93 day'
GROUP BY
    l_returnflag,
    l_linestatus
ORDER BY
    l_returnflag,
    l_linestatus;
    
-- Q2
SELECT
    s_acctbal,
    s_name,
    n_name,
    p_partkey,
    p_mfgr,
    s_address,
    s_phone,
    s_comment
FROM
    part,
    supplier,
    partsupp,
    nation,
    region
WHERE
    p_partkey = ps_partkey
    AND s_suppkey = ps_suppkey
    AND p_size = 23
    AND p_type like '%STEEL'
    AND s_nationkey = n_nationkey
    AND n_regionkey = r_regionkey
    AND r_name = 'EUROPE'
    AND ps_supplycost = (
        SELECT
            min(ps_supplycost)
        FROM
            partsupp,
            supplier,
            nation,
            region
        WHERE
            p_partkey = ps_partkey
            AND s_suppkey = ps_suppkey
            AND s_nationkey = n_nationkey
            AND n_regionkey = r_regionkey
            AND r_name = 'EUROPE'
    )
ORDER BY
    s_acctbal DESC,
    n_name,
    s_name,
    p_partkey
LIMIT 100;

-- Q3
SELECT
    l_orderkey,
    sum(l_extendedprice * (1 - l_discount)) as revenue,
    o_orderdate,
    o_shippriority
FROM
    customer,
    orders,
    lineitem
WHERE
    c_mktsegment = 'MACHINERY'
    AND c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND o_orderdate < date '1995-03-24'
    AND l_shipdate > date '1995-03-24'
GROUP BY
    l_orderkey,
    o_orderdate,
    o_shippriority
ORDER BY
    revenue DESC,
    o_orderdate
LIMIT 10;

-- Q4
SELECT
    o_orderpriority,
    COUNT(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= date '1996-08-01'
    AND o_orderdate < date '1996-08-01' + interval '3' month
    AND exists (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
    
-- Q5
SELECT
    n_name,
    sum(l_extendedprice * (1 - l_discount)) as revenue
FROM
    customer,
    orders,
    lineitem,
    supplier,
    nation,
    region
WHERE
    c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND l_suppkey = s_suppkey
    AND c_nationkey = s_nationkey
    AND s_nationkey = n_nationkey
    AND n_regionkey = r_regionkey
    AND r_name = 'MIDDLE EAST'
    AND o_orderdate >= date '1994-01-01'
    AND o_orderdate < date '1994-01-01' + interval '1' year
GROUP BY
    n_name
ORDER BY
    revenue DESC;
    
-- Q6
SELECT
    sum(l_extendedprice * l_discount) as revenue
FROM
    lineitem
WHERE
    l_shipdate >= date '1994-01-01'
    AND l_shipdate < date '1994-01-01' + interval '1' year
    AND l_discount BETWEEN 0.06 - 0.01 AND 0.06 + 0.01
    AND l_quantity < 24;
    
-- Q7
SELECT
    supp_nation,
    cust_nation,
    l_year,
    sum(volume) AS revenue
FROM
    (
        SELECT
            n1.n_name AS supp_nation,
            n2.n_name AS cust_nation,
            extract(year FROM l_shipdate) AS l_year,
            l_extendedprice * (1 - l_discount) AS volume
        FROM
            supplier,
            lineitem,
            orders,
            customer,
            nation n1,
            nation n2
        WHERE
            s_suppkey = l_suppkey
            AND o_orderkey = l_orderkey
            AND c_custkey = o_custkey
            AND s_nationkey = n1.n_nationkey
            AND c_nationkey = n2.n_nationkey
            AND (
                (n1.n_name = 'JORDAN' AND n2.n_name = 'INDONESIA')
                OR (n1.n_name = 'INDONESIA' AND n2.n_name = 'JORDAN')
            )
            AND l_shipdate BETWEEN date '1995-01-01' AND date '1996-12-31'
    ) AS shipping
GROUP BY
    supp_nation,
    cust_nation,
    l_year
ORDER BY
    supp_nation,
    cust_nation,
    l_year;

-- Q8
SELECT
    o_year,
    sum(case
        when nation = 'INDONESIA' then volume
        else 0
    end) / sum(volume) as mkt_share
FROM
    (
        SELECT
            extract(year FROM o_orderdate) AS o_year,
            l_extendedprice * (1 - l_discount) AS volume,
            n2.n_name AS nation
        FROM
            part,
            supplier,
            lineitem,
            orders,
            customer,
            nation n1,
            nation n2,
            region
        WHERE
            p_partkey = l_partkey
            AND s_suppkey = l_suppkey
            AND l_orderkey = o_orderkey
            AND o_custkey = c_custkey
            AND c_nationkey = n1.n_nationkey
            AND n1.n_regionkey = r_regionkey
            AND r_name = 'ASIA'
            AND s_nationkey = n2.n_nationkey
            AND o_orderdate BETWEEN date '1995-01-01' AND date '1996-12-31'
            AND p_type = 'STANDARD BRUSHED BRASS'
    ) AS all_nations
GROUP BY
    o_year
ORDER BY
    o_year;
    
-- Q9
SELECT
    nation,
    o_year,
    sum(amount) AS sum_profit
FROM
    (
        SELECT
            n_name AS nation,
            extract(year FROM o_orderdate) AS o_year,
            l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity AS amount
        FROM
            part,
            supplier,
            lineitem,
            partsupp,
            orders,
            nation
        WHERE
            s_suppkey = l_suppkey
            AND ps_suppkey = l_suppkey
            AND ps_partkey = l_partkey
            AND p_partkey = l_partkey
            AND o_orderkey = l_orderkey
            AND s_nationkey = n_nationkey
            AND p_name LIKE '%chartreuse%'
    ) AS profit
GROUP BY
    nation,
    o_year
ORDER BY
    nation,
    o_year DESC;
    
-- Q10
SELECT
    c_custkey,
    c_name,
    sum(l_extendedprice * (1 - l_discount)) as revenue,
    c_acctbal,
    n_name,
    c_address,
    c_phone,
    c_comment
FROM
    customer,
    orders,
    lineitem,
    nation
WHERE
    c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND o_orderdate >= date '1994-08-01'
    AND o_orderdate < date '1994-08-01' + interval '3' month
    AND l_returnflag = 'R'
    AND c_nationkey = n_nationkey
GROUP BY
    c_custkey,
    c_name,
    c_acctbal,
    c_phone,
    n_name,
    c_address,
    c_comment
ORDER BY
    revenue DESC
LIMIT 20;

-- Q11
SELECT
    ps_partkey,
    sum(ps_supplycost * ps_availqty) as value
FROM
    partsupp,
    supplier,
    nation
WHERE
    ps_suppkey = s_suppkey
    AND s_nationkey = n_nationkey
    AND n_name = 'INDONESIA'
GROUP BY
    ps_partkey having
        sum(ps_supplycost * ps_availqty) > (
            SELECT
                sum(ps_supplycost * ps_availqty) * 0.0001000000
            FROM
                partsupp,
                supplier,
                nation
            WHERE
                ps_suppkey = s_suppkey
                AND s_nationkey = n_nationkey
                AND n_name = 'INDONESIA'
        )
ORDER BY
    value DESC;

-- Q12
select
    l_shipmode,
    sum(case
        when o_orderpriority = '1-URGENT'
            or o_orderpriority = '2-HIGH'
            then 1
        else 0
    end) as high_line_count,
    sum(case
        when o_orderpriority <> '1-URGENT'
            and o_orderpriority <> '2-HIGH'
            then 1
        else 0
    end) as low_line_count
from
    orders,
    lineitem
where
    o_orderkey = l_orderkey
    and l_shipmode in ('REG AIR', 'TRUCK')
    and l_commitdate < l_receiptdate
    and l_shipdate < l_commitdate
    and l_receiptdate >= date '1994-01-01'
    and l_receiptdate < date '1994-01-01' + interval '1' year
group by
    l_shipmode
order by
    l_shipmode;
    
-- Q13
select
    c_count,
    count(*) as custdist
from
    (
        select
            c_custkey,
            count(o_orderkey)
        from
            customer left outer join orders on
                c_custkey = o_custkey
                and o_comment not like '%pending%requests%'
        group by
            c_custkey
    ) as c_orders (c_custkey, c_count)
group by
    c_count
order by
    custdist desc,
    c_count desc;
    
-- Q14
select
    100.00 * sum(case
        when p_type like 'PROMO%'
            then l_extendedprice * (1 - l_discount)
        else 0
    end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from
    lineitem,
    part
where
    l_partkey = p_partkey
    and l_shipdate >= date '1994-11-01'
    and l_shipdate < date '1994-11-01' + interval '1' month;
    
-- Q15
create view revenue0 (supplier_no, total_revenue) as
    select
        l_suppkey,
        sum(l_extendedprice * (1 - l_discount))
    from
        lineitem
    where
        l_shipdate >= date '1997-10-01'
        and l_shipdate < date '1997-10-01' + interval '3' month
    group by
        l_suppkey;
select
    s_suppkey,
    s_name,
    s_address,
    s_phone,
    total_revenue
from
    supplier,
    revenue0
where
    s_suppkey = supplier_no
    and total_revenue = (
        select
            max(total_revenue)
        from
            revenue0
    )
order by
    s_suppkey;
drop view revenue0;

-- Q16
select
    p_brand,
    p_type,
    p_size,
    count(distinct ps_suppkey) as supplier_cnt
from
    partsupp,
    part
where
    p_partkey = ps_partkey
    and p_brand <> 'Brand#44'
    and p_type not like 'SMALL BURNISHED%'
    and p_size in (36, 27, 34, 45, 11, 6, 25, 16)
    and ps_suppkey not in (
        select
            s_suppkey
        from
            supplier
        where
            s_comment like '%Customer%Complaints%'
    )
group by
    p_brand,
    p_type,
    p_size
order by
    supplier_cnt desc,
    p_brand,
    p_type,
    p_size;

-- Q17
select
    sum(l_extendedprice) / 7.0 as avg_yearly
from
    lineitem,
    part
where
    p_partkey = l_partkey
    and p_brand = 'Brand#42'
    and p_container = 'JUMBO PACK'
    and l_quantity < (
        select
            0.2 * avg(l_quantity)
        from
            lineitem
        where
            l_partkey = p_partkey
    );
    
-- Q18
select
    c_name,
    c_custkey,
    o_orderkey,
    o_orderdate,
    o_totalprice,
    sum(l_quantity)
from
    customer,
    orders,
    lineitem
where
    o_orderkey in (
        select
            l_orderkey
        from
            lineitem
        group by
            l_orderkey having
                sum(l_quantity) > 312
    )
    and c_custkey = o_custkey
    and o_orderkey = l_orderkey
group by
    c_name,
    c_custkey,
    o_orderkey,
    o_orderdate,
    o_totalprice
order by
    o_totalprice desc,
    o_orderdate
limit 100;

-- Q19
select
    sum(l_extendedprice* (1 - l_discount)) as revenue
from
    lineitem,
    part
where
    (
        p_partkey = l_partkey
        and p_brand = 'Brand#43'
        and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
        and l_quantity >= 5 and l_quantity <= 5 + 10
        and p_size between 1 and 5
        and l_shipmode in ('AIR', 'AIR REG')
        and l_shipinstruct = 'DELIVER IN PERSON'
    )
    or
    (
        p_partkey = l_partkey
        and p_brand = 'Brand#45'
        and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
        and l_quantity >= 12 and l_quantity <= 12 + 10
        and p_size between 1 and 10
        and l_shipmode in ('AIR', 'AIR REG')
        and l_shipinstruct = 'DELIVER IN PERSON'
    )
    or
    (
        p_partkey = l_partkey
        and p_brand = 'Brand#11'
        and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
        and l_quantity >= 24 and l_quantity <= 24 + 10
        and p_size between 1 and 15
        and l_shipmode in ('AIR', 'AIR REG')
        and l_shipinstruct = 'DELIVER IN PERSON'
    );

-- Q20
select
    s_name,
    s_address
from
    supplier,
    nation
where
    s_suppkey in (
        select
            ps_suppkey
        from
            partsupp
        where
            ps_partkey in (
                select
                    p_partkey
                from
                    part
                where
                    p_name like 'magenta%'
            )
            and ps_availqty > (
                select
                    0.5 * sum(l_quantity)
                from
                    lineitem
                where
                    l_partkey = ps_partkey
                    and l_suppkey = ps_suppkey
                    and l_shipdate >= date '1996-01-01'
                    and l_shipdate < date '1996-01-01' + interval '1' year
            )
    )
    and s_nationkey = n_nationkey
    and n_name = 'RUSSIA'
order by
    s_name;

-- Q21
select
    s_name,
    count(*) as numwait
from
    supplier,
    lineitem l1,
    orders,
    nation
where
    s_suppkey = l1.l_suppkey
    and o_orderkey = l1.l_orderkey
    and o_orderstatus = 'F'
    and l1.l_receiptdate > l1.l_commitdate
    and exists (
        select
            *
        from
            lineitem l2
        where
            l2.l_orderkey = l1.l_orderkey
            and l2.l_suppkey <> l1.l_suppkey
    )
    and not exists (
        select
            *
        from
            lineitem l3
        where
            l3.l_orderkey = l1.l_orderkey
            and l3.l_suppkey <> l1.l_suppkey
            and l3.l_receiptdate > l3.l_commitdate
    )
    and s_nationkey = n_nationkey
    and n_name = 'MOZAMBIQUE'
group by
    s_name
order by
    numwait desc,
    s_name
limit 100;

-- Q22
select
        cntrycode,
        count(*) as numcust,
        sum(c_acctbal) as totacctbal
from
        (
            select
                substring(c_phone from 1 for 2) as cntrycode,
                c_acctbal
            from
                customer
            where
                substring(c_phone from 1 for 2) in ('13', '31', '23', '29', '30', '18', '17')
                and c_acctbal > (
                    select
                        avg(c_acctbal)
                    from
                        customer
                    where
                        c_acctbal > 0.00
                        and substring(c_phone from 1 for 2) in ('13', '31', '23', '29', '30', '18', '17')
                )
                and not exists (
                    select
                        *
                    from
                        orders
                    where
                        o_custkey = c_custkey
                )
        ) as custsale
group by
        cntrycode
order by
        cntrycode;