TPC-DS Benchmark性能测试

更新时间:
复制为 MD 格式

本文介绍如何使用TPC-DS基准测试对云原生数据仓库 AnalyticDB PostgreSQL 版进行性能测试,并提供测试结果供参考。

TPC-DS简介

TPC-DS是由TPC(Transaction Processing Performance Council)组织发布的决策支持系统基准测试,对决策支持系统的查询和数据维护等方面进行建模。该基准测试用于评估通用决策支持系统在受控、复杂、多用户工作负载下的查询响应时间、查询吞吐量和数据维护性能。更多信息,请参见TPC Benchmark DS(TPC-DS)

测试说明

  • 本文以100 GB数据规模为例进行测试,推荐使用8segment节点(每节点2C16G)。

  • 如需测试1 TB数据规模,推荐使用48Segment节点(每节点2C16G),通过更多的计算节点发挥MPP架构优势。

测试结果

基于TPC-DS 100 GB数据集,运行99条标准测试SQL,详细的测试数据如下表所示。

Query

执行时间(秒)

Q1

1.11

Q2

3.67

Q3

0.28

Q4

8.46

Q5

0.85

Q6

0.28

Q7

0.58

Q8

0.33

Q9

4.88

Q10

0.43

Q11

5.79

Q12

0.16

Q13

1.24

Q14

4.69

Q15

0.38

Q16

2.34

Q17

0.82

Q18

0.93

Q19

0.33

Q20

0.16

Q21

0.22

Q22

4.5

Q23

12.97

Q24

1.64

Q25

0.82

Q26

0.36

Q27

0.71

Q28

4.09

Q29

0.95

Q30

0.48

Q31

1.78

Q32

0.18

Q33

1.22

Q34

0.9

Q35

1.46

Q36

2.47

Q37

0.31

Q38

2.48

Q39

1.96

Q40

0.44

Q41

0.14

Q42

0.17

Q43

1.36

Q44

1.87

Q45

0.57

Q46

2.16

Q47

2.69

Q48

0.96

Q49

0.73

Q50

1.41

Q51

4.04

Q52

0.17

Q53

0.38

Q54

1.3

Q55

0.16

Q56

1.43

Q57

2.25

Q58

0.8

Q59

6.32

Q60

1.56

Q61

1

Q62

0.69

Q63

0.38

Q64

3.99

Q65

3.06

Q66

0.85

Q67

37.24

Q68

0.83

Q69

0.48

Q70

1.69

Q71

0.59

Q72

2.12

Q73

0.54

Q74

3.78

Q75

3.45

Q76

1.5

Q77

0.78

Q78

5.93

Q79

2

Q80

1.22

Q81

0.49

Q82

0.47

Q83

0.55

Q84

0.26

Q85

1.35

Q86

0.78

Q87

2.89

Q88

5.19

Q89

0.5

Q90

0.49

Q91

0.37

Q92

0.15

Q93

1.31

Q94

1.52

Q95

6.88

Q96

0.97

Q97

1.93

Q98

0.58

Q99

1.12

合计

208.04

前提条件

测试步骤

步骤一:创建实例

  1. 创建用于测试的AnalyticDB PostgreSQL实例。具体操作,请参见创建实例

    本文中100 GB测试所使用的实例规格如下:

    配置项

    配置值

    实例资源类型

    存储弹性模式

    实例系列

    高可用

    引擎版本

    7.0标准版

    节点规格(segment)

    2C16G

    节点数量

    8

    存储磁盘类型

    ESSD云盘PL1

    节点存储容量

    200 GB

    数据库内核版本

    v7.4.2.1及以上

    说明

    如测试100 GB,推荐8个节点(每节点2C16G);如测试1 TB,推荐48个节点(每节点2C16G)。

  2. ECS实例的IP地址添加到AnalyticDB PostgreSQL实例的白名单中。

步骤二:生成测试数据

  1. 下载TPC官方提供的tpcds-kit,并完成工具编译。

    tar -xvf tpcds-kit.tar
    cd tpcds-kit/tools
    #编译需要用到yacclex,若没有需安装yaccflex
    # sudo yum install byacc flex
    make clean
    make OS=LINUX 
  2. 使用如下脚本并行生成TPC-DS 100 GB数据。

    重要
    • SCALE参数为生成的TPC-DS数据集大小,1对应1 GB,100100 GB。

    • PARALLEL参数为数据切分的文件数量,强烈建议设置为集segment(计算节点)数量的整数倍。例如100 GB测试推荐设置为16,1 TB测试推荐设置为96。

    for((i=1;i<=16;i++));
    do
    ./dsdgen -TERMINATE N -SCALE 100 -PARALLEL 16 -CHILD $i &
    done
    
    wait
  3. 将生成的数据上传至OSS。

    如果您是第一次使用OSS,请参见命令行工具ossutil 1.0完成ossutil命令行工具的安装与配置,然后执行以下命令将数据上传至OSS:

    ./ossutil64 cp -r <测试数据所在的路径> <OSS Bucket路径> --exclude "*.tbl*"

    示例如下:

    说明

    需要单独为customer表建立一个独立文件夹。

    ./ossutil64 cp -r /mnt/dbqgen/ oss://adbpg --exclude "*.tbl*"

步骤三:创建数据表

create table call_center
(
    cc_call_center_sk         integer               not null,
    cc_call_center_id         char(16)              not null,
    cc_rec_start_date         date                          ,
    cc_rec_end_date           date                          ,
    cc_closed_date_sk         integer                       ,
    cc_open_date_sk           integer                       ,
    cc_name                   varchar(50)                   ,
    cc_class                  varchar(50)                   ,
    cc_employees              integer                       ,
    cc_sq_ft                  integer                       ,
    cc_hours                  char(20)                      ,
    cc_manager                varchar(40)                   ,
    cc_mkt_id                 integer                       ,
    cc_mkt_class              char(50)                      ,
    cc_mkt_desc               varchar(100)                  ,
    cc_market_manager         varchar(40)                   ,
    cc_division               integer                       ,
    cc_division_name          varchar(50)                   ,
    cc_company                integer                       ,
    cc_company_name           char(50)                      ,
    cc_street_number          char(10)                      ,
    cc_street_name            varchar(60)                   ,
    cc_street_type            char(15)                      ,
    cc_suite_number           char(10)                      ,
    cc_city                   varchar(60)                   ,
    cc_county                 varchar(30)                   ,
    cc_state                  char(2)                       ,
    cc_zip                    char(10)                      ,
    cc_country                varchar(20)                   ,
    cc_gmt_offset             fixnumeric(19)                     ,
    cc_tax_percentage         fixnumeric(19)                     
)
using beam
with(compresstype=auto,compresslevel=1)
distributed Replicated;

create table catalog_page
(
    cp_catalog_page_sk        integer               not null,
    cp_catalog_page_id        char(16)              not null,
    cp_start_date_sk          integer                       ,
    cp_end_date_sk            integer                       ,
    cp_department             varchar(50)                   ,
    cp_catalog_number         integer                       ,
    cp_catalog_page_number    integer                       ,
    cp_description            varchar(100)                  ,
    cp_type                   varchar(100)                  
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table customer
(
    c_customer_sk             integer               not null,
    c_customer_id             char(16)              not null,
    c_current_cdemo_sk        integer                       ,
    c_current_hdemo_sk        integer                       ,
    c_current_addr_sk         integer                       ,
    c_first_shipto_date_sk    integer                       ,
    c_first_sales_date_sk     integer                       ,
    c_salutation              char(10)                      ,
    c_first_name              char(20)                      ,
    c_last_name               char(30)                      ,
    c_preferred_cust_flag     char(1)                       ,
    c_birth_day               integer                       ,
    c_birth_month             integer                       ,
    c_birth_year              integer                       ,
    c_birth_country           varchar(20)                   ,
    c_login                   char(13)                      ,
    c_email_address           char(50)                      ,
    c_last_review_date        char(10)                      
)
using beam
with(compresstype=auto,compresslevel=1)
distributed by (c_customer_sk);

create table customer_address
(
    ca_address_sk             integer               not null,
    ca_address_id             char(16)              not null,
    ca_street_number          char(10)                      ,
    ca_street_name            varchar(60)                   ,
    ca_street_type            char(15)                      ,
    ca_suite_number           char(10)                      ,
    ca_city                   varchar(60)                   ,
    ca_county                 varchar(30)                   ,
    ca_state                  char(2)                       ,
    ca_zip                    char(10)                      ,
    ca_country                varchar(20)                   ,
    ca_gmt_offset             fixnumeric(19)                     ,
    ca_location_type          char(20)                      
)
using beam
with(compresstype=auto,compresslevel=1)
distributed by (ca_address_sk);

create table customer_demographics
(
    cd_demo_sk                integer               not null,
    cd_gender                 char(1)                       ,
    cd_marital_status         char(1)                       ,
    cd_education_status       char(20)                      ,
    cd_purchase_estimate      integer                       ,
    cd_credit_rating          char(10)                      ,
    cd_dep_count              integer                       ,
    cd_dep_employed_count     integer                       ,
    cd_dep_college_count      integer                       
)
using beam
with(compresstype=auto,compresslevel=1)
distributed by (cd_demo_sk);

create table date_dim
(
    d_date_sk                 integer               not null,
    d_date_id                 char(16)              not null,
    d_date                    date                          ,
    d_month_seq               integer                       ,
    d_week_seq                integer                       ,
    d_quarter_seq             integer                       ,
    d_year                    integer                       ,
    d_dow                     integer                       ,
    d_moy                     integer                       ,
    d_dom                     integer                       ,
    d_qoy                     integer                       ,
    d_fy_year                 integer                       ,
    d_fy_quarter_seq          integer                       ,
    d_fy_week_seq             integer                       ,
    d_day_name                char(9)                       ,
    d_quarter_name            char(6)                       ,
    d_holiday                 char(1)                       ,
    d_weekend                 char(1)                       ,
    d_following_holiday       char(1)                       ,
    d_first_dom               integer                       ,
    d_last_dom                integer                       ,
    d_same_day_ly             integer                       ,
    d_same_day_lq             integer                       ,
    d_current_day             char(1)                       ,
    d_current_week            char(1)                       ,
    d_current_month           char(1)                       ,
    d_current_quarter         char(1)                       ,
    d_current_year            char(1)                       
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table household_demographics
(
    hd_demo_sk                integer               not null,
    hd_income_band_sk         integer                       ,
    hd_buy_potential          char(15)                      ,
    hd_dep_count              integer                       ,
    hd_vehicle_count          integer                       
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table income_band
(
    ib_income_band_sk         integer               not null,
    ib_lower_bound            integer                       ,
    ib_upper_bound            integer                       
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table inventory
(
    inv_date_sk               integer               not null,
    inv_item_sk               integer               not null,
    inv_warehouse_sk          integer               not null,
    inv_quantity_on_hand      integer                       
)
using beam
with(compresstype=auto,compresslevel=1)
distributed by (inv_item_sk)
partition by range (inv_date_sk)
(
    start(2450815) INCLUSIVE end(2453005) INCLUSIVE every (100),
    default partition others
);

create table item
(
    i_item_sk                 integer               not null,
    i_item_id                 char(16)              not null,
    i_rec_start_date          date                          ,
    i_rec_end_date            date                          ,
    i_item_desc               varchar(200)                  ,
    i_current_price           fixnumeric(19)                     ,
    i_wholesale_cost          fixnumeric(19)                     ,
    i_brand_id                integer                       ,
    i_brand                   char(50)                      ,
    i_class_id                integer                       ,
    i_class                   char(50)                      ,
    i_category_id             integer                       ,
    i_category                char(50)                      ,
    i_manufact_id             integer                       ,
    i_manufact                char(50)                      ,
    i_size                    char(20)                      ,
    i_formulation             char(20)                      ,
    i_color                   char(20)                      ,
    i_units                   char(10)                      ,
    i_container               char(10)                      ,
    i_manager_id              integer                       ,
    i_product_name            char(50)                      
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table promotion
(
    p_promo_sk                integer               not null,
    p_promo_id                char(16)              not null,
    p_start_date_sk           integer                       ,
    p_end_date_sk             integer                       ,
    p_item_sk                 integer                       ,
    p_cost                    fixnumeric(19)                     ,
    p_response_target         integer                       ,
    p_promo_name              char(50)                      ,
    p_channel_dmail           char(1)                       ,
    p_channel_email           char(1)                       ,
    p_channel_catalog         char(1)                       ,
    p_channel_tv              char(1)                       ,
    p_channel_radio           char(1)                       ,
    p_channel_press           char(1)                       ,
    p_channel_event           char(1)                       ,
    p_channel_demo            char(1)                       ,
    p_channel_details         varchar(100)                  ,
    p_purpose                 char(15)                      ,
    p_discount_active         char(1)                       
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table reason
(
    r_reason_sk               integer               not null,
    r_reason_id               char(16)              not null,
    r_reason_desc             char(100)                     
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table ship_mode
(
    sm_ship_mode_sk           integer               not null,
    sm_ship_mode_id           char(16)              not null,
    sm_type                   char(30)                      ,
    sm_code                   char(10)                      ,
    sm_carrier                char(20)                      ,
    sm_contract               char(20)                      
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table store
(
    s_store_sk                integer               not null,
    s_store_id                char(16)              not null,
    s_rec_start_date          date                          ,
    s_rec_end_date            date                          ,
    s_closed_date_sk          integer                       ,
    s_store_name              varchar(50)                   ,
    s_number_employees        integer                       ,
    s_floor_space             integer                       ,
    s_hours                   char(20)                      ,
    s_manager                 varchar(40)                   ,
    s_market_id               integer                       ,
    s_geography_class         varchar(100)                  ,
    s_market_desc             varchar(100)                  ,
    s_market_manager          varchar(40)                   ,
    s_division_id             integer                       ,
    s_division_name           varchar(50)                   ,
    s_company_id              integer                       ,
    s_company_name            varchar(50)                   ,
    s_street_number           varchar(10)                   ,
    s_street_name             varchar(60)                   ,
    s_street_type             char(15)                      ,
    s_suite_number            char(10)                      ,
    s_city                    varchar(60)                   ,
    s_county                  varchar(30)                   ,
    s_state                   char(2)                       ,
    s_zip                     char(10)                      ,
    s_country                 varchar(20)                   ,
    s_gmt_offset              fixnumeric(19)                     ,
    s_tax_precentage          fixnumeric(19)                     
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table time_dim
(
    t_time_sk                 integer               not null,
    t_time_id                 char(16)              not null,
    t_time                    integer                       ,
    t_hour                    integer                       ,
    t_minute                  integer                       ,
    t_second                  integer                       ,
    t_am_pm                   char(2)                       ,
    t_shift                   char(20)                      ,
    t_sub_shift               char(20)                      ,
    t_meal_time               char(20)                      
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table warehouse
(
    w_warehouse_sk            integer               not null,
    w_warehouse_id            char(16)              not null,
    w_warehouse_name          varchar(20)                   ,
    w_warehouse_sq_ft         integer                       ,
    w_street_number           char(10)                      ,
    w_street_name             varchar(60)                   ,
    w_street_type             char(15)                      ,
    w_suite_number            char(10)                      ,
    w_city                    varchar(60)                   ,
    w_county                  varchar(30)                   ,
    w_state                   char(2)                       ,
    w_zip                     char(10)                      ,
    w_country                 varchar(20)                   ,
    w_gmt_offset              fixnumeric(19)                  
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table web_page
(
    wp_web_page_sk            integer               not null,
    wp_web_page_id            char(16)              not null,
    wp_rec_start_date         date                          ,
    wp_rec_end_date           date                          ,
    wp_creation_date_sk       integer                       ,
    wp_access_date_sk         integer                       ,
    wp_autogen_flag           char(1)                       ,
    wp_customer_sk            integer                       ,
    wp_url                    varchar(100)                  ,
    wp_type                   char(50)                      ,
    wp_char_count             integer                       ,
    wp_link_count             integer                       ,
    wp_image_count            integer                       ,
    wp_max_ad_count           integer                       
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table web_site
(
    web_site_sk               integer               not null,
    web_site_id               char(16)              not null,
    web_rec_start_date        date                          ,
    web_rec_end_date          date                          ,
    web_name                  varchar(50)                   ,
    web_open_date_sk          integer                       ,
    web_close_date_sk         integer                       ,
    web_class                 varchar(50)                   ,
    web_manager               varchar(40)                   ,
    web_mkt_id                integer                       ,
    web_mkt_class             varchar(50)                   ,
    web_mkt_desc              varchar(100)                  ,
    web_market_manager        varchar(40)                   ,
    web_company_id            integer                       ,
    web_company_name          char(50)                      ,
    web_street_number         char(10)                      ,
    web_street_name           varchar(60)                   ,
    web_street_type           char(15)                      ,
    web_suite_number          char(10)                      ,
    web_city                  varchar(60)                   ,
    web_county                varchar(30)                   ,
    web_state                 char(2)                       ,
    web_zip                   char(10)                      ,
    web_country               varchar(20)                   ,
    web_gmt_offset            fixnumeric(19)                     ,
    web_tax_percentage        fixnumeric(19)                     
)
using beam
with(compresstype=auto,compresslevel=1)
distributed replicated;

create table catalog_returns
(
    cr_returned_date_sk       integer                       ,
    cr_returned_time_sk       integer                       ,
    cr_item_sk                integer               not null,
    cr_refunded_customer_sk   integer                       ,
    cr_refunded_cdemo_sk      integer                       ,
    cr_refunded_hdemo_sk      integer                       ,
    cr_refunded_addr_sk       integer                       ,
    cr_returning_customer_sk  integer                       ,
    cr_returning_cdemo_sk     integer                       ,
    cr_returning_hdemo_sk     integer                       ,
    cr_returning_addr_sk      integer                       ,
    cr_call_center_sk         integer                       ,
    cr_catalog_page_sk        integer                       ,
    cr_ship_mode_sk           integer                       ,
    cr_warehouse_sk           integer                       ,
    cr_reason_sk              integer                       ,
    cr_order_number           bigint                not null,
    cr_return_quantity        integer                       ,
    cr_return_amount          fixnumeric(19)                     ,
    cr_return_tax             fixnumeric(19)                     ,
    cr_return_amt_inc_tax     fixnumeric(19)                     ,
    cr_fee                    fixnumeric(19)                     ,
    cr_return_ship_cost       fixnumeric(19)                     ,
    cr_refunded_cash          fixnumeric(19)                     ,
    cr_reversed_charge        fixnumeric(19)                     ,
    cr_store_credit           fixnumeric(19)                     ,
    cr_net_loss               fixnumeric(19)                     
)
using beam
with(compresstype=auto,compresslevel=1)
distributed by (cr_item_sk)
partition by range (cr_returned_date_sk)
(
    start(2450815) INCLUSIVE end(2453005) INCLUSIVE every (8),
    default partition others
);

create table catalog_sales
(
    cs_sold_date_sk           integer                       ,
    cs_sold_time_sk           integer                       ,
    cs_ship_date_sk           integer                       ,
    cs_bill_customer_sk       integer                       ,
    cs_bill_cdemo_sk          integer                       ,
    cs_bill_hdemo_sk          integer                       ,
    cs_bill_addr_sk           integer                       ,
    cs_ship_customer_sk       integer                       ,
    cs_ship_cdemo_sk          integer                       ,
    cs_ship_hdemo_sk          integer                       ,
    cs_ship_addr_sk           integer                       ,
    cs_call_center_sk         integer                       ,
    cs_catalog_page_sk        integer                       ,
    cs_ship_mode_sk           integer                       ,
    cs_warehouse_sk           integer                       ,
    cs_item_sk                integer               not null,
    cs_promo_sk               integer                       ,
    cs_order_number           bigint                not null,
    cs_quantity               integer                       ,
    cs_wholesale_cost         fixnumeric(19)                     ,
    cs_list_price             fixnumeric(19)                     ,
    cs_sales_price            fixnumeric(19)                     ,
    cs_ext_discount_amt       fixnumeric(19)                     ,
    cs_ext_sales_price        fixnumeric(19)                     ,
    cs_ext_wholesale_cost     fixnumeric(19)                     ,
    cs_ext_list_price         fixnumeric(19)                     ,
    cs_ext_tax                fixnumeric(19)                     ,
    cs_coupon_amt             fixnumeric(19)                     ,
    cs_ext_ship_cost          fixnumeric(19)                     ,
    cs_net_paid               fixnumeric(19)                     ,
    cs_net_paid_inc_tax       fixnumeric(19)                     ,
    cs_net_paid_inc_ship      fixnumeric(19)                     ,
    cs_net_paid_inc_ship_tax  fixnumeric(19)                     ,
    cs_net_profit             fixnumeric(19)                     
)
using beam
with(compresstype=auto,compresslevel=1)
distributed by (cs_item_sk)
partition by range (cs_sold_date_sk)
(
    start(2450815) INCLUSIVE end(2453005) INCLUSIVE every (28),
    default partition others
);

create table web_returns
(
    wr_returned_date_sk       integer                       ,
    wr_returned_time_sk       integer                       ,
    wr_item_sk                integer               not null,
    wr_refunded_customer_sk   integer                       ,
    wr_refunded_cdemo_sk      integer                       ,
    wr_refunded_hdemo_sk      integer                       ,
    wr_refunded_addr_sk       integer                       ,
    wr_returning_customer_sk  integer                       ,
    wr_returning_cdemo_sk     integer                       ,
    wr_returning_hdemo_sk     integer                       ,
    wr_returning_addr_sk      integer                       ,
    wr_web_page_sk            integer                       ,
    wr_reason_sk              integer                       ,
    wr_order_number           bigint                not null,
    wr_return_quantity        integer                       ,
    wr_return_amt             fixnumeric(19)                     ,
    wr_return_tax             fixnumeric(19)                     ,
    wr_return_amt_inc_tax     fixnumeric(19)                     ,
    wr_fee                    fixnumeric(19)                     ,
    wr_return_ship_cost       fixnumeric(19)                     ,
    wr_refunded_cash          fixnumeric(19)                     ,
    wr_reversed_charge        fixnumeric(19)                     ,
    wr_account_credit         fixnumeric(19)                     ,
    wr_net_loss               fixnumeric(19)                     
)
using beam
with(compresstype=auto,compresslevel=1)
distributed by (wr_item_sk)
partition by range (wr_returned_date_sk)
(
    start(2450815) INCLUSIVE end(2453005) INCLUSIVE every (180),
    default partition others
);

create table web_sales
(
    ws_sold_date_sk           integer                       ,
    ws_sold_time_sk           integer                       ,
    ws_ship_date_sk           integer                       ,
    ws_item_sk                integer               not null,
    ws_bill_customer_sk       integer                       ,
    ws_bill_cdemo_sk          integer                       ,
    ws_bill_hdemo_sk          integer                       ,
    ws_bill_addr_sk           integer                       ,
    ws_ship_customer_sk       integer                       ,
    ws_ship_cdemo_sk          integer                       ,
    ws_ship_hdemo_sk          integer                       ,
    ws_ship_addr_sk           integer                       ,
    ws_web_page_sk            integer                       ,
    ws_web_site_sk            integer                       ,
    ws_ship_mode_sk           integer                       ,
    ws_warehouse_sk           integer                       ,
    ws_promo_sk               integer                       ,
    ws_order_number           bigint                not null,
    ws_quantity               integer                       ,
    ws_wholesale_cost         fixnumeric(19)                     ,
    ws_list_price             fixnumeric(19)                     ,
    ws_sales_price            fixnumeric(19)                     ,
    ws_ext_discount_amt       fixnumeric(19)                     ,
    ws_ext_sales_price        fixnumeric(19)                     ,
    ws_ext_wholesale_cost     fixnumeric(19)                     ,
    ws_ext_list_price         fixnumeric(19)                     ,
    ws_ext_tax                fixnumeric(19)                     ,
    ws_coupon_amt             fixnumeric(19)                     ,
    ws_ext_ship_cost          fixnumeric(19)                     ,
    ws_net_paid               fixnumeric(19)                     ,
    ws_net_paid_inc_tax       fixnumeric(19)                     ,
    ws_net_paid_inc_ship      fixnumeric(19)                     ,
    ws_net_paid_inc_ship_tax  fixnumeric(19)                     ,
    ws_net_profit             fixnumeric(19)                     
)
using beam
with(compresstype=auto,compresslevel=1)
distributed by (ws_item_sk)
partition by range (ws_sold_date_sk)
(
    start(2450815) INCLUSIVE end(2453005) INCLUSIVE every (40),
    default partition others
);

create table store_returns
(
    sr_returned_date_sk       integer                       ,
    sr_return_time_sk         integer                       ,
    sr_item_sk                integer               not null,
    sr_customer_sk            integer                       ,
    sr_cdemo_sk               integer                       ,
    sr_hdemo_sk               integer                       ,
    sr_addr_sk                integer                       ,
    sr_store_sk               integer                       ,
    sr_reason_sk              integer                       ,
    sr_ticket_number          bigint                not null,
    sr_return_quantity        integer                       ,
    sr_return_amt             fixnumeric(19)                     ,
    sr_return_tax             fixnumeric(19)                     ,
    sr_return_amt_inc_tax     fixnumeric(19)                     ,
    sr_fee                    fixnumeric(19)                     ,
    sr_return_ship_cost       fixnumeric(19)                     ,
    sr_refunded_cash          fixnumeric(19)                     ,
    sr_reversed_charge        fixnumeric(19)                     ,
    sr_store_credit           fixnumeric(19)                     ,
    sr_net_loss               fixnumeric(19)                     
)
using beam
with(compresstype=auto,compresslevel=1)
distributed by (sr_item_sk)
partition by range (sr_returned_date_sk)
(
    start(2450815) INCLUSIVE end(2453005) INCLUSIVE every (100),
    default partition others
);

create table store_sales
(
    ss_sold_date_sk           integer                       ,
    ss_sold_time_sk           integer                       ,
    ss_item_sk                integer               not null,
    ss_customer_sk            integer                       ,
    ss_cdemo_sk               integer                       ,
    ss_hdemo_sk               integer                       ,
    ss_addr_sk                integer                       ,
    ss_store_sk               integer                       ,
    ss_promo_sk               integer                       ,
    ss_ticket_number          bigint                not null,
    ss_quantity               integer                       ,
    ss_wholesale_cost         fixnumeric(19)                     ,
    ss_list_price             fixnumeric(19)                     ,
    ss_sales_price            fixnumeric(19)                     ,
    ss_ext_discount_amt       fixnumeric(19)                     ,
    ss_ext_sales_price        fixnumeric(19)                     ,
    ss_ext_wholesale_cost     fixnumeric(19)                     ,
    ss_ext_list_price         fixnumeric(19)                     ,
    ss_ext_tax                fixnumeric(19)                     ,
    ss_coupon_amt             fixnumeric(19)                     ,
    ss_net_paid               fixnumeric(19)                     ,
    ss_net_paid_inc_tax       fixnumeric(19)                     ,
    ss_net_profit             fixnumeric(19)                     
)
using beam
with(compresstype=auto,compresslevel=1)
distributed by (ss_item_sk)
partition by range (ss_sold_date_sk)
(
    start(2450815) INCLUSIVE end(2453005) INCLUSIVE every (10),
    default partition others
);

步骤四:创建OSS外表

OSS外表用于将OSS上的测试数据导入到AnalyticDB PostgreSQL实例中。更多关于OSS Foreign Table的使用方法,请参见使用OSS Foreign Table进行数据湖分析

CREATE SERVER oss_serv
    FOREIGN DATA WRAPPER oss_fdw
    OPTIONS (
        endpoint 'oss-cn-hang****.aliyuncs.com',
        bucket 'testBucketName'
  );
CREATE USER MAPPING FOR PUBLIC SERVER oss_serv OPTIONS (id 'LTAI****************', key 'yourAccessKeySecret');

CREATE FOREIGN TABLE ext_call_center(
    cc_call_center_sk         integer,
    cc_call_center_id         char(16),
    cc_rec_start_date         date,
    cc_rec_end_date           date,
    cc_closed_date_sk         integer,
    cc_open_date_sk           integer,
    cc_name                   varchar(50),
    cc_class                  varchar(50),
    cc_employees              integer,
    cc_sq_ft                  integer,
    cc_hours                  char(20),
    cc_manager                varchar(40),
    cc_mkt_id                 integer,
    cc_mkt_class              char(50),
    cc_mkt_desc               varchar(100),
    cc_market_manager         varchar(40),
    cc_division               integer,
    cc_division_name          varchar(50),
    cc_company                integer,
    cc_company_name           char(50),
    cc_street_number          char(10),
    cc_street_name            varchar(60),
    cc_street_type            char(15),
    cc_suite_number           char(10),
    cc_city                   varchar(60),
    cc_county                 varchar(30),
    cc_state                  char(2),
    cc_zip                    char(10),
    cc_country                varchar(20),
    cc_gmt_offset             fixnumeric,
    cc_tax_percentage         fixnumeric)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/call_center_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_catalog_page
(
    cp_catalog_page_sk        integer,
    cp_catalog_page_id        char(16),
    cp_start_date_sk          integer,
    cp_end_date_sk            integer,
    cp_department             varchar(50),
    cp_catalog_number         integer,
    cp_catalog_page_number    integer,
    cp_description            varchar(100),
    cp_type                   varchar(100)
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/catalog_page_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_customer
(
    c_customer_sk             integer,
    c_customer_id             char(16),
    c_current_cdemo_sk        integer,
    c_current_hdemo_sk        integer,
    c_current_addr_sk         integer,
    c_first_shipto_date_sk    integer,
    c_first_sales_date_sk     integer,
    c_salutation              char(10),
    c_first_name              char(20),
    c_last_name               char(30),
    c_preferred_cust_flag     char(1),
    c_birth_day               integer,
    c_birth_month             integer,
    c_birth_year              integer,
    c_birth_country           varchar(20),
    c_login                   char(13),
    c_email_address           char(50),
    c_last_review_date        char(10)
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/customer/customer_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_customer_address
(
    ca_address_sk             integer,
    ca_address_id             char(16),
    ca_street_number          char(10),
    ca_street_name            varchar(60),
    ca_street_type            char(15),
    ca_suite_number           char(10),
    ca_city                   varchar(60),
    ca_county                 varchar(30),
    ca_state                  char(2),
    ca_zip                    char(10),
    ca_country                varchar(20),
    ca_gmt_offset             fixnumeric,
    ca_location_type          char(20) 
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/customer_address_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_customer_demographics
(
    cd_demo_sk                integer,
    cd_gender                 char(1),
    cd_marital_status         char(1),
    cd_education_status       char(20),
    cd_purchase_estimate      integer,
    cd_credit_rating          char(10),
    cd_dep_count              integer,
    cd_dep_employed_count     integer,
    cd_dep_college_count      integer
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/customer_demographics_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_date_dim
(
    d_date_sk                 integer,
    d_date_id                 char(16),
    d_date                    date,
    d_month_seq               integer,
    d_week_seq                integer,
    d_quarter_seq             integer,
    d_year                    integer,
    d_dow                     integer,
    d_moy                     integer,
    d_dom                     integer,
    d_qoy                     integer,
    d_fy_year                 integer,
    d_fy_quarter_seq          integer,
    d_fy_week_seq             integer,
    d_day_name                char(9),
    d_quarter_name            char(6),
    d_holiday                 char(1),
    d_weekend                 char(1),
    d_following_holiday       char(1),
    d_first_dom               integer,
    d_last_dom                integer,
    d_same_day_ly             integer,
    d_same_day_lq             integer,
    d_current_day             char(1),
    d_current_week            char(1),
    d_current_month           char(1),
    d_current_quarter         char(1),
    d_current_year            char(1)
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/date_dim_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_household_demographics
(
    hd_demo_sk                integer,
    hd_income_band_sk         integer,
    hd_buy_potential          char(15),
    hd_dep_count              integer,
    hd_vehicle_count          integer 
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/household_demographics_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_income_band
(
    ib_income_band_sk         integer,
    ib_lower_bound            integer,
    ib_upper_bound            integer
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/income_band_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_inventory
(
    inv_date_sk               integer,
    inv_item_sk               integer,
    inv_warehouse_sk          integer,
    inv_quantity_on_hand      integer
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/inventory_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_item
(
    i_item_sk                 integer,
    i_item_id                 char(16),
    i_rec_start_date          date,
    i_rec_end_date            date,
    i_item_desc               varchar(200),
    i_current_price           fixnumeric,
    i_wholesale_cost          fixnumeric,
    i_brand_id                integer,
    i_brand                   char(50),
    i_class_id                integer,
    i_class                   char(50),
    i_category_id             integer,
    i_category                char(50),
    i_manufact_id             integer,
    i_manufact                char(50),
    i_size                    char(20),
    i_formulation             char(20),
    i_color                   char(20),
    i_units                   char(10),
    i_container               char(10),
    i_manager_id              integer,
    i_product_name            char(50)
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/item_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_promotion
(
    p_promo_sk                integer,
    p_promo_id                char(16),
    p_start_date_sk           integer,
    p_end_date_sk             integer,
    p_item_sk                 integer,
    p_cost                    fixnumeric,
    p_response_target         integer,
    p_promo_name              char(50),
    p_channel_dmail           char(1),
    p_channel_email           char(1),
    p_channel_catalog         char(1),
    p_channel_tv              char(1),
    p_channel_radio           char(1),
    p_channel_press           char(1),
    p_channel_event           char(1),
    p_channel_demo            char(1),
    p_channel_details         varchar(100),
    p_purpose                 char(15),
    p_discount_active         char(1) 
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/promotion_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_reason
(
    r_reason_sk               integer,
    r_reason_id               char(16),
    r_reason_desc             char(100)
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/reason_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_ship_mode
(
    sm_ship_mode_sk           integer,
    sm_ship_mode_id           char(16),
    sm_type                   char(30),
    sm_code                   char(10),
    sm_carrier                char(20),
    sm_contract               char(20)
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/ship_mode_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_store
(
    s_store_sk                integer,
    s_store_id                char(16),
    s_rec_start_date          date,
    s_rec_end_date            date,
    s_closed_date_sk          integer,
    s_store_name              varchar(50),
    s_number_employees        integer,
    s_floor_space             integer,
    s_hours                   char(20),
    s_manager                 varchar(40),
    s_market_id               integer,
    s_geography_class         varchar(100),
    s_market_desc             varchar(100),
    s_market_manager          varchar(40),
    s_division_id             integer,
    s_division_name           varchar(50),
    s_company_id              integer,
    s_company_name            varchar(50),
    s_street_number           varchar(10),
    s_street_name             varchar(60),
    s_street_type             char(15),
    s_suite_number            char(10),
    s_city                    varchar(60),
    s_county                  varchar(30),
    s_state                   char(2),
    s_zip                     char(10),
    s_country                 varchar(20),
    s_gmt_offset              fixnumeric,
    s_tax_precentage          fixnumeric  
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/store_1',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_time_dim
(
    t_time_sk                 integer,
    t_time_id                 char(16),
    t_time                    integer,
    t_hour                    integer,
    t_minute                  integer,
    t_second                  integer,
    t_am_pm                   char(2),
    t_shift                   char(20),
    t_sub_shift               char(20),
    t_meal_time               char(20)
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/time_dim_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_warehouse
(
    w_warehouse_sk            integer,
    w_warehouse_id            char(16),
    w_warehouse_name          varchar(20),
    w_warehouse_sq_ft         integer,
    w_street_number           char(10),
    w_street_name             varchar(60),
    w_street_type             char(15),
    w_suite_number            char(10),
    w_city                    varchar(60),
    w_county                  varchar(30),
    w_state                   char(2),
    w_zip                     char(10),
    w_country                 varchar(20),
    w_gmt_offset              fixnumeric
 )
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/warehouse_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_web_page
(
    wp_web_page_sk            integer,
    wp_web_page_id            char(16),
    wp_rec_start_date         date,
    wp_rec_end_date           date,
    wp_creation_date_sk       integer,
    wp_access_date_sk         integer,
    wp_autogen_flag           char(1),
    wp_customer_sk            integer,
    wp_url                    varchar(100),
    wp_type                   char(50),
    wp_char_count             integer,
    wp_link_count             integer,
    wp_image_count            integer,
    wp_max_ad_count           integer
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/web_page_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_web_site
(
    web_site_sk               integer,
    web_site_id               char(16),
    web_rec_start_date        date,
    web_rec_end_date          date,
    web_name                  varchar(50),
    web_open_date_sk          integer,
    web_close_date_sk         integer,
    web_class                 varchar(50),
    web_manager               varchar(40),
    web_mkt_id                integer,
    web_mkt_class             varchar(50) ,
    web_mkt_desc              varchar(100),
    web_market_manager        varchar(40),
    web_company_id            integer,
    web_company_name          char(50),
    web_street_number         char(10),
    web_street_name           varchar(60) ,
    web_street_type           char(15),
    web_suite_number          char(10),
    web_city                  varchar(60),
    web_county                varchar(30),
    web_state                 char(2),
    web_zip                   char(10),
    web_country               varchar(20),
    web_gmt_offset            fixnumeric,
    web_tax_percentage        fixnumeric 
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/web_site_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_catalog_returns
(
    cr_returned_date_sk       integer,
    cr_returned_time_sk       integer,
    cr_item_sk                integer,
    cr_refunded_customer_sk   integer,
    cr_refunded_cdemo_sk      integer,
    cr_refunded_hdemo_sk      integer,
    cr_refunded_addr_sk       integer,
    cr_returning_customer_sk  integer,
    cr_returning_cdemo_sk     integer,
    cr_returning_hdemo_sk     integer,
    cr_returning_addr_sk      integer,
    cr_call_center_sk         integer,
    cr_catalog_page_sk        integer,
    cr_ship_mode_sk           integer,
    cr_warehouse_sk           integer,
    cr_reason_sk              integer,
    cr_order_number           bigint,
    cr_return_quantity        integer,
    cr_return_amount          fixnumeric,
    cr_return_tax             fixnumeric,
    cr_return_amt_inc_tax     fixnumeric,
    cr_fee                    fixnumeric,
    cr_return_ship_cost       fixnumeric,
    cr_refunded_cash          fixnumeric,
    cr_reversed_charge        fixnumeric,
    cr_store_credit           fixnumeric,
    cr_net_loss               fixnumeric

SERVER oss_ser
    OPTIONS (
        prefix 'tpcds_1tb/catalog_returns_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_catalog_sales
(
    cs_sold_date_sk           integer,
    cs_sold_time_sk           integer,
    cs_ship_date_sk           integer,
    cs_bill_customer_sk       integer,
    cs_bill_cdemo_sk          integer,
    cs_bill_hdemo_sk          integer,
    cs_bill_addr_sk           integer,
    cs_ship_customer_sk       integer,
    cs_ship_cdemo_sk          integer,
    cs_ship_hdemo_sk          integer,
    cs_ship_addr_sk           integer,
    cs_call_center_sk         integer,
    cs_catalog_page_sk        integer,
    cs_ship_mode_sk           integer,
    cs_warehouse_sk           integer,
    cs_item_sk                integer,
    cs_promo_sk               integer,
    cs_order_number           bigint,
    cs_quantity               integer,
    cs_wholesale_cost         fixnumeric,
    cs_list_price             fixnumeric,
    cs_sales_price            fixnumeric,
    cs_ext_discount_amt       fixnumeric,
    cs_ext_sales_price        fixnumeric,
    cs_ext_wholesale_cost     fixnumeric,
    cs_ext_list_price         fixnumeric,
    cs_ext_tax                fixnumeric,
    cs_coupon_amt             fixnumeric,
    cs_ext_ship_cost          fixnumeric,
    cs_net_paid               fixnumeric,
    cs_net_paid_inc_tax       fixnumeric,
    cs_net_paid_inc_ship      fixnumeric,
    cs_net_paid_inc_ship_tax  fixnumeric,
    cs_net_profit             fixnumeric
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/catalog_sales_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_web_returns
(
    wr_returned_date_sk       integer,
    wr_returned_time_sk       integer,
    wr_item_sk                integer,
    wr_refunded_customer_sk   integer,
    wr_refunded_cdemo_sk      integer,
    wr_refunded_hdemo_sk      integer,
    wr_refunded_addr_sk       integer,
    wr_returning_customer_sk  integer,
    wr_returning_cdemo_sk     integer,
    wr_returning_hdemo_sk     integer,
    wr_returning_addr_sk      integer,
    wr_web_page_sk            integer,
    wr_reason_sk              integer,
    wr_order_number           bigint,
    wr_return_quantity        integer,
    wr_return_amt             fixnumeric,
    wr_return_tax             fixnumeric,
    wr_return_amt_inc_tax     fixnumeric,
    wr_fee                    fixnumeric,
    wr_return_ship_cost       fixnumeric,
    wr_refunded_cash          fixnumeric,
    wr_reversed_charge        fixnumeric,
    wr_account_credit         fixnumeric,
    wr_net_loss               fixnumeric
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/web_returns_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_web_sales
(
    ws_sold_date_sk           integer,
    ws_sold_time_sk           integer,
    ws_ship_date_sk           integer,
    ws_item_sk                integer,
    ws_bill_customer_sk       integer,
    ws_bill_cdemo_sk          integer,
    ws_bill_hdemo_sk          integer,
    ws_bill_addr_sk           integer,
    ws_ship_customer_sk       integer,
    ws_ship_cdemo_sk          integer,
    ws_ship_hdemo_sk          integer,
    ws_ship_addr_sk           integer,
    ws_web_page_sk            integer,
    ws_web_site_sk            integer,
    ws_ship_mode_sk           integer,
    ws_warehouse_sk           integer,
    ws_promo_sk               integer,
    ws_order_number           bigint,
    ws_quantity               integer,
    ws_wholesale_cost         fixnumeric,
    ws_list_price             fixnumeric,
    ws_sales_price            fixnumeric,
    ws_ext_discount_amt       fixnumeric,
    ws_ext_sales_price        fixnumeric,
    ws_ext_wholesale_cost     fixnumeric,
    ws_ext_list_price         fixnumeric,
    ws_ext_tax                fixnumeric,
    ws_coupon_amt             fixnumeric,
    ws_ext_ship_cost          fixnumeric,
    ws_net_paid               fixnumeric,
    ws_net_paid_inc_tax       fixnumeric,
    ws_net_paid_inc_ship      fixnumeric,
    ws_net_paid_inc_ship_tax  fixnumeric,
    ws_net_profit             fixnumeric
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/web_sales_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_store_returns
(
    sr_returned_date_sk       integer,
    sr_return_time_sk         integer,
    sr_item_sk                integer,
    sr_customer_sk            integer,
    sr_cdemo_sk               integer,
    sr_hdemo_sk               integer,
    sr_addr_sk                integer,
    sr_store_sk               integer,
    sr_reason_sk              integer,
    sr_ticket_number          bigint,
    sr_return_quantity        integer,
    sr_return_amt             fixnumeric,
    sr_return_tax             fixnumeric,
    sr_return_amt_inc_tax     fixnumeric,
    sr_fee                    fixnumeric,
    sr_return_ship_cost       fixnumeric,
    sr_refunded_cash          fixnumeric,
    sr_reversed_charge        fixnumeric,
    sr_store_credit           fixnumeric,
    sr_net_loss               fixnumeric
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/store_returns_',
        format 'csv',
        DELIMITER '|'
    );

CREATE FOREIGN TABLE ext_store_sales
(
    ss_sold_date_sk           integer,
    ss_sold_time_sk           integer,
    ss_item_sk                integer NOT NULL,
    ss_customer_sk            integer,
    ss_cdemo_sk               integer,
    ss_hdemo_sk               integer,
    ss_addr_sk                integer,
    ss_store_sk               integer,
    ss_promo_sk               integer,
    ss_ticket_number          bigint NOT NULL,
    ss_quantity               integer,
    ss_wholesale_cost         fixnumeric,
    ss_list_price             fixnumeric,
    ss_sales_price            fixnumeric,
    ss_ext_discount_amt       fixnumeric,
    ss_ext_sales_price        fixnumeric,
    ss_ext_wholesale_cost     fixnumeric,
    ss_ext_list_price         fixnumeric,
    ss_ext_tax                fixnumeric,
    ss_coupon_amt             fixnumeric,
    ss_net_paid               fixnumeric,
    ss_net_paid_inc_tax       fixnumeric,
    ss_net_profit             fixnumeric
)
SERVER oss_serv
    OPTIONS (
        prefix 'tpcds_1tb/store_sales_',
        format 'csv',
        DELIMITER '|'
    );

步骤五:导入数据

  1. 通过OSS外表将数据导入到AnalyticDB PostgreSQL数据表中。

    insert into call_center            select * from ext_call_center;           
    insert into catalog_page           select * from ext_catalog_page;          
    insert into catalog_returns        select * from ext_catalog_returns;       
    insert into catalog_sales          select * from ext_catalog_sales;         
    insert into customer               select * from ext_customer;              
    insert into customer_address       select * from ext_customer_address;      
    insert into customer_demographics  select * from ext_customer_demographics; 
    insert into date_dim               select * from ext_date_dim;                         
    insert into household_demographics select * from ext_household_demographics;
    insert into income_band            select * from ext_income_band;          
    insert into inventory              select * from ext_inventory;             
    insert into item                   select * from ext_item;                  
    insert into promotion              select * from ext_promotion;             
    insert into reason                 select * from ext_reason;                
    insert into ship_mode              select * from ext_ship_mode;             
    insert into store                  select * from ext_store;                 
    insert into store_returns          select * from ext_store_returns;         
    insert into store_sales            select * from ext_store_sales;           
    insert into time_dim               select * from ext_time_dim;              
    insert into warehouse              select * from ext_warehouse;             
    insert into web_page               select * from ext_web_page;
    insert into web_returns            select * from ext_web_returns;           
    insert into web_sales              select * from ext_web_sales;             
    insert into web_site               select * from ext_web_site;
  2. 优化表并收集统计信息。

    optimize call_center           ;
    optimize catalog_page          ;
    optimize catalog_returns       ;
    optimize catalog_sales         ;
    optimize customer              ;
    optimize customer_address      ;
    optimize customer_demographics ;
    optimize date_dim              ;
    optimize household_demographics;
    optimize income_band           ;
    optimize inventory             ;
    optimize item                  ;
    optimize promotion             ;
    optimize reason                ;
    optimize ship_mode             ;
    optimize store                 ;
    optimize store_returns         ;
    optimize store_sales           ;
    optimize time_dim              ;
    optimize warehouse             ;
    optimize web_page              ;
    optimize web_returns           ;
    optimize web_sales             ;
    optimize web_site              ;
    
    FLUSH call_center           ;
    FLUSH catalog_page          ;
    FLUSH catalog_returns       ;
    FLUSH catalog_sales         ;
    FLUSH customer              ;
    FLUSH customer_address      ;
    FLUSH customer_demographics ;
    FLUSH date_dim              ;
    FLUSH household_demographics;
    FLUSH income_band           ;
    FLUSH inventory             ;
    FLUSH item                  ;
    FLUSH promotion             ;
    FLUSH reason                ;
    FLUSH ship_mode             ;
    FLUSH store                 ;
    FLUSH store_returns         ;
    FLUSH store_sales           ;
    FLUSH time_dim              ;
    FLUSH warehouse             ;
    FLUSH web_page              ;
    FLUSH web_returns           ;
    FLUSH web_sales             ;
    FLUSH web_site              ;
    
    analyze fullscan call_center           ;
    analyze fullscan catalog_page          ;
    analyze fullscan catalog_returns       ;
    analyze fullscan catalog_sales         ;
    analyze fullscan customer              ;
    analyze fullscan customer_address      ;
    analyze fullscan customer_demographics ;
    analyze fullscan date_dim              ;
    analyze fullscan household_demographics;
    analyze fullscan income_band           ;
    analyze fullscan inventory             ;
    analyze fullscan item                  ;
    analyze fullscan promotion             ;
    analyze fullscan reason                ;
    analyze fullscan ship_mode             ;
    analyze fullscan store                 ;
    analyze fullscan store_returns         ;
    analyze fullscan store_sales           ;
    analyze fullscan time_dim              ;
    analyze fullscan warehouse             ;
    analyze fullscan web_page              ;
    analyze fullscan web_returns           ;
    analyze fullscan web_sales             ;
    analyze fullscan web_site              ;
  3. 在导入数据之后,后台会对表采集统计信息并做排序。执行以下查询,若不返回结果,即可进行测试。

    select datname, query, state from pg_stat_activity 
      where state <> 'idle' and backend_type like '%autovacuum%';

数据量参考

导入完成后,各表的数据量参考如下:

表名

100 GB行数

1 TB行数

call_center

30

42

catalog_page

20400

30000

catalog_returns

14404374

143996756

catalog_sales

143997065

1439980416

customer

2000000

12000000

customer_address

1000000

6000000

customer_demographics

1920800

1920800

date_dim

73049

73049

household_demographics

7200

7200

income_band

20

20

inventory

399330000

783000000

item

204000

300000

promotion

1000

1500

reason

55

65

ship_mode

20

20

store

402

1002

store_returns

28795080

287999764

store_sales

287997024

2879987999

time_dim

86400

86400

warehouse

15

20

web_page

2040

3000

web_returns

7197670

71997522

web_sales

72001237

720000376

web_site

24

54

步骤六:生成测试SQL

使用TPC-DS工具包中的dsqgen生成99条测试SQL。

for((i=1;i<=99;i++));
do
./dsqgen -SCALE 100 -DIRECTORY $DIR/tpcds-kit/query_templates/ -TEMPLATE query$i.tpl -FILTER Y -DIALECT netezza > ../query/q$i.sql
done

步骤七:执行开箱配置

请将your_db替换为实际的数据库名称。

ALTER DATABASE your_db set laser.enable_vectorized_engine to on;
ALTER DATABASE your_db set statement_mem = '16GB';

步骤八:运行测试

  1. 创建测试目录,并将测试脚本和SQL文件放入对应目录。

    .
    ├── benchmark.sh
    └── query
        ├── q1.sql
        ├── q2.sql
        ├── ...
        └── q99.sql
    • 保存以下测试脚本至./benchmark/benchmark.sh

      #!/bin/bash
      
      usage() {
        cat <<EOF
      Usage:
        $0 <database> <round> <host>
      
      Params:
        database  必填,数据库名
        round     必填,执行轮数(正整数)
        host      必填,数据库主机地址,如 127.0.0.1 或 hostname
      
      Examples:
        # 在本机执行 adbpg,执行 3 轮
        $0 tpcds_db 3 127.0.0.1
      
        # 在远程主机执行 1 轮
        $0 tpcds_db 1 10.0.0.5
      EOF
      }
      
      # 参数: $1=database, $2=round, $3=host
      DATABASE="$1"
      ROUND="${2:-1}"
      HOST="$3"
      
      # 参数检查
      if [ -z "$DATABASE" ] || [ -z "$ROUND" ] || [ -z "$HOST" ]; then
        usage
        exit 1
      fi
      
      echo "Database: $DATABASE"
      echo "Round: $ROUND"
      echo "Host: $HOST"
      
      HOME_PATH=$(pwd)
      QUERY_PATH="$HOME_PATH/query/"
      LOG_PATH="$HOME_PATH/log/"
      mkdir -p "$LOG_PATH"
      
      timestamp=$(date +%Y%m%d_%H%M%S)
      log_file="$HOME_PATH/tpcds_query_rt_${timestamp}.csv"
      echo "query,cost,total_cost" > "$log_file"
      
      total_cost=0
      for ((i=1; i<=99; i++)); do
          last_cost=0
          for ((r=1; r<=ROUND; r++)); do
              echo "begin run TPC-DS Q${i} round ${r}/$ROUND on adbpg (db: $DATABASE, host: $HOST), $(date)"
              begin_time=$(date +%s.%N)
      
              psql -h "$HOST" -d "$DATABASE" -f "$QUERY_PATH/q$i.sql" > "$LOG_PATH/log${i}_r${r}.out"
              rc=$?
      
              end_time=$(date +%s.%N)
              last_cost=$(awk "BEGIN {print $end_time - $begin_time}")
              if [ $rc -ne 0 ] ; then
                  printf "run TPC-DS Q%s round %s on adbpg fail, cost: %.2f, %s\n" "$i" "$r" "$last_cost" "$(date)"
              else
                  printf "run TPC-DS Q%s round %s on adbpg succ, cost: %.2f, %s\n" "$i" "$r" "$last_cost" "$(date)"
              fi
          done
          cost=$last_cost
          total_cost=$(awk "BEGIN {print $total_cost + $cost}")
          printf "TPC-DS Q%s final (round %d): cost: %.2f, totalCost: %.2f, %s\n" "$i" "$ROUND" "$cost" "$total_cost" "$(date)"
          printf "Q%s,%.2f,%.2f\n" "$i" "$cost" "$total_cost" >> "$log_file"
      done
    • 将通过dsqgen生成的99SQL保存至./benchmark/query/目录下(文件名为q1.sqlq99.sql)。

  2. 进入benchmark目录,执行以下命令运行测试。

    sh benchmark.sh [database] 2 [host]

    请将[database]替换为实际的数据库名称,将[host]替换为实际的数据库主机地址。