本文为您介绍如何使用TPC-DS(决策支持基准测试)对Hologres的OLAP查询场景进行性能测试。
TPC-DS简介
TPC-DS是由TPC(Transaction Processing Performance Council)组织制定的决策支持基准测试,用于模拟复杂的决策支持系统场景,包含多维度分析查询。
TPC-DS基准测试具有以下特征:
-
包含24张表,其中7张事实表和17张维表,覆盖零售业务的多个分析维度。
-
包含99条复杂OLAP查询语句,涉及多表关联、子查询、窗口函数等高级SQL特性。
-
主要评价指标为各查询的响应时间,用于衡量系统在复杂分析场景下的处理能力。
本文的TPC-DS实现基于TPC-DS的基准测试,并不能与已发布的TPC-DS基准测试结果相比较,本文中的测试并不符合TPC-DS基准测试的所有要求。
数据集简介
TPC-DS数据集包含24张表,分为事实表和维表两类。
事实表(7张)
|
表名 |
说明 |
|
STORE_SALES |
门店销售事实表 |
|
STORE_RETURNS |
门店退货事实表 |
|
CATALOG_SALES |
目录销售事实表 |
|
CATALOG_RETURNS |
目录退货事实表 |
|
WEB_SALES |
网络销售事实表 |
|
WEB_RETURNS |
网络退货事实表 |
|
INVENTORY |
库存事实表 |
维表(17张)
|
表名 |
说明 |
|
CUSTOMER_ADDRESS |
客户地址维表 |
|
CUSTOMER_DEMOGRAPHICS |
客户人口统计维表 |
|
DATE_DIM |
日期维表 |
|
WAREHOUSE |
仓库维表 |
|
SHIP_MODE |
物流方式维表 |
|
TIME_DIM |
时间维表 |
|
REASON |
退货原因维表 |
|
INCOME_BAND |
收入区间维表 |
|
ITEM |
商品维表 |
|
STORE |
门店维表 |
|
CALL_CENTER |
呼叫中心维表 |
|
CUSTOMER |
客户维表 |
|
WEB_SITE |
网站维表 |
|
HOUSEHOLD_DEMOGRAPHICS |
家庭人口统计维表 |
|
WEB_PAGE |
网页维表 |
|
PROMOTION |
促销维表 |
|
CATALOG_PAGE |
目录页维表 |
表结构定义(DDL)
以下为24张TPC-DS表的完整建表语句,包含表结构定义及Hologres表属性配置。
CUSTOMER_ADDRESS
CREATE TABLE CUSTOMER_ADDRESS
(
CA_ADDRESS_SK INT NOT NULL,
ca_address_id TEXT NOT NULL,
ca_street_number TEXT ,
CA_STREET_NAME TEXT ,
CA_STREET_TYPE TEXT ,
CA_SUITE_NUMBER TEXT ,
CA_CITY TEXT ,
CA_COUNTY TEXT ,
CA_STATE TEXT ,
CA_ZIP TEXT ,
CA_COUNTRY TEXT ,
CA_GMT_OFFSET DECIMAL(5,2) ,
CA_LOCATION_TYPE TEXT ,
PRIMARY KEY (CA_ADDRESS_SK)
)
WITH (
shard_count = '64',
distribution_key = 'CA_ADDRESS_SK',
clustering_key = 'CA_ADDRESS_SK'
);
CUSTOMER_DEMOGRAPHICS
CREATE TABLE CUSTOMER_DEMOGRAPHICS
(
CD_DEMO_SK INT NOT NULL,
CD_GENDER TEXT ,
CD_MARITAL_STATUS TEXT ,
CD_EDUCATION_STATUS TEXT ,
CD_PURCHASE_ESTIMATE INTEGER ,
CD_CREDIT_RATING TEXT ,
CD_DEP_COUNT INTEGER ,
CD_DEP_EMPLOYED_COUNT INTEGER ,
CD_DEP_COLLEGE_COUNT INTEGER ,
PRIMARY KEY (CD_DEMO_SK)
)
WITH (
distribution_key = 'CD_DEMO_SK',
clustering_key = 'CD_DEMO_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
DATE_DIM
CREATE TABLE DATE_DIM
(
D_DATE_SK INT NOT NULL,
D_DATE_ID TEXT NOT NULL,
D_DATE DATE NOT NULL,
D_MONTH_SEQ INTEGER ,
D_WEEK_SEQ INTEGER ,
D_QUARTER_SEQ INTEGER ,
D_YEAR INTEGER ,
D_DOW INTEGER ,
D_MOY INTEGER ,
D_DOM INTEGER ,
D_QOY INTEGER ,
D_FY_YEAR INTEGER ,
D_FY_QUARTER_SEQ INTEGER ,
D_FY_WEEK_SEQ INTEGER ,
D_DAY_NAME TEXT ,
D_QUARTER_NAME TEXT ,
D_HOLIDAY TEXT ,
D_WEEKEND TEXT ,
D_FOLLOWING_HOLIDAY TEXT ,
D_FIRST_DOM INTEGER ,
D_LAST_DOM INTEGER ,
D_SAME_DAY_LY INTEGER ,
D_SAME_DAY_LQ INTEGER ,
D_CURRENT_DAY TEXT ,
D_CURRENT_WEEK TEXT ,
D_CURRENT_MONTH TEXT ,
D_CURRENT_QUARTER TEXT ,
D_CURRENT_YEAR TEXT ,
PRIMARY KEY (D_DATE_SK)
)
WITH (
distribution_key = 'D_DATE_SK',
clustering_key = 'D_DATE_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
WAREHOUSE
CREATE TABLE WAREHOUSE
(
W_WAREHOUSE_SK INT NOT NULL,
W_WAREHOUSE_ID TEXT NOT NULL,
W_WAREHOUSE_NAME TEXT ,
W_WAREHOUSE_SQ_FT INTEGER ,
W_STREET_NUMBER TEXT ,
W_STREET_NAME TEXT ,
W_STREET_TYPE TEXT ,
W_SUITE_NUMBER TEXT ,
W_CITY TEXT ,
W_COUNTY TEXT ,
W_STATE TEXT ,
W_ZIP TEXT ,
W_COUNTRY TEXT ,
W_GMT_OFFSET DECIMAL(5,2) ,
PRIMARY KEY (W_WAREHOUSE_SK)
)
WITH (
distribution_key = 'W_WAREHOUSE_SK',
clustering_key = 'W_WAREHOUSE_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
SHIP_MODE
CREATE TABLE SHIP_MODE
(
SM_SHIP_MODE_SK INT NOT NULL,
SM_SHIP_MODE_ID TEXT NOT NULL,
SM_TYPE TEXT ,
SM_CODE TEXT ,
SM_CARRIER TEXT ,
SM_CONTRACT TEXT ,
PRIMARY KEY (SM_SHIP_MODE_SK)
)
WITH (
distribution_key = 'SM_SHIP_MODE_SK',
clustering_key = 'SM_SHIP_MODE_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
TIME_DIM
CREATE TABLE TIME_DIM
(
T_TIME_SK INT NOT NULL,
T_TIME_ID TEXT NOT NULL,
T_TIME INTEGER NOT NULL,
T_HOUR INTEGER ,
T_MINUTE INTEGER ,
T_SECOND INTEGER ,
T_AM_PM TEXT ,
T_SHIFT TEXT ,
T_SUB_SHIFT TEXT ,
T_MEAL_TIME TEXT ,
PRIMARY KEY (T_TIME_SK)
)
WITH (
distribution_key = 'T_TIME_SK',
clustering_key = 'T_TIME_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
REASON
CREATE TABLE REASON
(
R_REASON_SK INT NOT NULL,
R_REASON_ID TEXT NOT NULL,
R_REASON_DESC TEXT ,
PRIMARY KEY (R_REASON_SK)
)
WITH (
distribution_key = 'R_REASON_SK',
clustering_key = 'R_REASON_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
INCOME_BAND
CREATE TABLE INCOME_BAND
(
IB_INCOME_BAND_SK INT NOT NULL,
IB_LOWER_BOUND INTEGER ,
IB_UPPER_BOUND INTEGER ,
PRIMARY KEY (IB_INCOME_BAND_SK)
)
WITH (
distribution_key = 'IB_INCOME_BAND_SK',
clustering_key = 'IB_INCOME_BAND_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
ITEM
CREATE TABLE ITEM
(
I_ITEM_SK INT NOT NULL,
I_ITEM_ID TEXT NOT NULL,
I_REC_START_DATE DATE ,
I_REC_END_DATE DATE ,
I_ITEM_DESC TEXT ,
I_CURRENT_PRICE DECIMAL(7,2) ,
I_WHOLESALE_COST DECIMAL(7,2) ,
I_BRAND_ID INTEGER ,
I_BRAND TEXT ,
I_CLASS_ID INTEGER ,
I_CLASS TEXT ,
I_CATEGORY_ID INTEGER ,
I_CATEGORY TEXT ,
I_MANUFACT_ID INTEGER ,
I_MANUFACT TEXT ,
I_SIZE TEXT ,
I_FORMULATION TEXT ,
I_COLOR TEXT ,
I_UNITS TEXT ,
I_CONTAINER TEXT ,
I_MANAGER_ID INTEGER ,
I_PRODUCT_NAME TEXT ,
PRIMARY KEY (I_ITEM_SK)
)
WITH (
distribution_key = 'I_ITEM_SK',
clustering_key = 'I_ITEM_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
STORE
CREATE TABLE STORE
(
S_STORE_SK INT NOT NULL,
S_STORE_ID TEXT NOT NULL,
S_REC_START_DATE DATE ,
S_REC_END_DATE DATE ,
S_CLOSED_DATE_SK INT ,
S_STORE_NAME TEXT ,
S_NUMBER_EMPLOYEES INTEGER ,
S_FLOOR_SPACE INTEGER ,
S_HOURS TEXT ,
S_MANAGER TEXT ,
S_MARKET_ID INTEGER ,
S_GEOGRAPHY_CLASS TEXT ,
S_MARKET_DESC TEXT ,
S_MARKET_MANAGER TEXT ,
S_DIVISION_ID INTEGER ,
S_DIVISION_NAME TEXT ,
S_COMPANY_ID INTEGER ,
S_COMPANY_NAME TEXT ,
S_STREET_NUMBER TEXT ,
S_STREET_NAME TEXT ,
S_STREET_TYPE TEXT ,
S_SUITE_NUMBER TEXT ,
S_CITY TEXT ,
S_COUNTY TEXT ,
S_STATE TEXT ,
S_ZIP TEXT ,
S_COUNTRY TEXT ,
S_GMT_OFFSET DECIMAL(5,2) ,
S_TAX_PERCENTAGE DECIMAL(5,2) ,
PRIMARY KEY (S_STORE_SK)
)
WITH (
distribution_key = 'S_STORE_SK',
clustering_key = 'S_STORE_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
CALL_CENTER
CREATE TABLE CALL_CENTER
(
CC_CALL_CENTER_SK INT NOT NULL,
CC_CALL_CENTER_ID TEXT NOT NULL,
CC_REC_START_DATE DATE ,
CC_REC_END_DATE DATE ,
CC_CLOSED_DATE_SK INT ,
CC_OPEN_DATE_SK INT ,
CC_NAME TEXT ,
CC_CLASS TEXT ,
CC_EMPLOYEES INTEGER ,
CC_SQ_FT INTEGER ,
CC_HOURS TEXT ,
CC_MANAGER TEXT ,
CC_MKT_ID INTEGER ,
CC_MKT_CLASS TEXT ,
CC_MKT_DESC TEXT ,
CC_MARKET_MANAGER TEXT ,
CC_DIVISION INTEGER ,
CC_DIVISION_NAME TEXT ,
CC_COMPANY INTEGER ,
CC_COMPANY_NAME TEXT ,
CC_STREET_NUMBER TEXT ,
CC_STREET_NAME TEXT ,
CC_STREET_TYPE TEXT ,
CC_SUITE_NUMBER TEXT ,
CC_CITY TEXT ,
CC_COUNTY TEXT ,
CC_STATE TEXT ,
CC_ZIP TEXT ,
CC_COUNTRY TEXT ,
CC_GMT_OFFSET DECIMAL(5,2) ,
CC_TAX_PERCENTAGE DECIMAL(5,2) ,
PRIMARY KEY (CC_CALL_CENTER_SK)
)
WITH (
distribution_key = 'CC_CALL_CENTER_SK',
clustering_key = 'CC_CALL_CENTER_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
CUSTOMER
CREATE TABLE CUSTOMER
(
C_CUSTOMER_SK INT NOT NULL,
C_CUSTOMER_ID TEXT NOT NULL,
C_CURRENT_CDEMO_SK INT ,
C_CURRENT_HDEMO_SK INT ,
C_CURRENT_ADDR_SK INT ,
C_FIRST_SHIPTO_DATE_SK INT ,
C_FIRST_SALES_DATE_SK INT ,
C_SALUTATION TEXT ,
C_FIRST_NAME TEXT ,
C_LAST_NAME TEXT ,
C_PREFERRED_CUST_FLAG TEXT ,
C_BIRTH_DAY INTEGER ,
C_BIRTH_MONTH INTEGER ,
C_BIRTH_YEAR INTEGER ,
C_BIRTH_COUNTRY TEXT ,
C_LOGIN TEXT ,
C_EMAIL_ADDRESS TEXT ,
C_LAST_REVIEW_DATE_SK INT ,
PRIMARY KEY (C_CUSTOMER_SK)
)
WITH (
distribution_key = 'C_CUSTOMER_SK',
clustering_key = 'C_CUSTOMER_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
WEB_SITE
CREATE TABLE WEB_SITE
(
WEB_SITE_SK INT NOT NULL,
WEB_SITE_ID TEXT NOT NULL,
WEB_REC_START_DATE DATE ,
WEB_REC_END_DATE DATE ,
WEB_NAME TEXT ,
WEB_OPEN_DATE_SK INT ,
WEB_CLOSE_DATE_SK INT ,
WEB_CLASS TEXT ,
WEB_MANAGER TEXT ,
WEB_MKT_ID INTEGER ,
WEB_MKT_CLASS TEXT ,
WEB_MKT_DESC TEXT ,
WEB_MARKET_MANAGER TEXT ,
WEB_COMPANY_ID INTEGER ,
WEB_COMPANY_NAME TEXT ,
WEB_STREET_NUMBER TEXT ,
WEB_STREET_NAME TEXT ,
WEB_STREET_TYPE TEXT ,
WEB_SUITE_NUMBER TEXT ,
WEB_CITY TEXT ,
WEB_COUNTY TEXT ,
WEB_STATE TEXT ,
WEB_ZIP TEXT ,
WEB_COUNTRY TEXT ,
WEB_GMT_OFFSET DECIMAL(5,2) ,
WEB_TAX_PERCENTAGE DECIMAL(5,2) ,
PRIMARY KEY (WEB_SITE_SK)
)
WITH (
distribution_key = 'WEB_SITE_SK',
clustering_key = 'WEB_SITE_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
STORE_RETURNS
CREATE TABLE STORE_RETURNS
(
SR_RETURNED_DATE_SK INT ,
SR_RETURN_TIME_SK INT ,
SR_ITEM_SK INT NOT NULL,
SR_CUSTOMER_SK INT ,
SR_CDEMO_SK INT ,
SR_HDEMO_SK INT ,
SR_ADDR_SK INT ,
SR_STORE_SK INT ,
SR_REASON_SK INT ,
SR_TICKET_NUMBER INT NOT NULL,
SR_RETURN_QUANTITY INTEGER ,
SR_RETURN_AMT DECIMAL(7,2) ,
SR_RETURN_TAX DECIMAL(7,2) ,
SR_RETURN_AMT_INC_TAX DECIMAL(7,2) ,
SR_FEE DECIMAL(7,2) ,
SR_RETURN_SHIP_COST DECIMAL(7,2) ,
SR_REFUNDED_CASH DECIMAL(7,2) ,
SR_REVERSED_CHARGE DECIMAL(7,2) ,
SR_STORE_CREDIT DECIMAL(7,2) ,
SR_NET_LOSS DECIMAL(7,2)
)
WITH (
distribution_key = 'SR_ITEM_SK',
clustering_key = 'sr_returned_date_sk,SR_ITEM_SK,SR_TICKET_NUMBER',
colocate_with = 'CUSTOMER_ADDRESS'
);
HOUSEHOLD_DEMOGRAPHICS
CREATE TABLE HOUSEHOLD_DEMOGRAPHICS
(
HD_DEMO_SK INT NOT NULL,
HD_INCOME_BAND_SK INT ,
HD_BUY_POTENTIAL TEXT ,
HD_DEP_COUNT INTEGER ,
HD_VEHICLE_COUNT INTEGER ,
PRIMARY KEY (HD_DEMO_SK)
)
WITH (
distribution_key = 'HD_DEMO_SK',
clustering_key = 'HD_DEMO_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
WEB_PAGE
CREATE TABLE WEB_PAGE
(
WP_WEB_PAGE_SK INT NOT NULL,
WP_WEB_PAGE_ID TEXT NOT NULL,
WP_REC_START_DATE DATE ,
WP_REC_END_DATE DATE ,
WP_CREATION_DATE_SK INT ,
WP_ACCESS_DATE_SK INT ,
WP_AUTOGEN_FLAG TEXT ,
WP_CUSTOMER_SK INT ,
WP_URL TEXT ,
WP_TYPE TEXT ,
WP_CHAR_COUNT INTEGER ,
WP_LINK_COUNT INTEGER ,
WP_IMAGE_COUNT INTEGER ,
WP_MAX_AD_COUNT INTEGER ,
PRIMARY KEY (WP_WEB_PAGE_SK)
)
WITH (
distribution_key = 'WP_WEB_PAGE_SK',
clustering_key = 'WP_WEB_PAGE_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
PROMOTION
CREATE TABLE PROMOTION
(
P_PROMO_SK INT NOT NULL,
P_PROMO_ID TEXT NOT NULL,
P_START_DATE_SK INT ,
P_END_DATE_SK INT ,
P_ITEM_SK INT ,
P_COST DECIMAL(15,2) ,
P_RESPONSE_TARGET INTEGER ,
P_PROMO_NAME TEXT ,
P_CHANNEL_DMAIL TEXT ,
P_CHANNEL_EMAIL TEXT ,
P_CHANNEL_CATALOG TEXT ,
P_CHANNEL_TV TEXT ,
P_CHANNEL_RADIO TEXT ,
P_CHANNEL_PRESS TEXT ,
P_CHANNEL_EVENT TEXT ,
P_CHANNEL_DEMO TEXT ,
P_CHANNEL_DETAILS TEXT ,
P_PURPOSE TEXT ,
P_DISCOUNT_ACTIVE TEXT ,
PRIMARY KEY (P_PROMO_SK)
)
WITH (
distribution_key = 'P_PROMO_SK',
clustering_key = 'P_PROMO_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
CATALOG_PAGE
CREATE TABLE CATALOG_PAGE
(
CP_CATALOG_PAGE_SK INT NOT NULL,
CP_CATALOG_PAGE_ID TEXT NOT NULL,
CP_START_DATE_SK INT ,
CP_END_DATE_SK INT ,
CP_DEPARTMENT TEXT ,
CP_CATALOG_NUMBER INTEGER ,
CP_CATALOG_PAGE_NUMBER INTEGER ,
CP_DESCRIPTION TEXT ,
CP_TYPE TEXT ,
PRIMARY KEY (CP_CATALOG_PAGE_SK)
)
WITH (
distribution_key = 'CP_CATALOG_PAGE_SK',
clustering_key = 'CP_CATALOG_PAGE_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
INVENTORY
CREATE TABLE INVENTORY
(
INV_DATE_SK INT NOT NULL,
INV_ITEM_SK INT NOT NULL,
INV_WAREHOUSE_SK INT NOT NULL,
INV_QUANTITY_ON_HAND INTEGER
)
WITH (
distribution_key = 'inv_item_sk',
clustering_key = 'inv_date_sk, INV_ITEM_SK, INV_WAREHOUSE_SK',
colocate_with = 'CUSTOMER_ADDRESS'
);
CATALOG_RETURNS
CREATE TABLE CATALOG_RETURNS
(
CR_RETURNED_DATE_SK INT ,
CR_RETURNED_TIME_SK INT ,
CR_ITEM_SK INT NOT NULL,
CR_REFUNDED_CUSTOMER_SK INT ,
CR_REFUNDED_CDEMO_SK INT ,
CR_REFUNDED_HDEMO_SK INT ,
CR_REFUNDED_ADDR_SK INT ,
CR_RETURNING_CUSTOMER_SK INT ,
CR_RETURNING_CDEMO_SK INT ,
CR_RETURNING_HDEMO_SK INT ,
CR_RETURNING_ADDR_SK INT ,
CR_CALL_CENTER_SK INT ,
CR_CATALOG_PAGE_SK INT ,
CR_SHIP_MODE_SK INT ,
CR_WAREHOUSE_SK INT ,
CR_REASON_SK INT ,
CR_ORDER_NUMBER INT NOT NULL,
CR_RETURN_QUANTITY INTEGER ,
CR_RETURN_AMOUNT DECIMAL(7,2) ,
CR_RETURN_TAX DECIMAL(7,2) ,
CR_RETURN_AMT_INC_TAX DECIMAL(7,2) ,
CR_FEE DECIMAL(7,2) ,
CR_RETURN_SHIP_COST DECIMAL(7,2) ,
CR_REFUNDED_CASH DECIMAL(7,2) ,
CR_REVERSED_CHARGE DECIMAL(7,2) ,
CR_STORE_CREDIT DECIMAL(7,2) ,
CR_NET_LOSS DECIMAL(7,2)
)
WITH (
distribution_key = 'CR_ITEM_SK',
clustering_key = 'cr_returned_date_sk,CR_ITEM_SK,CR_ORDER_NUMBER',
colocate_with = 'CUSTOMER_ADDRESS'
);
WEB_RETURNS
CREATE TABLE WEB_RETURNS
(
WR_RETURNED_DATE_SK INT ,
WR_RETURNED_TIME_SK INT ,
WR_ITEM_SK INT NOT NULL,
WR_REFUNDED_CUSTOMER_SK INT ,
WR_REFUNDED_CDEMO_SK INT ,
WR_REFUNDED_HDEMO_SK INT ,
WR_REFUNDED_ADDR_SK INT ,
WR_RETURNING_CUSTOMER_SK INT ,
WR_RETURNING_CDEMO_SK INT ,
WR_RETURNING_HDEMO_SK INT ,
WR_RETURNING_ADDR_SK INT ,
WR_WEB_PAGE_SK INT ,
WR_REASON_SK INT ,
WR_ORDER_NUMBER INT NOT NULL,
WR_RETURN_QUANTITY INTEGER ,
WR_RETURN_AMT DECIMAL(7,2) ,
WR_RETURN_TAX DECIMAL(7,2) ,
WR_RETURN_AMT_INC_TAX DECIMAL(7,2) ,
WR_FEE DECIMAL(7,2) ,
WR_RETURN_SHIP_COST DECIMAL(7,2) ,
WR_REFUNDED_CASH DECIMAL(7,2) ,
WR_REVERSED_CHARGE DECIMAL(7,2) ,
WR_ACCOUNT_CREDIT DECIMAL(7,2) ,
WR_NET_LOSS DECIMAL(7,2)
)
WITH (
distribution_key = 'WR_ITEM_SK',
clustering_key = 'wr_returned_date_sk, WR_ITEM_SK,WR_ORDER_NUMBER',
colocate_with = 'CUSTOMER_ADDRESS'
);
WEB_SALES
CREATE TABLE WEB_SALES
(
WS_SOLD_DATE_SK INT ,
WS_SOLD_TIME_SK INT ,
WS_SHIP_DATE_SK INT ,
WS_ITEM_SK INT NOT NULL,
WS_BILL_CUSTOMER_SK INT ,
WS_BILL_CDEMO_SK INT ,
WS_BILL_HDEMO_SK INT ,
WS_BILL_ADDR_SK INT ,
WS_SHIP_CUSTOMER_SK INT ,
WS_SHIP_CDEMO_SK INT ,
WS_SHIP_HDEMO_SK INT ,
WS_SHIP_ADDR_SK INT ,
WS_WEB_PAGE_SK INT ,
WS_WEB_SITE_SK INT ,
WS_SHIP_MODE_SK INT ,
WS_WAREHOUSE_SK INT ,
WS_PROMO_SK INT ,
WS_ORDER_NUMBER INT NOT NULL,
WS_QUANTITY INTEGER ,
WS_WHOLESALE_COST DECIMAL(7,2) ,
WS_LIST_PRICE DECIMAL(7,2) ,
WS_SALES_PRICE DECIMAL(7,2) ,
WS_EXT_DISCOUNT_AMT DECIMAL(7,2) ,
WS_EXT_SALES_PRICE DECIMAL(7,2) ,
WS_EXT_WHOLESALE_COST DECIMAL(7,2) ,
WS_EXT_LIST_PRICE DECIMAL(7,2) ,
WS_EXT_TAX DECIMAL(7,2) ,
WS_COUPON_AMT DECIMAL(7,2) ,
WS_EXT_SHIP_COST DECIMAL(7,2) ,
WS_NET_PAID DECIMAL(7,2) ,
WS_NET_PAID_INC_TAX DECIMAL(7,2) ,
WS_NET_PAID_INC_SHIP DECIMAL(7,2) ,
WS_NET_PAID_INC_SHIP_TAX DECIMAL(7,2) ,
WS_NET_PROFIT DECIMAL(7,2)
)
WITH (
distribution_key = 'WS_ITEM_SK',
clustering_key = 'ws_sold_date_sk,WS_ITEM_SK,WS_ORDER_NUMBER',
colocate_with = 'CUSTOMER_ADDRESS'
);
CATALOG_SALES
CREATE TABLE CATALOG_SALES
(
CS_SOLD_DATE_SK INT ,
CS_SOLD_TIME_SK INT ,
CS_SHIP_DATE_SK INT ,
CS_BILL_CUSTOMER_SK INT ,
CS_BILL_CDEMO_SK INT ,
CS_BILL_HDEMO_SK INT ,
CS_BILL_ADDR_SK INT ,
CS_SHIP_CUSTOMER_SK INT ,
CS_SHIP_CDEMO_SK INT ,
CS_SHIP_HDEMO_SK INT ,
CS_SHIP_ADDR_SK INT ,
CS_CALL_CENTER_SK INT ,
CS_CATALOG_PAGE_SK INT ,
CS_SHIP_MODE_SK INT ,
CS_WAREHOUSE_SK INT ,
CS_ITEM_SK INT NOT NULL,
CS_PROMO_SK INT ,
CS_ORDER_NUMBER INT NOT NULL,
CS_QUANTITY INTEGER ,
CS_WHOLESALE_COST DECIMAL(7,2) ,
CS_LIST_PRICE DECIMAL(7,2) ,
CS_SALES_PRICE DECIMAL(7,2) ,
CS_EXT_DISCOUNT_AMT DECIMAL(7,2) ,
CS_EXT_SALES_PRICE DECIMAL(7,2) ,
CS_EXT_WHOLESALE_COST DECIMAL(7,2) ,
CS_EXT_LIST_PRICE DECIMAL(7,2) ,
CS_EXT_TAX DECIMAL(7,2) ,
CS_COUPON_AMT DECIMAL(7,2) ,
CS_EXT_SHIP_COST DECIMAL(7,2) ,
CS_NET_PAID DECIMAL(7,2) ,
CS_NET_PAID_INC_TAX DECIMAL(7,2) ,
CS_NET_PAID_INC_SHIP DECIMAL(7,2) ,
CS_NET_PAID_INC_SHIP_TAX DECIMAL(7,2) ,
CS_NET_PROFIT DECIMAL(7,2)
)
WITH (
distribution_key = 'CS_ITEM_SK',
clustering_key = 'cs_sold_date_sk,CS_ITEM_SK,CS_ORDER_NUMBER',
colocate_with = 'CUSTOMER_ADDRESS'
);
STORE_SALES
CREATE TABLE STORE_SALES
(
SS_SOLD_DATE_SK INT ,
SS_SOLD_TIME_SK INT ,
SS_ITEM_SK INT NOT NULL,
SS_CUSTOMER_SK INT ,
SS_CDEMO_SK INT ,
SS_HDEMO_SK INT ,
SS_ADDR_SK INT ,
SS_STORE_SK INT ,
SS_PROMO_SK INT ,
SS_TICKET_NUMBER INT NOT NULL,
SS_QUANTITY INTEGER ,
SS_WHOLESALE_COST DECIMAL(7,2) ,
SS_LIST_PRICE DECIMAL(7,2) ,
SS_SALES_PRICE DECIMAL(7,2) ,
SS_EXT_DISCOUNT_AMT DECIMAL(7,2) ,
SS_EXT_SALES_PRICE DECIMAL(7,2) ,
SS_EXT_WHOLESALE_COST DECIMAL(7,2) ,
SS_EXT_LIST_PRICE DECIMAL(7,2) ,
SS_EXT_TAX DECIMAL(7,2) ,
SS_COUPON_AMT DECIMAL(7,2) ,
SS_NET_PAID DECIMAL(7,2) ,
SS_NET_PAID_INC_TAX DECIMAL(7,2) ,
SS_NET_PROFIT DECIMAL(7,2)
)
WITH (
distribution_key = 'SS_ITEM_SK',
clustering_key = 'ss_sold_date_sk, SS_ITEM_SK,SS_TICKET_NUMBER',
colocate_with = 'CUSTOMER_ADDRESS'
);
数据规模
数据量通过scale_factor参数控制,1 SF对应1 GB原始数据。默认值为1000,即生成1 TB原始数据。
以上数据量仅针对原始数据,不包括索引等空间占用,准备环境时需预留更多空间。
测试环境准备
Hologres实例要求
|
参数 |
要求 |
说明 |
|
版本 |
Hologres V4.2及以上 |
低版本不支持部分语法和功能。 |
|
计算资源 |
64 core及以上 |
推荐 96CU。 |
|
付费模式 |
独享(按量付费)实例 |
确保资源独占,测试结果稳定。 |
ECS实例要求
|
参数 |
规格 |
说明 |
|
实例规格 |
ecs.g6.4xlarge或以上 |
确保数据生成和导入性能。 |
|
镜像 |
Alibaba Cloud Linux 3.2104 LTS 64位 |
兼容性最佳。 |
|
数据盘 |
ESSD云盘 |
容量需大于 |
网络要求
-
ECS与Hologres实例需在同一VPC网络。
-
确保ECS可以通过VPC域名访问Hologres实例。
下载并配置测试工具
步骤1:登录ECS实例
通过SSH登录ECS实例,详情请参见使用OpenSSH/Xshell远程连接Linux实例。
步骤2:安装PSQL客户端
执行以下命令安装PostgreSQL客户端工具:
sudo yum install postgresql
步骤3:下载并解压测试工具包
下载benchmark_online_tpcds.tar.gz测试工具包,上传至ECS后,执行以下命令解压并进入工作目录:
tar xvf benchmark_online_tpcds.tar.gz
cd holo_tpcds
步骤4:配置Hologres连接信息
编辑holo.conf文件,填写Hologres实例的连接信息:
vim holo.conf
配置参数说明如下:
|
参数 |
说明 |
示例 |
|
host |
Hologres实例的VPC网络域名(不含端口)。 |
hgxxx-xx-xxx-vpc-st.hologres.aliyuncs.com |
|
port |
Hologres实例的VPC网络端口。 |
80 |
|
user |
账号的AccessKey ID。 |
LTAI5txxx |
|
password |
账号的AccessKey Secret。 |
- |
|
db |
数据库名(留空则自动为tpcds_{scale_factor}g)。 |
- |
|
scale_factor |
数据集比例因子,控制生成数据量大小,默认1000,单位GB。 |
1000 |
执行测试
方式一:一键全流程执行(首次测试推荐)
包含生成数据、建表、导入数据、执行查询的完整流程:
nohup ./tpcds_benchmark/run.sh -f > test.log 2>&1 &
方式二:仅执行查询测试(数据已导入时使用)
当数据已导入Hologres,仅需重新执行查询测试时使用:
nohup ./tpcds_benchmark/run.sh > test.log 2>&1 &
全流程测试步骤说明
|
步骤 |
说明 |
对应脚本 |
|
1. 生成数据 |
使用dsdgen工具生成TPC-DS标准数据集。 |
bin/gen_data.sh |
|
2. 建表 |
在Hologres中创建24张表及表属性。 |
bin/create_tables.sh |
|
3. 导入数据 |
将生成的数据并行COPY到Hologres。 |
bin/load_data.sh |
|
4. 执行查询 |
执行99条TPC-DS查询并记录耗时。 |
bin/run_queries.sh |
查询执行模式说明
-
采用Hot Run模式(长连接),避免连接建立开销。
-
每条查询执行4次:第1次为预热(不计时),第2-4次计时取平均值。
-
执行前会清空缓存(freecache),确保测试公平性。
查看测试结果
测试结果概览
run.sh执行完成后会输出各步骤耗时统计:
[INFO] Step Summary:
gen_data: 1537619 ms
create_tables: 6457 ms
load_data: 4013581 ms
run_queries: 1102638 ms
all_steps: 6660303 ms
benchmark_total: 243168 ms
各指标说明如下:
|
指标 |
说明 |
|
gen_data |
数据生成耗时(毫秒)。 |
|
create_tables |
建表耗时(毫秒)。 |
|
load_data |
数据导入耗时(毫秒)。 |
|
run_queries |
查询执行总耗时(毫秒),包含预热。 |
|
all_steps |
所有步骤总耗时(毫秒)。 |
|
benchmark_total |
99条查询的有效执行总耗时(毫秒),即性能基准值。 |
查询明细结果
每条查询的平均执行时间会输出到日志中,格式如下:
q1 45.32
q2 67.89
...
q99 123.45
total 5432.10
注意事项
-
为减少可能对测试结果有影响的变量,建议每次新建实例进行测试,请勿使用升配或降配后的实例。
-
确保ECS磁盘空间充足:生成1 TB数据需要至少1.5 TB可用磁盘空间。
-
测试期间请勿在同一Hologres实例上运行其他业务负载,以免影响测试结果准确性。
-
建议使用nohup后台运行测试脚本,避免SSH断连导致测试中断。
-
如需多次测试对比,请确保每次测试使用相同的实例规格和数据规模。
已知限制
-
Hologres实例版本必须为Hologres V4.2及以上,低版本不支持本测试工具包中使用的部分语法和功能。
-
实例计算资源必须为64 core及以上,低于该规格可能导致查询超时或内存不足。
-
scale_factor默认为1000(对应1 TB数据),如需调整数据规模,请修改holo.conf中的scale_factor参数。 -
数据生成阶段依赖本地磁盘空间,生成数据量约为
scale_factor对应的GB数,请确保磁盘空间充足。 -
使用
-f参数执行时会删除并重建数据库,已有同名数据库中的数据将被清除,请谨慎操作。 -
本测试工具包中的查询结果仅供性能参考,不构成TPC-DS官方基准测试结果。