本文介绍如何将TPC-H 100 GB测试数据分别导入AnalyticDB MySQL和Presto中。
下表列出了TPC-H测试数据集中的表数据条数。
表名 | 数据条数 |
customer | 15000000 |
lineitem | 600037902 |
nation | 25 |
orders | 150000000 |
part | 20000000 |
partsupp | 80000000 |
region | 5 |
supplier | 1000000 |
AnalyticDB MySQL导入数据
使用LOAD DATA将dbgen生成的文件导入至AnalyticDB for MySQL中。
LOAD DATA LOCAL INFILE 'customer.tbl' INTO TABLE CUSTOMER FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'orders.tbl' INTO TABLE ORDERS FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'lineitem.tbl' INTO TABLE LINEITEM FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'nation.tbl' INTO TABLE NATION FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'partsupp.tbl' INTO TABLE PARTSUPP FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'part.tbl' INTO TABLE PART FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'region.tbl' INTO TABLE REGION FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'supplier.tbl' INTO TABLE SUPPLIER FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
说明您也可以通过OSS外表将测试数据导入至AnalyticDB for MySQL中,详情请参见通过外表导入至数仓版。
收集统计信息。
ANALYZE TABLE customer UPDATE HISTOGRAM; ANALYZE TABLE lineitem UPDATE HISTOGRAM; ANALYZE TABLE nation UPDATE HISTOGRAM; ANALYZE TABLE orders UPDATE HISTOGRAM; ANALYZE TABLE part UPDATE HISTOGRAM; ANALYZE TABLE partsupp UPDATE HISTOGRAM; ANALYZE TABLE region UPDATE HISTOGRAM; ANALYZE TABLE supplier UPDATE HISTOGRAM;
说明查询优化器(Query Optimizer)将查询转换为执行计划,并交给执行引擎执行。执行计划的质量会影响查询的性能。统计信息作为查询优化器的输入,可以帮助查询优化器生成高质量的执行计划。因此,在导入数据后,您需收集所有表的直方图信息,以获得最佳性能。若您想了解统计信息的更多内容,请参见统计信息。
Presto导入数据
您可以通过外表将测试数据导入至Presto中,示例如下:
CUSTOMER表
INSERT overwrite TABLE customer SELECT c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment FROM ${source_db}.customer;
LINEITEM表
INSERT overwrite TABLE lineitem SELECT l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment FROM ${source_db}.lineitem;
NATION表
INSERT overwrite TABLE nation SELECT n_nationkey, n_name, n_regionkey, n_comment FROM ${source_db}.nation;
ORDERS表
INSERT overwrite TABLE orders SELECT o_orderkey, o_custkey, o_orderstatus, o_totalprice, o_orderdate, o_orderpriority, o_clerk, o_shippriority, o_comment FROM ${source_db}.orders;
PART表
INSERT overwrite TABLE part SELECT p_partkey, p_name, p_mfgr, p_brand, p_type, p_size, p_container, p_retailprice, p_comment FROM ${source_db}.part;
PARTSUPP表
INSERT overwrite TABLE partsupp SELECT ps_partkey, ps_suppkey, ps_availqty, ps_supplycost, ps_comment FROM ${source_db}.partsupp;
REGION表
INSERT overwrite TABLE region SELECT r_regionkey, r_name, r_comment FROM ${source_db}.region;
SUPPLIER表
INSERT overwrite TABLE supplier SELECT s_suppkey, s_name, s_address, s_nationkey, s_phone, s_acctbal, s_comment FROM ${source_db}.supplier;
反馈
- 本页导读 (1)
文档反馈