本文介绍如何将TPC-H 100GB测试数据分别导入AnalyticDB MySQL、Presto、Spark中。
下表列出了TPC-H测试数据集中的表数据条数。
表名 | 数据条数 |
---|---|
customer | 15000000 |
lineitem | 600037902 |
nation | 25 |
orders | 150000000 |
part | 20000000 |
partsupp | 80000000 |
region | 5 |
supplier | 1000000 |
AnalyticDB MySQL
- 在AnalyticDB MySQL中,使用LOAD DATA导入dbgen生成的文件。
LOAD DATA LOCAL INFILE 'customer.tbl' INTO TABLE CUSTOMER FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'orders.tbl' INTO TABLE ORDERS FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'lineitem.tbl' INTO TABLE LINEITEM FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'nation.tbl' INTO TABLE NATION FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'partsupp.tbl' INTO TABLE PARTSUPP FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL NFILE 'part.tbl' INTO TABLE PART FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'region.tbl' INTO TABLE REGION FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'supplier.tbl' INTO TABLE SUPPLIER FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
- 在AnalyticDB MySQL中,还可以通过OSS外表方式导入测试数据,请参见通过外表导入OSS数据。
Presto/Spark
在Presto、Spark中,可以通过外表方式导入测试数据。
- CUSTOMER表
insert overwrite table customer select c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment from ${source_db}.customer;
- LINEITEM表
insert overwrite table lineitem select l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment from ${source_db}.lineitem;
- NATION表
insert overwrite table nation select n_nationkey, n_name, n_regionkey, n_comment from ${source_db}.nation;
- ORDERS表
insert overwrite table orders select o_orderkey, o_custkey, o_orderstatus, o_totalprice, o_orderdate, o_orderpriority, o_clerk, o_shippriority, o_comment from ${source_db}.orders;
- PART表
insert overwrite table part select p_partkey, p_name, p_mfgr, p_brand, p_type, p_size, p_container, p_retailprice, p_comment from ${source_db}.part;
- PARTSUPP表
insert overwrite table partsupp select ps_partkey, ps_suppkey, ps_availqty, ps_supplycost, ps_comment from ${source_db}.partsupp;
- REGION表
insert overwrite table region select r_regionkey, r_name, r_comment from ${source_db}.region;
- SUPPLIER表
insert overwrite table supplier select s_suppkey, s_name, s_address, s_nationkey, s_phone, s_acctbal, s_comment from ${source_db}.supplier;
在文档使用中是否遇到以下问题
更多建议
匿名提交