文档

导入数据

更新时间:

本文介绍如何将TPC-H 100 GB测试数据分别导入AnalyticDB MySQL和Presto中。

下表列出了TPC-H测试数据集中的表数据条数。

表名

数据条数

customer

15000000

lineitem

600037902

nation

25

orders

150000000

part

20000000

partsupp

80000000

region

5

supplier

1000000

AnalyticDB MySQL导入数据

  1. 使用LOAD DATA将dbgen生成的文件导入至AnalyticDB MySQL中。

    LOAD DATA LOCAL INFILE 'customer.tbl' INTO TABLE CUSTOMER 
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'orders.tbl'   INTO TABLE ORDERS   
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'lineitem.tbl' INTO TABLE LINEITEM 
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'nation.tbl'   INTO TABLE NATION   
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'partsupp.tbl' INTO TABLE PARTSUPP 
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL NFILE 'part.tbl'     INTO TABLE PART     
    FIELDS TERMINATED BY '|'   LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'region.tbl'   INTO TABLE REGION   
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'supplier.tbl' INTO TABLE SUPPLIER 
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    说明

    您也可以通过OSS外表将测试数据导入至AnalyticDB MySQL中,详情请参见通过外表导入至数仓版

  2. 收集统计信息。

    ANALYZE TABLE customer UPDATE HISTOGRAM;
    ANALYZE TABLE lineitem UPDATE HISTOGRAM;
    ANALYZE TABLE nation UPDATE HISTOGRAM;
    ANALYZE TABLE orders UPDATE HISTOGRAM;
    ANALYZE TABLE part UPDATE HISTOGRAM;
    ANALYZE TABLE partsupp UPDATE HISTOGRAM;
    ANALYZE TABLE region UPDATE HISTOGRAM;
    ANALYZE TABLE supplier UPDATE HISTOGRAM;
    说明

    查询优化器(Query Optimizer)将查询转换为执行计划,并交给执行引擎执行。执行计划的质量会影响查询的性能。统计信息作为查询优化器的输入,可以帮助查询优化器生成高质量的执行计划。因此,在导入数据后,您需收集所有表的直方图信息,以获得最佳性能。若您想了解统计信息的更多内容,请参见统计信息

Presto导入数据

您可以通过外表将测试数据导入至Presto中,示例如下:

  • CUSTOMER表

    INSERT overwrite TABLE customer
    SELECT
        c_custkey,
            c_name,
            c_address,
            c_nationkey,
            c_phone,
            c_acctbal,
            c_mktsegment,
            c_comment
    FROM ${source_db}.customer;
  • LINEITEM表

    INSERT overwrite TABLE lineitem
    SELECT
        l_orderkey,
        l_partkey,
        l_suppkey,
        l_linenumber,
        l_quantity,
        l_extendedprice,
        l_discount,
        l_tax,
        l_returnflag,
        l_linestatus,
        l_shipdate,
        l_commitdate,
        l_receiptdate,
        l_shipinstruct,
        l_shipmode,
        l_comment
    FROM ${source_db}.lineitem;
  • NATION表

    INSERT overwrite TABLE nation
    SELECT
            n_nationkey,
            n_name,
            n_regionkey,
            n_comment
    FROM ${source_db}.nation;
  • ORDERS表

    INSERT overwrite TABLE orders
    SELECT
            o_orderkey,
            o_custkey,
            o_orderstatus,
            o_totalprice,
            o_orderdate,
            o_orderpriority,
            o_clerk,
            o_shippriority,
            o_comment
    FROM ${source_db}.orders;
  • PART表

    INSERT overwrite TABLE part
    SELECT
            p_partkey,
            p_name,
            p_mfgr,
            p_brand,
            p_type,
            p_size,
            p_container,
            p_retailprice,
            p_comment
    FROM ${source_db}.part;
  • PARTSUPP表

    INSERT overwrite TABLE partsupp
    SELECT
            ps_partkey,
            ps_suppkey,
            ps_availqty,
            ps_supplycost,
            ps_comment
    FROM ${source_db}.partsupp;
  • REGION表

    INSERT overwrite TABLE region
    SELECT
            r_regionkey,
            r_name,
            r_comment
    FROM ${source_db}.region;
  • SUPPLIER表

    INSERT overwrite TABLE supplier
    SELECT
        s_suppkey,
        s_name,
        s_address,
        s_nationkey,
        s_phone,
        s_acctbal,
        s_comment
    FROM ${source_db}.supplier;

  • 本页导读 (1)
文档反馈