Accelerate MaxCompute queries with foreign tables

更新时间:
复制 MD 格式

Hologres uses foreign tables to accelerate queries on MaxCompute data. This approach lets you directly access and analyze data in MaxCompute, improving query efficiency and simplifying the data processing workflow.

Permissions

To accelerate queries on MaxCompute data, you must grant users the required permissions to access MaxCompute projects and tables. For more information, see Manage user permissions by using commands.

Data type mapping

MaxCompute data types map one-to-one to Hologres data types. When you create a table, see Data type mapping between MaxCompute and Hologres.

Methods

Method

Use cases

Key features

Accelerate queries on MaxCompute data with CREATE FOREIGN TABLE

Ideal for accelerating a small number of tables, querying a subset of columns, or working with stable table structures.

Manually create tables with flexible column and comment definitions.

Accelerate queries on MaxCompute data with IMPORT FOREIGN SCHEMA

Map all tables at the schema or database level.

Automatically synchronize table structures for an entire schema.

Accelerate queries on MaxCompute data with Auto Load

A large number of tables, or tables with frequent structural changes (such as adding, deleting, or modifying columns).

Automatically detects source table changes and supports both on-demand and full loads.

Accelerate queries with CREATE FOREIGN TABLE

Use the CREATE FOREIGN TABLE statement to create MaxCompute foreign tables. This statement lets you customize table names, select specific columns, and define comments. This section provides examples of how to query data from non-partitioned and partitioned MaxCompute tables by using CREATE FOREIGN TABLE.

Alternatively, use HoloWeb to create a table visually. For more information, see Create a MaxCompute foreign table in HoloWeb.

Example 1: Query a non-partitioned table

  1. Create a non-partitioned table in MaxCompute and import data. This example uses the customer table from the BIGDATA_PUBLIC_DATASET.tpcds_10t public dataset in MaxCompute as sample data.

    View the DDL statement for the sample table

    -- DDL for the table in the MaxCompute public dataset
    CREATE TABLE IF NOT EXISTS public_data.customer(
      c_customer_sk BIGINT,
      c_customer_id STRING,
      c_current_cdemo_sk BIGINT,
      c_current_hdemo_sk BIGINT,
      c_current_addr_sk BIGINT,
      c_first_shipto_date_sk BIGINT,
      c_first_sales_date_sk BIGINT,
      c_salutation STRING,
      c_first_name STRING,
      c_last_name STRING,
      c_preferred_cust_flag STRING,
      c_birth_day BIGINT,
      c_birth_month BIGINT,
      c_birth_year BIGINT,
      c_birth_country STRING,
      c_login STRING,
      c_email_address STRING,
      c_last_review_date_sk STRING);

    Run the following command to view the data in the sample table.

    -- Query the table in MaxCompute to check for data.
    SET odps.namespace.schema=true;
    SELECT * FROM BIGDATA_PUBLIC_DATASET.tpcds_10t.customer;
  2. In Hologres, create a foreign table that maps to the MaxCompute source table. The following is a sample statement:

    SET hg_enable_convert_type_for_foreign_table = true;
    CREATE FOREIGN TABLE customer (
        "c_customer_sk" int8,
        "c_customer_id" text,
        "c_current_cdemo_sk" int8,
        "c_current_hdemo_sk" int8,
        "c_current_addr_sk" int8,
        "c_first_shipto_date_sk" int8,
        "c_first_sales_date_sk" int8,
        "c_salutation" text,
        "c_first_name" text,
        "c_last_name" text,
        "c_preferred_cust_flag" text,
        "c_birth_day" int8,
        "c_birth_month" int8,
        "c_birth_year" int8,
        "c_birth_country" text,
        "c_login" text,
        "c_email_address" text,
        "c_last_review_date_sk" text)
    SERVER odps_server
    OPTIONS (project_name 'BIGDATA_PUBLIC_DATASET.tpcds_10t', table_name 'customer');

    The following table describes the parameters.

    Parameter

    Description

    SERVER

    The foreign server.

    You can use odps_server, which is a foreign server pre-configured in Hologres. For more information about the principles, see Postgres FDW.

    project_name

    • If your MaxCompute project uses the three-layer model, project_name specifies the MaxCompute project name and schema name in the format odps_project_name.odps_schema_name.

    • If your MaxCompute project uses a two-layer model, project_name is the MaxCompute project name.

    For more information about the three-layer model, see Schema operations.

    table_name

    The name of the MaxCompute table that you want to query.

  3. After creating the foreign table, query it in Hologres to retrieve data from MaxCompute. The following is a sample statement:

    SELECT * FROM customer LIMIT 10;
    Important

    If a query error occurs, make sure that the account used to run the query has the required permissions, such as the Select permission, on the MaxCompute table. For more information, see Permissions.

Example 2: Query a partitioned table

  1. Prepare a partitioned table in MaxCompute and import data. This example uses the ods_enterprise_share_trade_h table from the BIGDATA_PUBLIC_DATASET.finance public dataset in MaxCompute as sample data.

    View the DDL statement for the sample table

    -- DDL for the table in the public dataset
    CREATE TABLE IF NOT EXISTS public_data.ods_enterprise_share_trade_h(
      code STRING COMMENT 'Code',
      name STRING COMMENT 'Name',
      industry STRING COMMENT 'Industry',
      area STRING COMMENT 'Area',
      pe STRING COMMENT 'PE ratio',
      outstanding STRING COMMENT 'Outstanding shares',
      totals STRING COMMENT 'Total shares (in 10,000s)',
      totalassets STRING COMMENT 'Total assets (in 10,000s)',
      liquidassets STRING COMMENT 'Liquid assets',
      fixedassets STRING COMMENT 'Fixed assets',
      reserved STRING COMMENT 'Reserve fund',
      reservedpershare STRING COMMENT 'Reserve per share',
      eps STRING COMMENT 'EPS',
      bvps STRING COMMENT 'BVPS',
      pb STRING COMMENT 'PB ratio',
      timetomarket STRING COMMENT 'Time to market',
      undp STRING COMMENT 'Undistributed profit',
      perundp STRING COMMENT 'Undistributed profit per share',
      rev STRING COMMENT 'YoY revenue (%)',
      profit STRING COMMENT 'YoY profit (%)',
      gpr STRING COMMENT 'Gross profit margin (%)',
      npr STRING COMMENT 'Net profit margin (%)',
      holders_num STRING COMMENT 'Number of shareholders') 
      PARTITIONED BY (ds STRING) STORED AS ALIORC TBLPROPERTIES ('comment'='Data import date');

    Run the following command to view the data in the sample table.

    -- Query data from a specific partition in MaxCompute.
    SET odps.namespace.schema=true;
    SELECT * FROM BIGDATA_PUBLIC_DATASET.finance.ods_enterprise_share_trade_h WHERE ds = '20170113';
  2. In Hologres, create a foreign table that maps to the partitioned MaxCompute table. The following is a sample statement:

    CREATE FOREIGN TABLE public.foreign_ods_enterprise_share_trade_h (
        "code" text,
        "name" text,
        "industry" text,
        "area" text,
        "pe" text,
        "outstanding" text,
        "totals" text,
        "totalassets" text,
        "liquidassets" text,
        "fixedassets" text,
        "reserved" text,
        "reservedpershare" text,
        "eps" text,
        "bvps" text,
        "pb" text,
        "timetomarket" text,
        "undp" text,
        "perundp" text,
        "rev" text,
        "profit" text,
        "gpr" text,
        "npr" text,
        "holders_num" text,
        "ds" text
    )
    SERVER odps_server
    OPTIONS (project_name 'BIGDATA_PUBLIC_DATASET#finance', table_name 'ods_enterprise_share_trade_h');
    comment on foreign table public.foreign_ods_enterprise_share_trade_h is 'Historical stock trading information';
    comment on column public.foreign_ods_enterprise_share_trade_h."code" is 'Stock symbol';
    comment on column public.foreign_ods_enterprise_share_trade_h."name" is 'Name';
    comment on column public.foreign_ods_enterprise_share_trade_h."industry" is 'Industry';
    comment on column public.foreign_ods_enterprise_share_trade_h."area" is 'Region';
    comment on column public.foreign_ods_enterprise_share_trade_h."pe" is 'Price-to-earnings ratio';
    comment on column public.foreign_ods_enterprise_share_trade_h."outstanding" is 'Outstanding shares';
    comment on column public.foreign_ods_enterprise_share_trade_h."totals" is 'Total shares (in 10,000s)';
    comment on column public.foreign_ods_enterprise_share_trade_h."totalassets" is 'Total assets (in 10,000s)';
    comment on column public.foreign_ods_enterprise_share_trade_h."liquidassets" is 'Liquid assets';
    comment on column public.foreign_ods_enterprise_share_trade_h."fixedassets" is 'Fixed assets';
    comment on column public.foreign_ods_enterprise_share_trade_h."reserved" is 'Reserve fund';
    comment on column public.foreign_ods_enterprise_share_trade_h."reservedpershare" is 'Reserve per share';
    comment on column public.foreign_ods_enterprise_share_trade_h."eps" is 'Earnings per share';
    comment on column public.foreign_ods_enterprise_share_trade_h."bvps" is 'Book value per share';
    comment on column public.foreign_ods_enterprise_share_trade_h."pb" is 'Price-to-book ratio';
    comment on column public.foreign_ods_enterprise_share_trade_h."timetomarket" is 'Time to market';
    comment on column public.foreign_ods_enterprise_share_trade_h."undp" is 'Undistributed profit';
    comment on column public.foreign_ods_enterprise_share_trade_h."perundp" is 'Undistributed profit per share';
    comment on column public.foreign_ods_enterprise_share_trade_h."rev" is 'YoY revenue (%)';
    comment on column public.foreign_ods_enterprise_share_trade_h."profit" is 'YoY profit (%)';
    comment on column public.foreign_ods_enterprise_share_trade_h."gpr" is 'Gross profit margin (%)';
    comment on column public.foreign_ods_enterprise_share_trade_h."npr" is 'Net profit margin (%)';
    comment on column public.foreign_ods_enterprise_share_trade_h."holders_num" is 'Number of shareholders';
  3. Query the partitioned MaxCompute table from Hologres.

    • Query the first 10 rows of data. The following is a sample SQL statement:

      SELECT * FROM foreign_ods_enterprise_share_trade_h limit 10;
    • Query data in a specific partition. The following is a sample SQL statement:

      SELECT * FROM foreign_ods_enterprise_share_trade_h 
      WHERE ds = '20170113';
    Important

    If a query error occurs, make sure that the account used to run the query has the required permissions, such as the Select permission, on the MaxCompute table. For more information, see Permissions.

Accelerate queries with IMPORT FOREIGN SCHEMA

To create multiple MaxCompute foreign tables, use the IMPORT FOREIGN SCHEMA statement. For more information, see IMPORT FOREIGN SCHEMA.

Accelerate queries with Auto Load

If you need to accelerate a large number of foreign tables, or if the schemas of the source tables in MaxCompute change frequently (for example, columns are deleted, reordered, or their data types are modified), you can use the Auto Load feature. This feature automatically loads data from MaxCompute with on-demand or full loads. It eliminates the need to manually alter the foreign table structure, thereby improving query efficiency. For more information, see Auto Load for foreign tables.