Hologres external tables

更新时间:
复制 MD 格式

This topic explains how to create a Hologres external table by specifying the following details in a CREATE TABLE statement: the Hologres data source, an authorization method (either STS authentication or a service-linked role), the target table, and JDBC driver information.

Overview

Hologres is a real-time interactive analytical data warehouse that is compatible with the PostgreSQL protocol and seamlessly integrates with MaxCompute. You can create Hologres external tables in MaxCompute to query a Hologres data source using the PostgreSQL JDBC driver and STS authentication. This method eliminates redundant storage, requires no data imports or exports, and delivers fast query results.

Limitations

  • Data manipulation (DML)

    • MaxCompute does not support UPDATE or DELETE operations on Hologres external tables.

    • Hologres external tables do not support the INSERT OVERWRITE statement.

      To overwrite data in a Hologres table with data from MaxCompute, run an INSERT OVERWRITE statement in Hologres. This statement reads data from an external table in Hologres that maps to a MaxCompute table. For more information, see INSERT OVERWRITE.

  • Data write risk

    When you write large volumes of data to a Hologres external table, parallel write processes are used. In rare cases, one of these processes might rewrite data, causing data duplication.

  • Partitions

    Partitioned tables in Hologres and MaxCompute do not map directly to each other. Hologres external tables do not support partitions. However, in direct-read mode, when you query a Hologres external table that maps to a Hologres partitioned parent table, partition pruning applies if the query filter includes a condition on the partition column of an underlying partitioned child table.

  • Hologres external tables do not support the cluster attribute.

  • Hologres external tables do not support mapping to a Hologres Dynamic Table.

  • See the list of Supported data types.

Create a Hologres external table

You can create a Hologres external table using either the RAM role authorization mode (STS mode) or the service-linked role authorization mode (double-signature mode).

Considerations

Guidelines for creating external tables

  • If your Hologres instance uses an IP whitelist for external access, upgrade it to a supported version:

    • For the RAM role authorization mode, upgrade to Hologres V3.2 or later.

    • For the service-linked role authorization mode, upgrade to Hologres V2.0.28 or later.

    After the upgrade, Hologres automatically trusts MaxCompute node IPs based on the MaxCompute service identity or the service-linked role. You no longer need to add the IP addresses of MaxCompute elastic nodes to the IP whitelist of your Hologres instance.

  • You can map both parent tables and child tables in Hologres to an external table. However, you can only read from, not write to, an external table that maps to a parent table.

  • When you write data to a Hologres external table, the INSERT ON CONFLICT (UPSERT) mechanism is not supported. If the Hologres source table has a primary key, avoid writing data that causes primary key conflicts.

  • Table names and column names are case-insensitive. You cannot enforce case sensitivity when you create or query tables and columns.

Schema compatibility

If the Hologres source table schema does not match the external table schema, note the following points:

  • Mismatched column count: If the source table has fewer columns than the external table, reading data fails with an error such as column "xxx" does not exist. If the source table has more columns than the external table, the extra columns are ignored.

  • Mismatched column types: MaxCompute cannot map a STRING column from a Hologres source table to an INT column in the external table. Mapping an INT column to a STRING column is supported but not recommended.

Syntax

When you create an external table, you must specify the StorageHandler in the DDL statement, and configure the JDBC connection string and STS authentication information (or enable the service-linked role authorization mode) to access the Hologres data source.

RAM role mode

For a complete code sample, see Create a Hologres external table (RAM role authorization mode).

CREATE EXTERNAL TABLE [IF NOT EXISTS] <table_name>(
  <col1_name> <data_type>,
  <col2_name> <data_type>,
  ......
)
stored BY 'com.aliyun.odps.jdbc.JdbcStorageHandler'
WITH serdeproperties (
  'odps.properties.rolearn'='<ram_arn>')
location '<jdbc:postgresql://<endpoint>:<port>/<database>?ApplicationName=MaxCompute&[currentSchema=<schema>&][useSSL={true|false}&]table=<holo_table_name>/>' 
tblproperties (
  'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver', 
  'odps.federation.jdbc.target.db.type'='holo',
  'odps.federation.jdbc.colmapping'='<table_column1>:<source_column1>, <table_column2>:<source_column2>,...'
);

Service-linked role mode

For a complete code sample, see Create a Hologres external table (service-linked role authorization mode).

The service-linked role authorization mode allows the same RAM user to access authorized tables in both MaxCompute and Hologres without requiring manual authorization. This mode also supports the Hologres IP whitelist feature to simplify usage.

-- Create an external table
CREATE EXTERNAL TABLE [IF NOT EXISTS]  <table_name>(
  <col1_name> <data_type>,
  <col2_name> <data_type>,
  ......
)
STORED BY 'com.aliyun.odps.jdbc.JdbcStorageHandler'
LOCATION '<jdbc:postgresql://<endpoint>:<port>/<database>?ApplicationName=MaxCompute&[currentSchema=<schema>&][useSSL={true|false}&]table=<holo_table_name>/>' 
tblproperties (
  'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver', 
  'odps.federation.jdbc.target.db.type'='holo',
  ['odps.federation.jdbc.colmapping'='<table_column1>:<source_column1>, <table_column2>:<source_column2>,...']
);

Parameters

Click to view parameter details

Parameter

Required

Description

IF NOT EXISTS

No

  • If you do not specify the IF NOT EXISTS option and a table with the same name exists, an error is reported.

  • If you specify IF NOT EXISTS, the statement succeeds regardless of whether a table with the same name exists. The existing table's metadata is not modified, even if its schema differs from the one you are creating.

table_name

Yes

The name of the Hologres external table to create in MaxCompute.

col_name

Yes

The name of a column in the Hologres external table.

data_type

Yes

The data type of a column in the Hologres external table.

STORED BY

Yes

Specifies the StorageHandler that defines how to query the Hologres external table.

Set the value to com.aliyun.odps.jdbc.JdbcStorageHandler to use the JDBC connection method.

ram_arn

Yes

This parameter is required when you create a Hologres external table in STS mode. It specifies the ARN of the RAM role for STS authentication. For more information about how to create a RAM role and grant permissions, see Hologres RAM role authorization.

  1. 登录RAM控制台

  2. 在左侧导航栏选择Identities > Roles

  3. Roles页面,单击目标Role Name,进入该角色详情页。

  4. Basic Information区域,可以获取ARN信息。

No

This parameter is not required in double-signature mode.

LOCATION

Yes

The JDBC connection string for the Hologres instance. It contains the following parameters:

  • endpoint: Required. The classic network domain name of the Hologres instance.

    Important

    Currently, you can connect to Hologres only over the classic network. Connections over a VPC are not supported.

  • port: Required. The network port of the Hologres instance.

    1. 登录Hologres管理控制台,在左上角选择地域。

    2. 在左侧导航栏选择Instances

      Instances页面,单击目标实例名称。

    3. Network Information区域获取Hologres实例的经典网络域名和网络端口。

  • database: Required. The name of the target Hologres database. For more information, see CREATE DATABASE.

  • ApplicationName: Required. The default value is MaxCompute. You do not need to change this value.

  • schema: Optional. You can omit this parameter if the table name is unique within the database or if the source table is in the default schema. For more information, see CREATE SCHEMA.

  • holo_table_name: Required. The name of the Hologres source table. For more information, see CREATE TABLE.

tblproperties

Yes

  • mcfed.mapreduce.jdbc.driver.class: Required.

    Specifies the driver for connecting to the Hologres database. The value must be org.postgresql.Driver.

  • odps.federation.jdbc.target.db.type: Required.

    Specifies the database type. The value must be holo.

  • odps.federation.jdbc.colmapping: Optional.

    Specifies the mapping between the columns of the external table and the columns of the Hologres source table.

    • If this parameter is not configured, columns are mapped by name between the source and external tables.

    • If you configure this parameter for only some columns, the specified columns are mapped as defined, while the remaining columns are mapped by name. An error occurs if the names or data types of these automatically mapped columns do not match.

    • If a Hologres column name contains uppercase letters, enclose the column name in double quotation marks (""). Format: MaxCompute_column_1:"Hologres_column_1"[,MaxCompute_column_2:"Hologres_column_2",...].

      Note

      Assume that the Hologres source table contains the columns c bool, map_B string, a bigint, and the external table contains the columns a bigint, x string, c bool.

      If the colmapping configuration is set to 'x: "map_B"', you can successfully map and query Hologres data.

  • mcfed.mapreduce.jdbc.input.query: Optional.

    Reads data from a Hologres data source table. The columns, column names, and data types of the external table must match those of the source table, or the alias if one is used. The format of select_sentence is SELECT xxx FROM <holo_database_name>.<holo_schema_name>.<holo_table_name>.

Examples

Step 1: Prepare Hologres data

You can skip this step if you have a Hologres database, table, and test data.

Create a Hologres database

  1. 登录Hologres管理控制台,在左上角选择地域。

  2. 在左侧导航栏选择Instances

  3. If you do not have an instance, you must first purchase a Hologres instance.

    Instances页面,单击目标实例名称。

  4. 在实例详情页面,单击Connect to Instance

  5. 单击上方Metadata Management页签。

    单击Create Database,在弹出的对话框中,填写Database Name,其他参数保持默认。

Create a Hologres table

  1. 在实例详情页面,单击Connect to Instance

  2. 单击上方SQL Editor页签。

  3. Run the following statements to create a table and insert data:

    CREATE TABLE IF NOT EXISTS holo (
        id   INT PRIMARY KEY,
        name TEXT
    );
    INSERT INTO holo (id, name) VALUES
        (1, 'kate'),
        (2, 'mary'),
        (3, 'bob'),
        (4, 'tom'),
        (5, 'lulu'),
        (6, 'mark'),
        (7, 'haward'),
        (8, 'lilei'),
        (9, 'hanmeimei'),
        (10, 'lily'),
        (11, 'lucy');
    SELECT * FROM holo ORDER BY id;

Step 2: Create a Hologres external table

RAM role

In this mode, the creator of the external table embeds a RAM role with Hologres data access permissions in the external table definition. You can then grant permissions on this external table to other MaxCompute users as needed. When a user accesses the external table, the system uses the bound RAM role to access data in the mapped Hologres table.

Hologres does not support adding a cross-account RAM role to a database instance. Therefore, Hologres in STS mode supports only RAM roles from the same account. Similarly, when MaxCompute accesses Hologres using an external table or external schema, only RAM roles from the same account are supported.

Prerequisites

  1. You have installed and configured the MaxCompute client.

  2. You have a target MaxCompute project where you want to create the Hologres external table.

    For more information, see Create MaxCompute project.

  3. You have created and authorized a RAM role. For more information, see Authorize a RAM Role for Hologres.

Create the external table in MaxCompute

  1. Log on to the MaxCompute client and switch to your target MaxCompute project.

  2. Run the following statement to create the Hologres external table.

    The following example uses the data prepared in Step 1: Prepare Hologres data:

    • Hologres database name: holo_external_test.

    • Hologres database schema: public.

    • Hologres table name: holo.

    • Hologres classic network endpoint: hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80.

    CREATE EXTERNAL TABLE IF NOT EXISTS my_table_holo_jdbc
    (
     id bigint,
     name string
    )
    stored BY 'com.aliyun.odps.jdbc.JdbcStorageHandler' 
    WITH serdeproperties (
      'odps.properties.rolearn'='acs:ram::139699392458****:role/<role name>')
    location 'jdbc:postgresql://hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80/<holo database name>?ApplicationName=MaxCompute&currentSchema=public&useSSL=true&table=<table name>/'
    tblproperties (
      'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
      'odps.federation.jdbc.target.db.type'='holo',
      'odps.federation.jdbc.colmapping'='id:id,name:name'
    );
  3. Run the following commands to query the Hologres source table by using the newly created external table.

    -- The following properties are required to access Hologres external tables.
    SET odps.sql.split.hive.bridge=true;
    SET odps.sql.hive.compatible=true;
    SET odps.table.api.enable.holo.table=true; --Enable JDBC direct read access 
    -- Query data from the Hologres external table.
    SELECT * FROM my_table_holo_jdbc limit 10;
    -- The command returns the following result.
    +------------+------------+
    | id         | name       | 
    +------------+------------+
    | 9          | hanmeimei  | 
    | 4          | tom        | 
    | 7          | haward     | 
    | 2          | mary       | 
    | 5          | lulu       | 
    | 8          | lilei      | 
    | 10         | lily       | 
    | 1          | kate       | 
    | 6          | mark       | 
    | 11         | lucy       | 
    +------------+------------+
  4. Use the Hologres external table to exchange data with Hologres and perform federated analysis.

    • Write data processed in MaxCompute to Hologres using the external table for accelerated analytics and online services.

      -- The following properties are required to access Hologres external tables.
      SET odps.sql.split.hive.bridge=true;
      SET odps.sql.hive.compatible=true;
      SET odps.table.api.enable.holo.table=true; --Enable JDBC direct read access 
      -- Insert data into the Hologres external table.
      INSERT INTO my_table_holo_jdbc VALUES (12,'alice');
      -- Query data from the Hologres external table.
      SELECT * FROM my_table_holo_jdbc;
      -- The command returns the following result.
      +------------+------------+
      | id         | name       | 
      +------------+------------+
      | 9          | hanmeimei  | 
      | 4          | tom        | 
      | 7          | haward     | 
      | 2          | mary       | 
      | 5          | lulu       | 
      | 12         | alice      | 
      | 8          | lilei      | 
      | 10         | lily       | 
      | 1          | kate       | 
      | 11         | lucy       | 
      | 6          | mark       | 
      | 3          | bob        | 
      +------------+------------+
    • Store frequently updated dimension tables in Hologres to support real-time updates. MaxCompute accesses the dimension table using an external table and joins it with a fact table in MaxCompute for federated analysis.

      -- The following properties are required to access Hologres external tables.
      SET odps.sql.split.hive.bridge=true;
      SET odps.sql.hive.compatible=true;
      -- Create a MaxCompute internal table.
      CREATE TABLE holo_test AS SELECT * FROM my_table_holo_jdbc;
      -- Join the MaxCompute internal table with the Hologres external table for analysis.
      SELECT * FROM my_table_holo_jdbc t1 INNER JOIN holo_test t2 ON t1.id=t2.id;
      -- The command returns the following result.
      +------------+------------+------------+------------+
      | id         | name       | id2        | name2      | 
      +------------+------------+------------+------------+
      | 9          | hanmeimei  | 9          | hanmeimei  | 
      | 4          | tom        | 4          | tom        | 
      | 7          | haward     | 7          | haward     | 
      | 2          | mary       | 2          | mary       | 
      | 5          | lulu       | 5          | lulu       | 
      | 12         | alice      | 12         | alice      | 
      | 8          | lilei      | 8          | lilei      | 
      | 10         | lily       | 10         | lily       | 
      | 1          | kate       | 1          | kate       | 
      | 11         | lucy       | 11         | lucy       | 
      | 6          | mark       | 6          | mark       | 
      | 3          | bob        | 3          | bob        | 
      +------------+------------+------------+------------+

Service-linked role

In this mode, you must create a service-linked role to access Hologres data. To prevent unauthorized access, this mode forwards the user's identity to Hologres for authentication. Therefore, the user must have permissions on both the external table and the underlying Hologres table.

Prerequisites

  1. You have installed and configured the MaxCompute client.

  2. You have a target MaxCompute project where you want to create the Hologres external table.

    For more information, see Create MaxCompute project.

  3. An account with the same name as your MaxCompute account exists in Hologres, and this account has read and write permissions on the target table.

  4. This service-linked role authorization mode is supported only in Hologres V1.3 and later. Currently, this mode supports only reading from, not writing to, Hologres external tables.

Create the external table in MaxCompute

  1. Log on to the MaxCompute client and switch to your target MaxCompute project.

  2. Run the following statement to create the Hologres external table.

    The following example uses the data prepared in Step 1: Prepare Hologres data:

    • Hologres database name: holo_external_test.

    • Hologres database schema: public.

    • Hologres table name: holo.

    • Hologres classic network endpoint: hgprecn-cn-oew210ut****-cn-hangzhou-internal.hologres.aliyuncs.com:80.

    -- Create an external table.
    CREATE EXTERNAL TABLE IF NOT EXISTS holo_mc_external_dbl
    (
      id int,
      name string
    )
    STORED BY 'com.aliyun.odps.jdbc.JdbcStorageHandler'
    location 'jdbc:postgresql://hgpostcn-cn-****-cn-hangzhou-internal.hologres.aliyuncs.com:80/<holo database name>?ApplicationName=MaxCompute&currentSchema=public&useSSL=false&table=<table name>/'
    TBLPROPERTIES (
      'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
      'odps.federation.jdbc.target.db.type'='holo',
      'odps.federation.jdbc.colmapping'='id:id,name:name'
    );
  3. Query the external table.

    SELECT * FROM holo_mc_external_dbl;
    -- The command returns the following result.
    +------------+------------+
    | id         | name       | 
    +------------+------------+
    | 9          | hanmeimei  | 
    | 4          | tom        | 
    | 7          | haward     | 
    | 2          | mary       | 
    | 5          | lulu       | 
    | 12         | alice      | 
    | 8          | lilei      | 
    | 10         | lily       | 
    | 1          | kate       | 
    | 11         | lucy       | 
    | 6          | mark       | 
    | 3          | bob        | 
    +------------+------------+

Enable direct read for Hologres external tables

By default, MaxCompute reads data from Hologres external tables through a JDBC connection. To improve performance, you can enable a direct read mode that allows MaxCompute to directly access the Hologres storage layer. This mode provides the following benefits:

  • Reduces read latency for faster query performance.

  • Minimizes the number of connections to the Hologres frontend. Most queries require only a single connection.

Limitations

When you enable the direct read mode for Hologres, the following limitations apply. If these conditions are not met, the query falls back to JDBC mode.

  1. Version requirements

    Your Hologres instance must be V1.3.34 or later. The direct read feature is not supported in earlier versions.

  2. Table type restrictions

    • Direct read mode is not supported for Hologres cold storage tables.

    • Direct read mode is not supported for Hologres row-oriented tables.

  3. Data type mapping limitations

    When you use the direct read mode, mapping a Timestamp With Time Zone type in Hologres to a TIMESTAMP type in MaxCompute can cause minor time discrepancies. The differences are as follows:

    • Time value discrepancy

      • For timestamps before 1900-12-31 15:54:15, the time returned by MaxCompute is ahead by 5 minutes and 44 seconds.

      • For timestamps between 1900-12-31 15:54:16 and 1969-12-31 23:59:58, the time returned by MaxCompute is ahead by 1 second.

      • For timestamps after 1969-12-31 23:59:59, the times in Hologres and MaxCompute are identical.

    • Time zone offset

      • For example, if the MaxCompute time zone is UTC+8 and a Timestamp With Time Zone column in Hologres is 2000-01-01 00:00:00, MaxCompute returns 2000-01-01 08:00:00.

      • For example, if the MaxCompute time zone is UTC+8 and a Timestamp With Time Zone column in Hologres is 1969-01-01 00:00:00, MaxCompute returns 1969-01-01 08:00:01.

  4. Same-region restriction

    Due to network connectivity constraints, MaxCompute can access Hologres instances only in the same region. Cross-region access fails with the following error: FAILED: ODPS-0010000:System internal error - fuxi job failed, caused by: Pangu request failed with error code 3.

  5. If your Hologres instance uses a primary/secondary architecture, you can specify only the primary instance in the connection URL, not a secondary instance.

  6. Additional limitation for Foreign Server mode: You must enable the schema-level syntax switch for the MaxCompute project.

Enable direct read mode

To enable the direct read mode for your session, add the following command before your SQL query in MaxCompute:

SET odps.table.api.enable.holo.table=true;

You can also enable or disable direct read mode at the project level.

-- You can enable direct read and disable JDBC fallback at the project level.
-- Enable direct read at the project level: 
setproject odps.table.api.enable.holo.table=true; -- Set to true to enable, false to disable.
-- Disable the default fallback to JDBC: 
setproject odps.table.api.allow.fallback.jdbc=false; -- Set to true to allow fallback, false to prevent it.

Verify direct read mode

To verify that a query used the direct read mode, check the logs in Logview. For more information about how to use Logview, see Use Logview 2.0 to view job running information.

On the Summary tab in Logview, find the external holo tables field. The format is as follows:

<project_name>.<table_name>:<access mode>[<(fallback reason)>]

Parameters:

Parameter

Description

project_name

The name of the project.

table_name

The name of the table.

access mode

The access mode for the external table. Valid values:

  • Optimized: Indicates that the direct read mode is used. The following is a Logview example. Optimized: Indicates that the direct read optimization mode is enabled. In the Summary section of Logview, external holo tables: <project_name>.<table_name>:Optimized is displayed, which indicates that the external Holo table has used the direct read optimization path.

  • Fallback: Indicates that the system has reverted to JDBC mode. The following is an example from Logview. In the Summary tab, if the external holo tables section displays a Fallback message, such as aliorc_test.holo_lineitem:Fallback (Column type map error Column name l_orderkey Odps type 3 Holo type 4), this indicates that direct-read mode was not used. Instead, the system reverted to Fallback mode due to a column type mapping error.

fallback reason

If the access mode is Fallback, this parameter explains why the query reverted to JDBC mode. Possible reasons and their solutions include:

  • Column type map error Column name ${ColumnName}: The data type of a column in the MaxCompute table is incompatible with the corresponding column in the Hologres table. To resolve this, modify the data type in the external table to match the data type mapping.

  • Holo connection error: The connection to the Hologres instance failed. Possible causes include incorrect permissions or an invalid instance status. Ensure that the current user has the required permissions to access the Hologres database and the Hologres instance is running.

  • Odps table is partition table: Direct read mode is not supported for partitioned external tables.

  • Select hg_version error, Hologres version check error, or Fetch hg_version data error: The Hologres instance version is incorrect. To resolve this, upgrade the Hologres instance to V1.3.34 or later. For more information, see Upgrade an instance.

When a job using direct read mode encounters a limitation, it automatically reverts to JDBC mode. Under heavy workloads, this fallback can consume significant resources from the Hologres connection pool, and data transfer in JDBC mode is much less efficient than in direct read mode. Consequently, these resources may be released slowly, increasing the load on your Hologres instance.

In extreme cases, frequent fallbacks can impact other business-critical services that rely on the same Hologres instance. To prevent this, you can disable the automatic fallback mechanism by adding the set odps.table.api.allow.fallback.jdbc=false; parameter to your jobs. This causes jobs to fail instead of silently falling back, which prevents unexpected performance degradation.

Predicate pushdown for Hologres external tables

When you query a Hologres external table in MaxCompute, you can push down the query's filter conditions to Hologres. This technique, known as predicate pushdown (PPD), filters data at the source, which reduces the amount of data transferred and improves query performance. Predicate pushdown is especially effective in the following scenarios:

  • Queries with selective filter conditions that filter out large amounts of data.

  • Queries on logical partitioned tables.

Notes

Scenario

Description

Core parameter

Valid values for the predicate_pushdown_mode parameter:

  • off: Disables predicate pushdown and reads all data. You can use this setting for debugging and comparison in a development environment.

  • on: Enables predicate pushdown to create only filtered snapshots. Recommended for production environments.

  • Debug: Creates two snapshots to compare performance. This can be used for performance testing and troubleshooting.

Example syntax:

set odps.external.holo.read.table.rules="{predicate_pushdown_mode:on}";

Supported types

INT, BIGINT, STRING, DATE, TIMESTAMP, BINARY

Limitations

  • Predicate pushdown does not support user-defined functions (UDFs) or built-in functions.

  • Dropping CAST expressions during implicit conversion is not currently supported.

  • The holob type is not currently supported.

  • You cannot use predicate pushdown with primary key tables in Hologres.

Applicable scenarios

  • Hologres instances that support PPD.

Examples

  1. Prepare the data. In your Hologres instance, create a test table and insert data.

    CREATE TABLE test_multi_type_ppd (
      c_id INT NOT NULL,
      c_boolean BOOLEAN,
      c_int INT,
      c_bigint BIGINT,
      c_double DOUBLE precision,
      c_float REAL,
      c_string TEXT,
      c_date DATE,
      c_timestamp_tz TIMESTAMPTZ,
      c_timestamp_ntz TIMESTAMP
    )
    WITH (SEGMENT_KEY = 'c_id');
    -- Insert 10 rows of data.
    INSERT INTO test_multi_type_ppd VALUES
    (1,  true,   10,  1000,   1.25,  1.5,  'alpha',   '2023-01-01', '2023-01-01 10:00:00+08', '2023-01-01 02:00:00'),
    (2,  false,  20,  2000,  -2.50,  2.5,  'beta',    '2023-01-02', '2023-01-02 10:00:00+08', '2023-01-02 02:00:00'),
    (3,  true,   30,  3000,   3.75, -3.5,  'gamma',   '2023-01-03', '2023-01-03 10:00:00+08', '2023-01-03 02:00:00'),
    (4,  false,  40,  4000,   4.00,  4.5,  'delta',   '2023-01-04', '2023-01-04 10:00:00+08', '2023-01-04 02:00:00'),
    (5,  true,   50,  5000,   5.50,  5.5,  'epsilon', '2023-01-05', '2023-01-05 10:00:00+08', '2023-01-05 02:00:00'),
    (6,  false,  60,  6000,   6.60,  6.0,  'zeta',    '2023-01-06', '2023-01-06 10:00:00+08', '2023-01-06 02:00:00'),
    (7,  true,   70,  7000,   7.70, -7.0,  'eta',     '2023-01-07', '2023-01-07 10:00:00+08', '2023-01-07 02:00:00'),
    (8,  false,  80,  8000,   8.80,  8.0,  'theta',   '2023-01-08', '2023-01-08 10:00:00+08', '2023-01-08 02:00:00'),
    (9,  true,   90,  9000,  -9.90,  9.0,  'iota',    '2023-01-09', '2023-01-09 10:00:00+08', '2023-01-09 02:00:00'),
    (10, false, 100, 10000,  10.10, 10.0,  'kappa',   '2023-01-10', '2023-01-10 10:00:00+08', '2023-01-10 02:00:00');
    SELECT * FROM test_multi_type_ppd;
  2. Create a Hologres external table in MaxCompute.

    CREATE EXTERNAL TABLE test_multi_type_ppd
    (
      c_id int,
      c_boolean boolean,
      c_int int,
      c_bigint bigint,
      c_double double,
      c_float float,
      c_string string,
      c_date date,
      c_timestamp_tz timestamp,
      c_timestamp_ntz timestamp_ntz
    )
    STORED BY 'com.aliyun.odps.jdbc.JdbcStorageHandler'
    LOCATION 'jdbc:postgresql://<holo-endpoint>/<db>?currentSchema=public&table=test_multi_type_ppd'
    TBLPROPERTIES (
      'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',
      'odps.federation.jdbc.target.db.type'='holo'
    );
  3. Query the external table.

    -- Enable Hologres direct read.
    SET odps.table.api.enable.holo.table=true;
    -- Set the PPD mode (off, on, or debug).
    SET odps.external.holo.read.table.rules="{predicate_pushdown_mode:on}";
    -- Configure the type system.
    SET odps.sql.type.system.odps2=true;
    SET odps.sql.decimal.odps2=true;
    -- Allow expressions to be pushed down to TableScan. This is typically enabled by default.
    SET odps.use.builtin.function.input.range=true;
    -- set odps.mcqa.disable=true;
    SELECT * FROM test_multi_type_ppd WHERE c_id = 3;
    -- The query returns the following result.
    +------+-----------+-------+------------+------------+---------+----------+--------+----------------+-----------------+
    | c_id | c_boolean | c_int | c_bigint   | c_double   | c_float | c_string | c_date | c_timestamp_tz | c_timestamp_ntz |
    +------+-----------+-------+------------+------------+---------+----------+--------+----------------+-----------------+
    | 3    | true      | 30    | 3000       | 3.75       | -3.5    | gamma    | 2023-01-03 | 2023-01-03 10:00:00 | 2023-01-03 02:00:00 |
    +------+-----------+-------+------------+------------+---------+----------+--------+----------------+-----------------+
    -- Check the Summary section in Logview. The following information is displayed:
    external holo tables:
    	**.default.test_multi_type_ppd:Optimized
      Holo-side PPD enabled -->   Filter pushdown applied successfully
      Snapshot Id: 56-1-100***
      With-filter file count: 2
      With-filter create snapshot duration: 19ms

Hologres data import and export

  • To import data from Hologres into a MaxCompute table or partition, use LOAD.

  • To export data from a MaxCompute project to Hologres for use by other compute engines, use UNLOAD.

Supported data types

  • Decimal type limitations

    For Hologres external tables created in MaxCompute, the DECIMAL data type is fixed at decimal(38,18) and cannot be modified. If the source table has fewer decimal places, you can define the column as the STRING type in the MaxCompute external table and then use the CAST function to explicitly cast the data when you use it.

  • Complex type limitations

    Hologres external tables created in MaxCompute do not support complex data types such as Array, Map, or Struct.

  • Other incompatible types

    Data types in Hologres, such as MONEY, are not supported because there are no corresponding data types in MaxCompute.

The following table lists the data type mappings between Hologres and MaxCompute.

Hologres type

MaxCompute type

JDBC read

JDBC write

Direct read (odps.table.api.enable.holo.table=true)

Description

INTEGER (alias: INT or INT4)

INT

Supported

Supported

Supported

A 32-bit signed integer.

TEXT

STRING, VARCHAR

Supported

Supported

Supported

A string type with a length limit of 8 MB.

SMALLINT

SMALLINT

Supported

Supported

Supported

A 16-bit signed integer.

INT2

SMALLINT

Supported

Supported

Supported

A 16-bit signed integer.

BIGINT (alias: INT8)

BIGINT

Supported

Supported

Supported

A 64-bit signed integer.

BYTEA

BINARY

Supported

Supported

Supported

The binary data type currently has a length limit of 8 MB. The Hologres BINARY data type has a maximum size of 1 GB.

REAL (alias: FLOAT4)

FLOAT

Supported

Supported

Supported

A 32-bit binary floating-point number.

DOUBLE PRECISION (alias: FLOAT8)

DOUBLE

Supported

Supported

Supported

A 64-bit binary floating-point number.

BOOLEAN (alias: BOOL)

BOOLEAN

Supported

Supported

Supported

A BOOLEAN type.

TIMESTAMP

TIMESTAMP_NTZ

Supported

Not supported

Supported

The Hologres TIMESTAMP type has microsecond precision and no time zone.

It maps to MaxCompute's TIMESTAMP_NTZ, which has nanosecond precision. The difference in precision may cause discrepancies.

TIMESTAMP WITH TIME ZONE (alias: TIMESTAMPTZ)

TIMESTAMP

Supported

Not supported

Supported

The timestamp data type is accurate to the nanosecond and uses the format yyyy-mm-dd hh:mm:ss.xxxxxxxxx.

The underlying precision conversion is handled automatically.

DECIMAL (alias: NUMERIC)

DECIMAL(precision,scale)

Supported

Supported

Supported

A base-10 exact numeric type.

  • precision: Specifies the maximum number of digits. Value range: 1 <= precision <= 38.

  • scale: Specifies the number of decimal places. Value range: 0 <= scale <= 18.

In MaxCompute, the DECIMAL data type defaults to (38,18) if no precision is specified. When you create a table by using the IMPORT FOREIGN SCHEMA statement, the system automatically converts the precision.

CHAR(n)

CHAR(n)

Supported

Supported

Supported

In MaxCompute, CHAR(n) is a fixed-length character type. n specifies the length. The maximum length is 255. Shorter strings are padded with spaces.

The maximum size of the CHAR(n) data type in Hologres is 1 GB.

VARCHAR(n)

VARCHAR(n)

Supported

Supported

Supported

The VARCHAR(n) data type in MaxCompute is a variable-length character type. The n parameter specifies the length, which ranges from 1 to 65535.

In Hologres, the maximum size for VARCHAR(n) is 1 GB.

DATE

DATE

Supported

Supported

Supported

The data type is date in the yyyy-mm-dd format.

INT4[]

ARRAY<INT>

Supported

Not supported

Supported

Hologres arrays are mapped to the corresponding MaxCompute ARRAY types. Note that writing arrays to Hologres is not supported via JDBC.

INT8[]

ARRAY<BIGINT>

Supported

Not supported

Supported

FLOAT4[]

ARRAY<FLOAT>

Supported

Not supported

Supported

FLOAT8[]

ARRAY<DOUBLE>

Supported

Not supported

Supported

BOOLEAN[]

ARRAY<BOOLEAN>

Supported

Not supported

Supported

TEXT[]

ARRAY<STRING>

Supported

Not supported

Supported

JSONB

JSON

Not supported

Not supported

Supported

Hologres JSONB is mapped to MaxCompute's JSON type. This mapping is only supported in direct read mode.

JSON

STRING

Supported

Not supported

Not supported

When using JDBC, the Hologres JSON type is mapped to MaxCompute's STRING type, which limits the size to 8 MB.

SERIAL (auto-increment)

INT

Supported

Not supported

Supported

MaxCompute reads Hologres SERIAL fields as INT values. The auto-increment behavior is not supported on writes.

RoaringBitmap

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

RoaringBitmap64

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

BIT(n)

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

VARBIT(n)

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

INTERVAL

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

TIMETZ

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

TIME

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

INET

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

MONEY

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

OID

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

UUID

Not supported

Not supported

Not supported

Not supported

No corresponding type exists in MaxCompute.

FAQ

ODPS-0130071 error when directly reading Hologres data

  • Problem

    When you read data from Hologres in direct read mode, the job fails with an error similar to ODPS-0130071 Failed to split to equal size...max count: 7777. For example:

    ODPS-0130071:[0,0] Semantic analysis exception - physical plan generation failed: storage/table/src/input_splits_builder.cpp(195): StorageException: Failed to split to equal size, total size: 2143570729934, min size: 268435456, max size: 272629760, max count: 7777, split size: 275629513, split count: 7777
  • Cause

    When MaxCompute reads data from Hologres, its default data splitting strategy can cause the number of mapper instances to exceed the 7,777 concurrency limit.

    Note

    This limit prevents jobs with an excessive number of mappers from running, which protects the stability of Hologres files and network connections.

  • Solution

    You can resolve this error by setting the following parameters.

    SET odps.external.holo.mapper.instances=10000; -- Increase the concurrency limit. The maximum value is 10,000.
    SET odps.sql.mapper.split.size=512; -- Adjust the job concurrency. The maximum value is 512 MB.

Slow SQL jobs on Hologres external tables

  • Problem

    Performance is slow when you use the SDK to search for data in a MaxCompute external table.

  • Solution

    External tables perform slowly because they only support full table scans. For better performance, use a MaxCompute internal table instead.

Error when a column name is a keyword

  • Problem

    If a column name in your Hologres table is a keyword, queries may fail with the following errors without special configuration:

    ODPS-0123131:User defined function exception - SQLException in nextKeyValue

    Caused by: org.postgresql.util.PSQLException: ERROR: syntax error at or near ","

  • Solution

    Add the odps.federation.jdbc.colmapping parameter to map the columns of the Hologres source table to the columns of the Hologres external table.

    For example, if the Hologres source table contains the keyword "offset" as a column name, add the 'odps.federation.jdbc.colmapping'='offset:"offset"' parameter when you create the Hologres external table.

The password authentication failed for user "mcslr$STS.******" error in double-signature mode

  • Problem

    When a RAM user or RAM role reads a Hologres external table by using the service-linked role authorization mode (double-signature mode), the job may fail with the password authentication failed for user "mcslr$STS.******" error. The following provides an error message example:

    ODPS-0130071:[0,0] Semantic analysis exception - physical plan generation failed: splitByCommonTable() failed, dataSourceId=[TableScan1], tableId=[*****.*****], location=[jdbc:postgresql://hgpostcn-cn-*****-cn-beijing-internal.hologres.aliyuncs.com:80/*****?ApplicationName=MaxCompute&currentSchema=public&useSSL=false&table=*****/], rootCause -> common/table/jni/utils/jni_helper.cpp(126): UnretryableException: Common table io jni exception - ExceptionType: com.aliyun.odps.hive.wrapper.thrift.RequestException
    Message: null
    StackTrace:
    RequestException(reason:java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: org.postgresql.util.PSQLException: FATAL: password authentication failed for user "mcslr$STS.*********"
  • Solution

    This error indicates that the user lacks the required read permissions for the table in Hologres. To resolve this, log on to the Hologres console with your Alibaba Cloud account or an administrator account. In the console, go to Security Center, add the user to User Management, and grant the required database permissions.

The Holo-side PPD disabled --> The Holo snapshot PPD mode is enabled, but FAILED to push down the filter into Holo error in direct read mode

  • Error message

    Holo-side PPD disabled --> The Holo snapshot PPD mode is enabled, 
    but FAILED to push down the filter into Holo, 
    errorMsg[storage/table/src/holo/snapshot.cpp(319): 
    StorageException: Create/Get snapshot failed for table: test_multi_type_ppd partition:  
    Query:select holo_create_table_snapshot('public.test_multi_type_ppd', 259200, filter=>$$c_id = 3$$) with error ERROR:  
    internal error: Failed to build fragment dag desc: node type 65 is not supported
  • Solution

    Ensure that your Hologres internal table meets the requirements in Usage notes.

References

For more information about common issues with Hologres foreign tables, see Lakehouse and Foreign Table FAQ.