Hologres external projects

更新时间:
复制 MD 格式

MaxCompute external projects provide real-time access to metadata and data by mapping to a Hologres database, enabling both queries and data insertion. When accessing Hologres, MaxCompute passes the current task executor's identity to Hologres for authentication and authorization. This approach allows you to map an entire real-time data warehouse at the database level, integrating real-time analytics with offline computing.

Supported data types

  • Decimal type limitations

    For Hologres external tables created in MaxCompute, the DECIMAL data type is fixed at decimal(38,18) and cannot be modified. If the source table has fewer decimal places, you can define the column as the STRING type in the MaxCompute external table and then use the CAST function to explicitly cast the data when you use it.

  • Complex type limitations

    Hologres external tables created in MaxCompute do not support complex data types such as Array, Map, or Struct.

  • Other incompatible types

    Data types in Hologres, such as MONEY, are not supported because there are no corresponding data types in MaxCompute.

For more information, see Hologres and MaxCompute data type mapping.

Authorization

Grant permissions to RAM users

If you are operating as a RAM user, ensure the following access policies are attached. For instructions on granting permissions, see Manage permissions for RAM users.

  • AliyunMaxComputeFullAccess: Use this policy to create external data sources and external projects.

  • AliyunHologresReadOnlyAccess: Use this policy to create Hologres external projects.

Permissions

  • Control plane permissions: Permissions required for operations in the MaxCompute console, such as creating a MaxCompute external project and binding it to a Hologres instance.

    Manage these permissions through RAM access control. To configure them, log in to the RAM console with your Alibaba Cloud account. For more information, see Manage permissions for RAM users.

  • Data plane permissions: Permissions required to read from or write to Hologres tables within the external project. These permissions apply when you use SQL statements in MaxCompute to manage resources, such as schemas and tables, in the bound Hologres instance.

    Manage these permissions in the Hologres console. To configure them, log in with your Alibaba Cloud account. For more information, see Grant permissions to RAM users.

Create a Hologres external data source

  1. Activate Hologres service, and MaxCompute and Hologres must be deployed in the same region.

  2. Log in to the MaxCompute console and select a region in the upper-left corner.

  3. In the left-side navigation pane, choose Manage Configurations > External Data Source .

  4. On the External Data Source page, click Create External Data Source.

  5. In the Create External Data Source dialog box, configure the parameters. The following tables describe the parameters.

    Parameter

    Required

    Description

    External Data Source Type

    Yes

    Select Hologres.

    External Data Source Name

    Yes

    The name of the external data source. The name must meet the following requirements:

    • It must start with a letter and can contain only lowercase letters, digits, and underscores (_).

    • It must be 128 characters or less.

    Example: paimon_dlf.

    Description

    No

    The description of the external data source.

    Connection Method

    Yes

    The default value is classic network access (internal network).

    instance ID

    Yes

    Select the Hologres instance in the current region to connect to.

    Host

    Yes

    The system automatically populates this parameter.

    Port

    Yes

    The system automatically populates this parameter.

    DBNAME

    Yes

    The name of the Hologres database to connect to.

    Authentication and authorization

    Yes

    • RAM Role: Select this method to use the external data source with Hologres external schemas.

    • Task executor: External data sources for Hologres external projects must use the Task executor authentication mode.

    role ARN

    Yes

    The Alibaba Cloud Resource Name (ARN) of the RAM role. To create a RAM role and grant permissions, see Grant permissions by using a standard RAM role for Hologres.

    1. Log in to the RAM console.

    2. In the left navigation bar, select Identities > Roles.

    3. In the Basic Information section, you can obtain the ARN.

    Example: acs:ram::124****:role/aliyunodpsdefaultrole.

    Service-linked Role

    Yes

    If you select Task executor, the system automatically sets the Service-linked Role to acs:ram::124****:role/aliyunserviceroleformaxcomputeidentitymgmt.

    Foreign Server Supplemental Properties

    No

    Supplemental properties that define how tasks access the external data source.

    Note

    For information about supported parameters, see the official documentation. More parameters will be added as product capabilities evolve.

  6. Click OK to create the external data source.

  7. On the External Data Source page, find the data source and click Details in the Actions column to view its details.

Create a Hologres external project

  1. Log in to the MaxCompute console and select a region in the upper-left corner.

  2. In the left-side navigation pane, choose Manage Configurations > Projects.

  3. On the External Project tab, click Create Project.

  4. In the Create Project dialog box, configure the settings and click OK.

    Parameter

    Required

    Description

    Project Type

    Yes

    Defaults to external project.

    Region

    Yes

    The current region is selected by default and cannot be changed.

    Project Name (Globally Unique)

    Yes

    The name must be 3 to 28 characters long, start with a letter, and contain only letters, digits, and underscores (_).

    MaxCompute Foreign Server Type

    No

    Select Hologres.

    MaxCompute Foreign Server

    No

    • Use Existing: Displays a list of existing external data sources.

    • Create Foreign Server: Lets you create a new external data source.

    MaxCompute Foreign Server Name

    Yes

    • If you select Select Existing, choose an external data source from the drop-down list.

    • If you select Create New External Data Source, the name of the new external data source is automatically used.

    Billing Method

    Yes

    Valid values are Subscription and Pay-as-you-go.

    Default Quota

    Yes

    Select an existing quota.

    Description

    No

    Enter a description for the project.

Access a data source using SQL

Important

Deleting an external project does not delete the underlying data because the project is only a mapping to the data source.

However, unlike with standard external tables, running a DROP TABLE or DROP SCHEMA command in an external project sends the request to the peer service. This permanently deletes the corresponding table or database. Use DROP operations with caution.

  1. Choose a connection tool to log on to the external project.

    The following example shows how to use an SQL node for data development in a DataWorks workspace (new version).

    1. Log in to the DataWorks console and select a region in the upper-left corner.

    2. In the navigation pane on the left, choose Workspace.

    3. On the Workspaces page, click Create Workspace or click the name of an existing workspace.

    4. On the Workspace Details page, click Computing Resource in the navigation pane on the left.

    5. On the Computing Resource page, click Associate Computing Resources, and select MaxCompute.

    6. Configure the Basic Information for Associate MaxCompute Computing Resource.

      For the MaxCompute project, select the external project.

  2. List the schemas in the external project.

    -- Enable session-level schema syntax.
    SET odps.namespace.schema=true;
    SHOW schemas;
    
    -- Output:
    ID = 20250922****g27aqqcnyn2
    hg_internal
    hg_recyclebin
    public
    
    
    OK
  3. List the tables in a schema.

    -- <schema_name> is the name of a schema returned by the `SHOW schemas` command.
    USE schema <schema_name>;
    SHOW tables;
  4. Create a schema in the external project.

    -- In this example, <schema_name> is schema_demo_test.
    CREATE schema <schema_name>;
  5. Use the new schema.

    USE schema <schema_name>;
  6. Create a table in the schema and insert data into it.

    • Command format:

      -- Create a table.
      CREATE TABLE [IF NOT EXISTS] <table_name> 
      (
        <col_name> <data_type>,
        ...
      )
      [COMMENT <table_comment>]
      [PARTITIONED BY (<col_name> <data_type>, ...)] 
      ;
      
      -- Insert data. The INSERT OVERWRITE operation is not supported.
      INSERT INTO <table_name> [PARTITION (<pt_spec>)] [(<col_name> [,<col_name> ...)]]
      <select_statement>
      FROM <from_statement>
    • Example:

      CREATE TABLE schema_table(id int,name string);
      
      INSERT INTO schema_table VALUES (101,'Zhang San'),(102,'Li Si');
      
      -- Query the schema_table table.
      SELECT * FROM schema_table;
      
      -- Output:
      +------------+------------+
      | id         | name       | 
      +------------+------------+
      | 101        | Zhang San  | 
      | 102        | Li Si      | 
      +------------+------------+
  7. Use the existing public schema.

    use schema public;
    show tables;
    
    -- Output:
    ID = 20250922065846301g50iovzbx8g
    1221220769279014	test1
    
    
    OK
    
    -- Read data from the table.
    SELECT * FROM test1;
    
    -- Output:
    +------------+------------+
    | id         | name       | 
    +------------+------------+
    | 101        | Zhang San  | 
    | 102        | Li Si      | 
    +------------+------------+
    
    -- Write data to the table and then query it to verify the write operation.
    INSERT INTO test1 VALUES (103,'Test');
    
    SELECT * FROM test1;
    
    -- Output:
    +------------+------------+
    | id         | name       | 
    +------------+------------+
    | 102        | Li Si      | 
    | 103        | Test       | 
    | 101        | Zhang San  | 
    +------------+------------+

Hologres and MaxCompute data type mapping

Hologres type

MaxCompute type

JDBC read

JDBC write

Direct read

(odps.table.api.enable.holo.table=true)

Description

INTEGER (INT or INT4)

INT

Supported

Supported

Supported

A 32-bit signed integer.

TEXT

STRING, VARCHAR

Supported

Supported

Supported

A string type. The current length limit is 8 MB.

SMALLINT

SMALLINT

Supported

Supported

Supported

A 16-bit signed integer.

INT2

SMALLINT

Supported

Supported

Supported

A 16-bit signed integer.

BIGINT (INT8)

BIGINT

Supported

Supported

Supported

A 64-bit signed integer.

BYTEA

BINARY

Supported

Supported

Supported

  • A binary data type. The current length limit is 8 MB.

  • The BYTEA data type in Hologres supports a maximum size of 1 GB.

REAL (FLOAT4)

FLOAT

Supported

Supported

Supported

A 32-bit binary floating-point type.

DOUBLE PRECISION (FLOAT8)

DOUBLE

Supported

Supported

Supported

A 64-bit binary floating-point type.

BOOLEAN (BOOL)

BOOLEAN

Supported

Supported

Supported

A logical Boolean data type (TRUE or FALSE).

TIMESTAMP

TIMESTAMP_NTZ

Not Supported

Not Supported

Not Supported

The TIMESTAMP_NTZ data type in MaxCompute stores values with nanosecond precision but may have time zone discrepancies. Hologres's TIMESTAMP data type stores values with microsecond precision and does not include time zone information.

TIMESTAMP WITH TIME ZONE (TIMESTAMPTZ)

TIMESTAMP

Not Supported

Not Supported

Not Supported

A timestamp type with nanosecond precision. The format is yyyy-mm-dd hh:mm:ss.xxxxxxxxx. MaxCompute and Hologres perform precision conversion at the underlying layer. The output from MaxCompute does not include a time zone format.

DECIMAL (NUMERIC)

DECIMAL(precision,scale)

Supported

Supported

Supported

A decimal exact numeric type.

  • precision: Specifies the maximum number of digits that can be represented. Value range: 1 <= precision <= 38.

  • scale: Specifies the number of digits in the fractional part. Default value range: 0 <= scale <= 18.

In MaxCompute, the default precision for the DECIMAL data type is (38,18) if no precision is specified. When you create a table by using the IMPORT FOREIGN SCHEMA statement, the system automatically converts the precision.

CHAR(n)

CHAR(n)

Supported

Supported

Supported

The CHAR(n) data type in MaxCompute is a fixed-length character type where n is the length (up to 255). Shorter strings are space-padded to the specified length.

In Hologres, the CHAR(n) data type supports a maximum size of 1 GB.

VARCHAR(n)

VARCHAR(n)

Supported

Supported

Supported

The VARCHAR(n) data type in MaxCompute is a variable-length character type where n specifies the length, ranging from 1 to 65,535.

In Hologres, the VARCHAR(n) data type supports a maximum size of 1 GB.

DATE

DATE

Supported

Supported

Supported

This is a date type. The format is yyyy-mm-dd.

INT4[]

ARRAY<INT>

Not Supported

Not Supported

Not Supported

The ARRAY complex type.

INT8[]

ARRAY<BIGINT>

Not Supported

Not Supported

Not Supported

FLOAT4[]

ARRAY<FLOAT>

Not Supported

Not Supported

Not Supported

FLOAT8[]

ARRAY<DOUBLE>

Not Supported

Not Supported

Not Supported

BOOLEAN[]

ARRAY<BOOLEAN>

Not Supported

Not Supported

Not Supported

TEXT[]

ARRAY<STRING>

Not Supported

Not Supported

Not Supported

JSONB

JSON

Not Supported

Not Supported

Not Supported

The JSON complex type.

JSON

STRING

Not Supported

Not Supported

Not Supported

A string type. The current length limit is 8 MB.

SERIAL (auto-increment sequence)

INT

Not Supported

Not Supported

Not Supported

An auto-increment sequence field in Hologres.

FAQ

Problem: A RAM user fails to run show schemas

Symptom: When a RAM user runs the show schemas command using a task executor identity, the system returns the following error: password authentication failed for user "mcslr$STS.NZTtDfsd2VBifvYMSzXEDgBLf

Cause: The RAM user lacks the required permissions to perform operations in the Hologres database.

Solution:

  1. Log on to the Hologres console and select a region from the upper-left corner.

  2. In the left-side navigation pane, click Instances. On the Instances page, click the target instance ID to open its details page, and then click Connect to Instance.

  3. In the top navigation bar, click Security Center. In the left-side navigation pane, click Users. On the User Management page, click Add User in the upper-right corner. In the dialog box that appears, add the RAM user.

  4. In the top navigation bar, click Security Center. In the left-side navigation pane, click Database Authorization. Select User Authorization, and then grant the RAM user the required database permissions based on your business requirements. For more information about specific permissions, see RAM User Authorization Quick Start.