Map Hologres metadata to MaxCompute external tables


To access Hologres data from MaxCompute, use the metadata mapping feature in the data catalog to create MaxCompute external tables. This method allows you to read Hologres data directly without importing it, saving compute and storage resources while enabling flexible and efficient data processing.

Background

MaxCompute allows you to query and analyze data in external systems like Hologres by using external tables. With a PostgreSQL JDBC driver and RAM role authorization, you can query external data directly without importing it into MaxCompute.
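As a rough illustration of what such an external table looks like, the following sketch builds a CREATE EXTERNAL TABLE statement for a Hologres table. The storage handler class, the property keys, and the LOCATION layout are assumptions based on the Hologres external tables reference and should be verified against it. The function name, host, database, and role ARN are hypothetical placeholders.

```python
def build_hologres_external_table_ddl(
    mc_table, columns, host, port, database, schema, holo_table, role_arn
):
    """Return a CREATE EXTERNAL TABLE statement for a Hologres-backed table.

    columns is a list of (column_name, maxcompute_type) pairs.
    All property keys below are assumptions; check them against the
    Hologres external tables reference before use.
    """
    column_sql = ",\n".join(f"  {name} {mc_type}" for name, mc_type in columns)
    # Assumed LOCATION layout: a PostgreSQL JDBC URL that names the Hologres table.
    location = (
        f"jdbc:postgresql://{host}:{port}/{database}"
        f"?ApplicationName=MaxCompute&currentSchema={schema}&table={holo_table}/"
    )
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {mc_table} (\n"
        f"{column_sql}\n"
        ")\n"
        "STORED BY 'com.aliyun.odps.jdbc.JdbcStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('odps.properties.rolearn' = '{role_arn}')\n"
        f"LOCATION '{location}'\n"
        "TBLPROPERTIES (\n"
        "  'mcfed.mapreduce.jdbc.driver.class' = 'org.postgresql.Driver',\n"
        "  'odps.federation.jdbc.target.db.type' = 'holo'\n"
        ");"
    )


# Hypothetical values, for illustration only.
print(
    build_hologres_external_table_ddl(
        "orders_ext",
        [("id", "BIGINT"), ("note", "STRING")],
        "example-host", 80, "demo_db", "public", "orders",
        "acs:ram::000000000000:role/example-role",
    )
)
```

Metadata mapping generates the equivalent definition for you, so a statement like this is useful mainly for understanding what the mapped table contains.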

DataWorks simplifies this process through metadata mapping, which builds on MaxCompute external tables and external schemas.

Prerequisites

Access control

DataWorks determines the access identity based on the source of the MaxCompute project and verifies permissions accordingly.

Access identity source

  • If the MaxCompute project is from a DataWorks data source: The access identity is the one configured for the data source. To map Hologres metadata to a MaxCompute project that corresponds to a production data source, you must have the O&M or Workspace Administrator role in DataWorks.

  • If the MaxCompute project is one you are authorized to access: The access identity is your current account.

MaxCompute permission check

  • When you map Hologres metadata to MaxCompute, ensure that the access identity has been added to the MaxCompute project.

  • When you read Hologres data using a MaxCompute external table, ensure that the access identity has read permissions on the MaxCompute external table.

Hologres permission check

  • With dual-signature authentication, the access identity must have read and write permissions on the Hologres table.

  • With RAM role-based authentication, the RAM role must have read and write permissions on the Hologres table.
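The rules above can be sketched as a small lookup, which may help when you audit permissions before mapping. This is a minimal sketch, not how DataWorks implements the check. The function name, the string constants for the project source and authentication method, and the account names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccessPlan:
    identity: str            # identity checked on the MaxCompute side
    hologres_principal: str  # principal that needs Hologres table permissions
    checks: list             # human-readable checklist


def resolve_access(project_source, auth_method, data_source_identity,
                   current_account, ram_role=None):
    """Derive the access identity and the permissions to verify."""
    if project_source == "dataworks_data_source":
        identity = data_source_identity
    elif project_source == "authorized_project":
        identity = current_account
    else:
        raise ValueError(f"unknown project source: {project_source}")

    if auth_method == "dual_signature":
        principal = identity
    elif auth_method == "ram_role":
        if not ram_role:
            raise ValueError("RAM role authentication requires a RAM role")
        principal = ram_role
    else:
        raise ValueError(f"unknown authentication method: {auth_method}")

    checks = [
        f"{identity} has been added to the MaxCompute project",
        f"{identity} can read the MaxCompute external table",
        f"{principal} can read and write the Hologres table",
    ]
    return AccessPlan(identity, principal, checks)


plan = resolve_access("authorized_project", "ram_role",
                      "ds_user", "alice", ram_role="example-role")
for check in plan.checks:
    print(check)
```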

Limits

  • Only data in Hologres internal databases can be mapped to MaxCompute.

  • For limits on using Hologres external tables in MaxCompute, see Hologres external tables.

  • MaxCompute and Hologres use different data types. Some Hologres types cannot be mapped and are automatically skipped during the mapping process. Carefully review the Data type mapping between MaxCompute and Hologres.
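The skip behavior described in the last limit can be pictured as follows. The type table in this sketch is a small illustrative subset chosen as an assumption, not the authoritative mapping, and the column names and the unmapped type are hypothetical. Always check the Data type mapping between MaxCompute and Hologres.

```python
# Illustrative subset only (an assumption); the real mapping is larger.
ASSUMED_TYPE_MAP = {
    "int4": "INT",
    "int8": "BIGINT",
    "bool": "BOOLEAN",
    "float8": "DOUBLE",
    "text": "STRING",
}


def map_columns(holo_columns):
    """Split Hologres columns into mapped and skipped lists.

    holo_columns is a list of (column_name, hologres_type) pairs.
    Columns whose type has no entry in the table are skipped, which
    mirrors how unmappable types are dropped during mapping.
    """
    mapped, skipped = [], []
    for name, holo_type in holo_columns:
        mc_type = ASSUMED_TYPE_MAP.get(holo_type.lower())
        if mc_type is None:
            skipped.append((name, holo_type))
        else:
            mapped.append((name, mc_type))
    return mapped, skipped


mapped, skipped = map_columns([
    ("id", "int8"),
    ("note", "text"),
    ("payload", "hypothetical_unmapped_type"),
])
print("mapped:", mapped)
print("skipped:", skipped)
```

Reviewing the skipped list before you rely on the external table helps you notice columns that are silently absent from it.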

Entry point

  1. Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a desired region. Find the desired workspace and choose Shortcuts > Data Studio in the Actions column.

  2. In the left navigation bar, click the Data Directory icon to go to Data Directory.

  3. In the Hologres directory, locate the schema or table that requires metadata mapping to MaxCompute, right-click it, and select Map Metadata to MaxCompute.

Schema-level metadata mapping

This feature maps the metadata of a Hologres schema to an external schema in MaxCompute. Before you start, we recommend that you read the Data Lakehouse 2.0 User Guide to learn about MaxCompute external data sources and external schemas.

Important

You can map metadata only to an internal MaxCompute project that has the schema feature enabled.

Prepare a Hologres external data source

Schema-level metadata mapping works by creating a MaxCompute external schema that synchronizes metadata from an external data source. Therefore, you must first create a MaxCompute external data source in DataWorks that points to the target Hologres database. You must then associate this data source with the specified MaxCompute project to establish the mapping.

For more information about how to create a MaxCompute external data source, see Data Lakehouse 2.0 User Guide.

Note

When you create the external data source, enter the classic network address for Host. VPC addresses are not supported. For DB, specify the Hologres database whose metadata you want to map.

DataWorks: Configure Hologres schema-level metadata mapping

  1. Go to the metadata mapping configuration page.

  2. In the Hologres project, find the schema that you want to map to MaxCompute. Right-click the schema name and select Map Metadata to MaxCompute to open the metadata mapping configuration page.

  3. Configure the parameters for schema-level mapping.

    • Hologres (source)

      Parameter

      Description

      Source Object Type

      The type of object to map to MaxCompute. This is fixed to Hologres Schema.

      Source Object Name

      The name of the Hologres schema to map to MaxCompute. This is fixed to the currently selected schema.

      Format: <hologres_database>.<hologres_schema>.

      Note

      You must create a MaxCompute external data source of the Hologres type, and set the default Hologres database for the external data source to <hologres_database>. For more information, see Prepare a Hologres external data source.

    • MaxCompute (destination)

      Parameter

      Description

      Project Find Method

      Select how to find the MaxCompute project.

      • From DataWorks data source: Select the MaxCompute project through a MaxCompute data source that is bound to the current workspace. If you select this method, you must also select a data source for Data Sources and set External schema name to create an external schema in the target MaxCompute project.

        Note

        Only users with the O&M or Workspace Administrator role can select production data sources.

      • I have permission: Select the target MaxCompute project from the projects that you are authorized to access under your current Alibaba Cloud account. If you select this method, you must also set External schema name to specify the target schema.

      Data Sources

      When Project Find Method is set to From DataWorks data source, you need to manually select the MaxCompute data source.

      Project Name

      When Project Find Method is set to I have permission, you need to manually select a MaxCompute project.

      External schema name

      Specify the name for the MaxCompute external schema to which the metadata from the source Hologres schema is mapped.

      External Data Source

      Select the MaxCompute external data source that is already connected to the source Hologres database.

      Important

      When you create a MaxCompute external data source:

      • For Host, enter the classic network address. VPC addresses are not supported.

      • For Authentication Method, only the RAM role method is supported. The ExecuteWithUserAuth method is not supported.

      Auth, Host:port, Database

      Automatically populated based on the selected external data source.

  4. Click the Run button on the top toolbar to complete the Hologres schema-level metadata mapping.
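Before you click Run, it can help to check the configuration against the constraints stated in this section. The following is a minimal sketch under stated assumptions: the dictionary keys (database, auth_method, network) and their values are hypothetical and do not correspond to any DataWorks API.

```python
def validate_schema_mapping(source_object_name, external_data_source):
    """Return a list of problems found in a schema-level mapping setup.

    source_object_name: "<hologres_database>.<hologres_schema>"
    external_data_source: dict with hypothetical keys
        "database", "auth_method", and "network".
    """
    problems = []

    parts = source_object_name.split(".")
    if len(parts) != 2 or not all(parts):
        problems.append(
            "Source Object Name must look like "
            "<hologres_database>.<hologres_schema>"
        )
    elif parts[0] != external_data_source.get("database"):
        problems.append(
            "The external data source must point to the source Hologres database"
        )

    # Only the RAM role method is supported; ExecuteWithUserAuth is not.
    if external_data_source.get("auth_method") != "ram_role":
        problems.append("Authentication Method must be RAM role")

    # Host must be a classic network address; VPC addresses are not supported.
    if external_data_source.get("network") != "classic":
        problems.append("Host must be a classic network address")

    return problems


source = {"database": "demo_db", "auth_method": "ram_role", "network": "classic"}
print(validate_schema_mapping("demo_db.public", source))
```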

Table-level metadata mapping

This feature maps a specific Hologres table to a specific table in MaxCompute as an external table. You can specify the path and a custom name for the external table.

  1. Go to the metadata mapping configuration page.

  2. In the Hologres project, find the Hologres table whose metadata you want to map to MaxCompute. Right-click the table name and select Map Metadata to MaxCompute. The metadata mapping configuration page appears.

  3. Configure the parameters for table-level metadata mapping.

    • Hologres (source)

      Parameter

      Description

      Source Object Type

      The source object type for table-level mapping. The default value is Hologres Table.

      Source Object Name

      The source Hologres table to be mapped. This is fixed to the currently selected Hologres table.

    • MaxCompute (destination)

      Parameter

      Description

      Project Find Method

      Select how to find the MaxCompute project.

      • From DataWorks data source: Select the MaxCompute project through a MaxCompute data source that is bound to the current workspace. You must also select a data source for Data Sources and specify a name for External Table to identify the external table in the target MaxCompute project. Ensure that the access identity configured for the data source has read and write permissions on the source Hologres table and the target MaxCompute project.

        Note

        Only users with the O&M or Workspace Administrator role can select production data sources.

      • I have permission: Select the target MaxCompute project from the projects that you are authorized to access under your current Alibaba Cloud account. If you select this method, you must also set External Table to specify the target external table. Ensure that you have read and write permissions on the source Hologres table and the target MaxCompute project.

      Data Sources

      When Project Find Method is set to From DataWorks data source, you need to manually select the destination MaxCompute data source.

      Project Name

      When Project Find Method is set to I have permission, you need to manually select the target MaxCompute project.

      Schema

      Specify the name of the external schema in the target MaxCompute project to which the metadata from the source Hologres schema is mapped.

      External Table

      Specify the name of the new external table to be created under the schema of the specified MaxCompute project. Data from the source table is mapped to this table. By default, the name is the same as the Hologres table name.

      Note

      Creating an external table is a one-time action. Metadata is not automatically refreshed. To refresh the metadata, you must delete the current external table and manually create the metadata mapping again.

      MaxCompute foreign table permissions

      The authentication and authorization method for the MaxCompute external table after the Hologres metadata is mapped.

      RoleARN

      When MaxCompute foreign table permissions is set to RAM Role, you must configure this parameter.

      You need to create a RAM role and enter its ARN here. For information about the required permissions for the RAM role, see Hologres external tables.

      Location

      The mapping address between the Hologres table and the MaxCompute table. This is automatically generated and cannot be modified.

      Lifecycle

      Set the lifecycle for the target external table.

      Field

      Configure the field names and MaxCompute data types of the target external table as needed.

      Note

      MaxCompute and Hologres have different data types, and some types cannot be mapped. For details, see Data type mapping between MaxCompute and Hologres.

  4. Click the Run button on the top toolbar to complete the Hologres table-level metadata mapping.
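Because a table-level mapping is not refreshed automatically, you may want a quick way to tell when the Hologres table has drifted from the external table. The following sketch compares column names only. The function names and sample columns are hypothetical, and how you obtain the two column lists is outside its scope.

```python
def column_drift(hologres_columns, external_table_columns):
    """Compare column names of the source table and the mapped table."""
    source = set(hologres_columns)
    mapped = set(external_table_columns)
    return {
        "missing_in_external_table": sorted(source - mapped),
        "no_longer_in_hologres": sorted(mapped - source),
    }


def needs_remapping(hologres_columns, external_table_columns):
    """True if any column was added to or removed from the source table."""
    drift = column_drift(hologres_columns, external_table_columns)
    return any(drift.values())


drift = column_drift(["id", "note", "created_at"], ["id", "note"])
print(drift)
print(needs_remapping(["id", "note", "created_at"], ["id", "note"]))
```

Columns that were skipped during mapping because their type could not be mapped also appear as missing in the external table, so treat that list as a prompt for review rather than a definitive signal. If remapping is needed, delete the external table and create the metadata mapping again.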

Next steps

  • Under Data Directory > MaxCompute, you can view mapped external schemas or external tables under a specified schema.

  • You can query data from a Hologres external table in a Data Studio > MaxCompute SQL node.
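As a minimal sketch of such a query, the following helper assembles a SELECT statement that addresses the mapped table by its three-part name. It assumes that schema-level syntax is enabled for the project or session. The function name and the project, schema, and table names are hypothetical.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_query(project, schema, table, columns=("*",), limit=10):
    """Build a SELECT statement for a mapped external table.

    Identifiers are validated so that the helper cannot be used to
    splice arbitrary SQL into the statement.
    """
    for part in (project, schema, table):
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"invalid identifier: {part!r}")
    for column in columns:
        if column != "*" and not _IDENTIFIER.fullmatch(column):
            raise ValueError(f"invalid column name: {column!r}")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")

    column_sql = ", ".join(columns)
    return f"SELECT {column_sql} FROM {project}.{schema}.{table} LIMIT {limit};"


# Hypothetical names, for illustration only.
print(build_query("demo_project", "holo_ext", "orders", columns=("id", "note")))
```

You can paste the resulting statement into a MaxCompute SQL node to read the Hologres data without importing it.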