To access Hologres data from MaxCompute, use the metadata mapping feature in the data catalog to create MaxCompute external tables. This method allows you to read Hologres data directly without importing it, saving compute and storage resources while enabling flexible and efficient data processing.
Background
MaxCompute allows you to query and analyze data in external systems like Hologres by using external tables. With a PostgreSQL JDBC driver and RAM role authorization, you can query external data directly without importing it into MaxCompute.
DataWorks simplifies this process through metadata mapping, which leverages the following MaxCompute capabilities:
-
Schema-level metadata mapping: Uses the MaxCompute external schema feature.
-
Table-level metadata mapping: Uses the MaxCompute external table feature.
Prerequisites
-
A MaxCompute project and a Hologres instance have been created.
-
The Hologres instance has been added to the data catalog.
-
Schema-level metadata mapping requires a MaxCompute external data source. You must create a Hologres-type external data source in MaxCompute.
Important: When you create a MaxCompute external data source:
-
For Host, enter the classic network address. VPC addresses are not supported.
-
For Authentication Method, only the RAM role method is supported. The ExecuteWithUserAuth method is not supported.
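The two constraints above can be checked before the data source is created. The following is a minimal sketch and not part of DataWorks or MaxCompute. The function name, the `"RAMRole"` string, and the heuristic that a VPC endpoint contains `-vpc` in its host name are all assumptions made for illustration.

```python
def check_external_data_source(host: str, auth_method: str) -> list[str]:
    """Return a list of problems found in a planned Hologres external data source."""
    problems = []
    # Assumption: a VPC endpoint can be recognized by "-vpc" in the host name.
    if "-vpc" in host.lower():
        problems.append("Host looks like a VPC address; use the classic network address.")
    # Only the RAM role method is supported; ExecuteWithUserAuth is not.
    if auth_method != "RAMRole":
        problems.append(
            f"Authentication method {auth_method!r} is not supported; use the RAM role method."
        )
    return problems


if __name__ == "__main__":
    # Hypothetical host name, used only to exercise the check.
    for problem in check_external_data_source("demo-cn-hangzhou-vpc.example.com", "ExecuteWithUserAuth"):
        print(problem)
```

An empty list means that neither constraint is violated under the stated assumptions.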
Access control
DataWorks determines the access identity based on the source of the MaxCompute project and verifies permissions accordingly.
| Access identity source | MaxCompute permission check | Hologres permission check |
Limits
-
Only data in Hologres internal databases can be mapped to MaxCompute.
-
For limits on using Hologres external tables in MaxCompute, see Hologres external tables.
-
MaxCompute and Hologres use different data types. Some Hologres types cannot be mapped and are automatically skipped during the mapping process. Carefully review the Data type mapping between MaxCompute and Hologres.
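The skip behavior described in the last limit can be pictured with a small sketch. The mapping dictionary below is an illustrative subset chosen for this example, not the official mapping, and `map_columns` is a hypothetical helper. Consult Data type mapping between MaxCompute and Hologres for the authoritative list.

```python
# Illustrative subset only; the official data type mapping is the source of truth.
HOLOGRES_TO_MAXCOMPUTE = {
    "integer": "INT",
    "bigint": "BIGINT",
    "boolean": "BOOLEAN",
    "text": "STRING",
    "double precision": "DOUBLE",
}


def map_columns(columns):
    """Split (name, hologres_type) pairs into mapped columns and skipped column names."""
    mapped, skipped = [], []
    for name, holo_type in columns:
        mc_type = HOLOGRES_TO_MAXCOMPUTE.get(holo_type.lower())
        if mc_type is None:
            # A type with no counterpart is skipped, so the column is absent from the mapping.
            skipped.append(name)
        else:
            mapped.append((name, mc_type))
    return mapped, skipped


if __name__ == "__main__":
    # "some_unmapped_type" is a placeholder, not a real Hologres type.
    mapped, skipped = map_columns(
        [("id", "bigint"), ("title", "text"), ("payload", "some_unmapped_type")]
    )
    print(mapped)
    print(skipped)
```

Running a check like this against the source table's column list before mapping shows which columns would be missing from the external table.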
Entry point
-
Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a desired region. Find the desired workspace and choose in the Actions column.
-
In the left navigation bar, click the icon to go to Data Directory.
-
In the Hologres directory, locate the schema or table that requires metadata mapping to MaxCompute, right-click it, and select Map Metadata to MaxCompute.
Schema-level metadata mapping
This feature maps the metadata of a Hologres schema to an external schema in MaxCompute. Before you start, we recommend that you read the Data Lakehouse 2.0 User Guide to learn about MaxCompute external data sources and external schemas.
You can map metadata only to an internal MaxCompute project that has the schema feature enabled.
Prepare a Hologres external data source
Schema-level metadata mapping works by creating a MaxCompute external schema that synchronizes metadata from an external data source. Therefore, you must first create a MaxCompute external data source in DataWorks that points to the target Hologres database. You must then associate this data source with the specified MaxCompute project to establish the mapping.
For more information about how to create a MaxCompute external data source, see Data Lakehouse 2.0 User Guide.
When you create the external data source, enter the classic network address for Host. VPC addresses are not supported. For DB, specify the Hologres database whose metadata you want to map.
DataWorks: Configure Hologres schema-level metadata mapping
-
In the Hologres project, find the schema that you want to map to MaxCompute. Right-click the schema name and select Map Metadata to MaxCompute to open the metadata mapping configuration page.
-
Configure the parameters for schema-level mapping.
-
Hologres (source)
Parameter
Description
Source Object Type
The type of object to map to MaxCompute. This is fixed to Hologres Schema.
Source Object Name
The name of the Hologres schema to map to MaxCompute. This is fixed to the currently selected schema.
Format: <hologres_database>.<hologres_schema>.
Note: You must create a MaxCompute external data source of the Hologres type, and set the default Hologres database for the external data source to <hologres_database>. For more information, see MaxCompute: Prepare a Hologres-type external data source.
-
MaxCompute (destination)
Parameter
Description
Project Find Method
Select how to find the MaxCompute project.
-
From DataWorks data source: Select the MaxCompute project through a MaxCompute data source that is bound to the current workspace. If you select this method, you must also select a data source under Data Sources and set External schema name, which names the external schema to create in the target MaxCompute project.
Note: Only users with the O&M or Workspace Administrator role can select production data sources.
-
I have permission: Select the target MaxCompute project from the projects that you are authorized to access under your current Alibaba Cloud primary account. If you select this method, you must also set External schema name to specify the target schema.
Data Sources
When Project Find Method is set to From DataWorks data source, you need to manually select the MaxCompute data source.
Project Name
When Project Find Method is set to I have permission, you need to manually select a MaxCompute project.
External schema name
Specify the name for the MaxCompute external schema to which the metadata from the source Hologres schema is mapped.
External Data Source
Select the MaxCompute external data source that is already connected to the source Hologres database.
Important: When you create a MaxCompute external data source:
-
For Host, enter the classic network address. VPC addresses are not supported.
-
For Authentication Method, only the RAM role method is supported. The ExecuteWithUserAuth method is not supported.
Auth
Automatically populated based on the selected external data source.
Host:port
Database
-
Click the Run button on the top toolbar to complete the Hologres schema-level metadata mapping.
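The Source Object Name format and the requirement that the external data source point to the same Hologres database can be sketched as a small check. Both helper names are hypothetical and exist only for this illustration.

```python
def split_source_object_name(name: str) -> tuple[str, str]:
    """Split '<hologres_database>.<hologres_schema>' into its two parts."""
    database, sep, schema = name.partition(".")
    if not sep or not database or not schema or "." in schema:
        raise ValueError(f"expected <hologres_database>.<hologres_schema>, got {name!r}")
    return database, schema


def matches_data_source(source_object_name: str, data_source_db: str) -> bool:
    """True if the external data source's DB is the database of the source schema."""
    database, _schema = split_source_object_name(source_object_name)
    return database == data_source_db


if __name__ == "__main__":
    # Hypothetical names used only for demonstration.
    print(split_source_object_name("holo_db.public"))
    print(matches_data_source("holo_db.public", "holo_db"))
```

If the check returns `False`, the selected external data source points to a different database than the schema being mapped.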
Table-level metadata mapping
This feature maps a specific Hologres table to a specific table in MaxCompute as an external table. You can specify the path and a custom name for the external table.
-
In the Hologres project, find the Hologres table whose metadata you want to map to MaxCompute. Right-click the table name and select Map Metadata to MaxCompute. The metadata mapping configuration page appears.
-
Configure the parameters for table-level metadata mapping.
-
Hologres (source)
Parameter
Description
Source Object Type
The source object type for table-level mapping. The default value is Hologres Table.
Source Object Name
The source Hologres table to be mapped. This is fixed to the currently selected Hologres table.
-
MaxCompute (destination)
Parameter
Description
Project Find Method
Select how to find the MaxCompute project.
-
From DataWorks data source: Select the MaxCompute project through a MaxCompute data source that is bound to the current workspace. You must also select a data source under Data Sources and specify an External Table name for the external table in the target MaxCompute project. Ensure that the access identity specified for the data source has read and write permissions for the source Hologres table and the target MaxCompute project.
Note: Only users with the O&M or Workspace Administrator role can select production data sources.
-
I have permission: Select the target MaxCompute project from the projects that you are authorized to access under your current Alibaba Cloud primary account. If you select this method, you must also set the External Table name to specify the target external table. Ensure that you have read and write permissions for the source Hologres table and the target MaxCompute project.
Data Sources
When Project Find Method is set to From DataWorks data source, you need to manually select the destination MaxCompute data source.
Project Name
When Project Find Method is set to I have permission, you need to manually select the target MaxCompute project.
Schema
Specify the name of the external schema in the target MaxCompute project to which the metadata from the source Hologres schema is mapped.
External Table
Specify the name of the new external table to be created under the schema of the specified MaxCompute project. Data from the source table is mapped to this table. By default, the name is the same as the Hologres table name.
Note: Creating an external table is a one-time action. Metadata is not automatically refreshed. To refresh the metadata, you must delete the current external table and manually create the metadata mapping again.
MaxCompute foreign table permissions
The authentication and authorization method for the MaxCompute external table after the Hologres metadata is mapped.
-
RamRole: Uses the STS mode for authentication and authorization.
RoleARN
When MaxCompute foreign table permissions is set to RAM Role, you must configure this parameter.
You need to create a RAM role and enter its ARN here. For information about the required permissions for the RAM role, see Hologres external tables.
Location
The mapping address between the Hologres table and the MaxCompute table. This is automatically generated and cannot be modified.
Lifecycle
Set the lifecycle for the target external table.
Field
Configure the MaxCompute fields and MaxCompute data type in the target external table as needed.
Note: MaxCompute and Hologres have different data types, and some types cannot be mapped. For details, see Data type mapping between MaxCompute and Hologres.
-
Click the Run button on the top toolbar to complete the Hologres table-level metadata mapping.
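To make the parameters above more concrete, the sketch below renders the kind of external table statement that a table-level mapping roughly corresponds to. It is an approximation under stated assumptions: the storage handler class, the property names, and the JDBC location layout follow the Hologres external table syntax as commonly documented and should be verified against Hologres external tables. The statement that DataWorks actually generates may differ, and `render_external_table_ddl` is a hypothetical helper.

```python
def render_external_table_ddl(table, columns, role_arn, host_port, database, holo_schema, holo_table):
    """Render an approximate CREATE EXTERNAL TABLE statement for a Hologres source table."""
    cols = ",\n".join(f"  {name} {dtype}" for name, dtype in columns)
    # Location: the mapping address between the Hologres table and the MaxCompute table.
    location = (
        f"jdbc:postgresql://{host_port}/{database}"
        f"?ApplicationName=MaxCompute&currentSchema={holo_schema}&table={holo_table}/"
    )
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n{cols}\n)\n"
        "STORED BY 'com.aliyun.odps.jdbc.JdbcStorageHandler'\n"
        # RoleARN: the RAM role used for STS-based authentication.
        f"WITH SERDEPROPERTIES ('odps.properties.rolearn'='{role_arn}')\n"
        f"LOCATION '{location}'\n"
        "TBLPROPERTIES (\n"
        "  'mcfed.mapreduce.jdbc.driver.class'='org.postgresql.Driver',\n"
        "  'odps.federation.jdbc.target.db.type'='holo'\n"
        ");"
    )


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    print(render_external_table_ddl(
        table="mc_orders",
        columns=[("id", "BIGINT"), ("name", "STRING")],
        role_arn="acs:ram::1234567890:role/example-role",
        host_port="example-host:80",
        database="holo_db",
        holo_schema="public",
        holo_table="orders",
    ))
```

Reading the output next to the configuration page shows where External Table, Field, RoleARN, and Location would each end up in such a statement.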
Next steps
-
Under , you can view mapped external schemas or external tables under a specified schema.
-
You can query data from a Hologres foreign table in a node.
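As a minimal sketch of such a query, the helper below builds a SELECT statement against a mapped table. It assumes a schema-enabled MaxCompute project, where a table is addressed as `project.schema.table`. The function name and all identifiers in the example are hypothetical.

```python
import re

# Conservative identifier check for this sketch: letters, digits, underscores.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_preview_query(project: str, schema: str, table: str, limit: int = 10) -> str:
    """Build a SELECT statement for a mapped Hologres table in a schema-enabled project."""
    for part in (project, schema, table):
        if not _IDENTIFIER.match(part):
            raise ValueError(f"unexpected identifier: {part!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"SELECT * FROM {project}.{schema}.{table} LIMIT {limit};"


if __name__ == "__main__":
    # Hypothetical project, external schema, and table names.
    print(build_preview_query("my_project", "holo_ext_schema", "orders"))
```

The resulting statement can be pasted into a MaxCompute SQL node. The data is read from Hologres at query time rather than from a copy stored in MaxCompute.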