Create a MaxCompute data source

Last updated: 2026-03-26 04:02:01

Create a MaxCompute data source to enable Dataphin to read data from or write data to MaxCompute. This topic describes how to create a MaxCompute data source.

Background information

MaxCompute is a big data computing service from Alibaba Cloud. It is an enterprise-level cloud data warehouse provided as Software as a Service (SaaS) for data analytics scenarios. It provides a fast, fully managed online data warehouse service built on a serverless architecture. This removes the resource scalability and elasticity limits of traditional data platforms and minimizes your operations and maintenance (O&M) costs, so you can analyze and process large amounts of data cost-effectively and efficiently. For more information, see What is MaxCompute.

Limits

MaxCompute data sources do not support access to external MaxCompute projects. For more information, see MaxCompute project overview.

Permission description

Only super administrators, data source administrators, domain architects, project administrators, and custom global roles with the Create Data Source permission can create data sources.

Only super administrators, system administrators, and custom global roles with the Compute Source Management - Create (RAM Role Proxy) permission can use a RAM role proxy when creating a MaxCompute data source.

Procedure

  1. On the Dataphin home page, in the top menu bar, click Management Hub > Data Source Management.

  2. On the Data Source page, click +Create Data Source.

  3. On the Create Data Source page, in the Big Data Storage section, select MaxCompute.

    If you have recently used MaxCompute, you can also select it from the Recently Used section, or enter MaxCompute in the search box to find it quickly.

  4. On the Create MaxCompute Data Source page, configure the connection parameters.

    1. Configure the basic information for the data source.

      Parameter

      Description

      Data Source Name

      Enter a name for the data source. The name must meet the following requirements:

      • It can contain only Chinese characters, uppercase and lowercase letters, digits, underscores (_), and hyphens (-).

      • The maximum length is 64 characters.

      Data Source Code

      After you configure the data source code, you can access Dataphin data source tables directly in Flink_SQL tasks or by using the Dataphin Java Database Connectivity (JDBC) client. For quick access, use the format data_source_code.table_name or data_source_code.schema.table_name. To switch data sources automatically based on the task execution environment, use the variable format ${data_source_code}.table or ${data_source_code}.schema.table. For more information, see Develop Flink_SQL tasks.

      Important
      • The data source code cannot be modified after it is configured.

      • You can preview data on the object details page in the asset directory and asset checklist only after the data source code is configured.

      • Flink_SQL tasks currently support only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, SelectDB, and GaussDB data warehouse service (DWS) data sources.

      Data Source Description

      Enter a brief description of the data source. The description cannot exceed 128 characters.

      Data Source Configuration

      Select the data source to configure:

      • If your business data source distinguishes between production and development data sources, select Production + Development Data Source.

      • If your business data source does not distinguish between production and development data sources, select Production Data Source.

      Tag

      You can categorize and tag data sources based on tags. For information about how to create tags, see Manage data source tags.
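The naming rules and reference formats in the table above can be checked before you fill in the form. The following is a minimal sketch, not part of Dataphin: the function names are hypothetical, "Chinese characters" is approximated by the CJK Unified Ideographs range, and an empty name is assumed to be invalid.

```python
import re
from typing import Optional

# Rules from the table above: Chinese characters, letters, digits,
# underscores, and hyphens only; at most 64 characters.
# Assumption: "Chinese characters" means U+4E00..U+9FFF, and the name
# must contain at least one character.
NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")
MAX_DESCRIPTION_LENGTH = 128  # from the Data Source Description row


def is_valid_name(name: str) -> bool:
    """Return True if the whole name satisfies the documented rules."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(description: str) -> bool:
    """Return True if the description does not exceed 128 characters."""
    return len(description) <= MAX_DESCRIPTION_LENGTH


def table_reference(code: str, table: str,
                    schema: Optional[str] = None,
                    as_variable: bool = False) -> str:
    """Build a table reference from a data source code.

    With as_variable=True the code is wrapped as ${code}, the form that
    switches data sources based on the task execution environment.
    """
    prefix = "${" + code + "}" if as_variable else code
    parts = [prefix, schema, table] if schema else [prefix, table]
    return ".".join(parts)
```

For a hypothetical data source code mc_orders, table_reference("mc_orders", "t_order") yields the fixed form and table_reference("mc_orders", "t_order", as_variable=True) yields the environment-aware form.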

    2. Configure the connection parameters between the data source and Dataphin.

      If you select Production + Development Data Source, configure the connection information for both the production and development data sources. If you select Production Data Source, configure the connection information for the production data source only.

      Note

      Typically, production and development data sources are configured separately to isolate the development environment from the production environment. This prevents development activities from affecting the production data source. However, Dataphin also supports configuring them as the same data source with identical parameter values.

      Parameter

      Description

      Endpoint

      The endpoint of MaxCompute. Select the endpoint based on your network environment and connection method.

      For information about how to obtain the endpoint, see Endpoint.

      Authentication Type

      Select RAM Role Proxy or AccessKey Authentication.

      If you select RAM Role Proxy, the system checks whether the AliyunServiceRoleForDataphinOnOdps role exists for the current account. If the role does not exist, an error is reported. You can follow the on-screen instructions to grant the required permissions.

      RAM Role

      The service-linked role is AliyunServiceRoleForDataphinOnOdps.

      Note

      This parameter is displayed only when you set Authentication Type to RAM Role Proxy.

      Access ID, Access Key

      The AccessKey ID and AccessKey secret of the account where the MaxCompute data source is located.

      For information about how to obtain them, see Obtain an AccessKey.

      Note

      These parameters can be configured only when you set Authentication Type to AccessKey Authentication.

      Project Name

      This is the MaxCompute project name, not the DataWorks workspace name.

      You can log on to the MaxCompute console, switch to the appropriate region in the upper-left corner, and then view the specific MaxCompute Project Name on the Project Management tab.
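Before entering AccessKey parameters in Dataphin, it can help to confirm outside Dataphin that the endpoint, project name, and AccessKey pair work together. The following is a minimal sketch that assumes the PyODPS package (module odps) is installed. The class and function names are hypothetical, and this is not how Dataphin itself tests the connection.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class MaxComputeConnection:
    """Values entered for AccessKey Authentication (hypothetical holder)."""
    endpoint: str
    project_name: str  # MaxCompute project name, not the DataWorks workspace name
    access_id: str
    access_key: str


def missing_fields(conn: MaxComputeConnection) -> List[str]:
    """Return the names of parameters that are empty or whitespace only."""
    return [f.name for f in fields(conn) if not getattr(conn, f.name).strip()]


def check_connection(conn: MaxComputeConnection) -> bool:
    """Return True if the project is visible with the given credentials.

    PyODPS is imported lazily so the rest of the sketch runs without it.
    """
    from odps import ODPS

    client = ODPS(conn.access_id, conn.access_key,
                  conn.project_name, endpoint=conn.endpoint)
    return client.exist_project(conn.project_name)
```

Call missing_fields first to catch blank parameters, then check_connection from a machine that can reach the endpoint. Load the AccessKey pair from a secure store or environment variables rather than hard-coding it.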

  5. Select a Default Resource Group. This resource group is used to run tasks related to the current data source, such as database SQL queries, offline full database migrations, and data previews.

  6. Click Test Connection, or click OK to save the configuration and create the MaxCompute data source.

    Click Test Connection to verify that the data source can connect to Dataphin. If you click OK, the system automatically tests the connection for each configured data source. The data source can still be created even if all connection tests fail.

    If you select RAM Role Proxy as the authentication type, clicking Test Connection or OK triggers the following process. If the connection has not been tested before, the system verifies the RAM role, creates a MaxCompute project role, and adds the RAM role to the MaxCompute project with the required permissions, and then runs the connection test. If the connection has already been tested, the system runs the connection test directly.

    • Verify RAM role: The system checks if the service-linked role (SLR) named AliyunServiceRoleForDataphinOnOdps exists in your account. If the role exists, this step is skipped. If the role does not exist, the verification fails. You can then click Authorize Now and follow the on-screen instructions to grant the permissions.

    • Create MaxCompute project role: The system uses the SLR to call the MaxCompute OpenAPI and create a project role in the selected MaxCompute project.

    • Add RAM role to MaxCompute project and grant permissions: The system uses the SLR to add the AliyunServiceRoleForDataphinOnOdps role to the selected MaxCompute project and grant it the required permissions.
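The order of operations described above for RAM Role Proxy can be summarized as a small decision function. This is only an illustrative sketch of the documented sequence. The function and step names are hypothetical and do not correspond to any Dataphin or MaxCompute API.

```python
from typing import List


def ram_proxy_steps(connection_tested: bool, slr_exists: bool) -> List[str]:
    """Return the ordered steps triggered by Test Connection or OK.

    connection_tested: whether this connection has been tested before.
    slr_exists: whether AliyunServiceRoleForDataphinOnOdps exists in the account.
    """
    if connection_tested:
        # Already tested: only the connection test runs.
        return ["test_connection"]
    if not slr_exists:
        # Verification fails; the user clicks Authorize Now and retries.
        return ["verify_ram_role_failed"]
    return [
        "verify_ram_role",
        "create_project_role",
        "add_ram_role_to_project",
        "test_connection",
    ]
```

The sketch makes one point explicit: the role setup steps run only on the first test of a connection, and a missing service-linked role stops the process before any change is made to the MaxCompute project.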

    Important

    If the connection test fails, you can troubleshoot the issue by reviewing common network connectivity problems. For more information, see Network connectivity solutions.
