Dataphin lets you connect multiple scheduling clusters, manage their resources centrally, and assign them to custom resource groups per tenant, which addresses cross-region data transfer and resource isolation challenges.
Background information
Each tenant has a **default scheduling cluster** (the **Dataphin cluster**). Metadata warehouse tenants can also register custom clusters for task scheduling and control which tenants access each cluster. Custom clusters avoid the low security, high bandwidth costs, and low transfer efficiency of cross-region public-network data transmission.
For example, if Dataphin is deployed on-premises and you need to move data between two cloud databases in the same region: create a K8s cluster in Container Service (CC), register it with Dataphin, create a custom resource group under it, and select that group when building an integration task. Data then flows within the same region without routing through the Dataphin cluster.
Limits
- Only deployments on the latest architecture support scheduling cluster management. Contact the product O&M team for details.
- You can register up to **5** custom clusters (excluding the default Dataphin cluster).
Permissions
Only global roles with **resource configuration management permission** can manage scheduling clusters. Roles with **system settings viewing permission** can view settings.
Manage Scheduling Clusters
- In the top menu bar, click Management Hub > System Settings.
- In the left navigation pane, click Tenant Settings > Resource Settings.
- On the Resource Settings page, click the Scheduling Cluster Management tab. The list shows clusters available to the current tenant (the default cluster by default).
  The list shows each cluster's Scheduling Cluster Name/ID, Owner, Total Resources, Status, Description, Last Updated By/Time, and available operations.
  - Total Resources: Total available resources in the current cluster.
  - Status: The scheduling cluster status includes **Waiting for Resource Reporting**, **Waiting Timeout**, **Normal**, and **Abnormal**. For more information, see Scheduling Cluster Status.
- (Optional) Filter clusters by name, owner, or status.
- Perform the following operations on clusters in the list.
Note: The default cluster is a system cluster and supports viewing only.
| Operation | Description |
| --- | --- |
| Edit | Click the corresponding icon in the Actions column. In the Edit Scheduling Cluster Basic Information dialog box, modify Cluster Basic Information, MaxCompute Connection Configuration, and Metric Collection Configuration. For parameter details, see Edit Registered Scheduling Clusters. |
| Cluster Connection Configuration Guide | Click the corresponding icon in the Actions column. In the Cluster Connection Configuration Guide dialog box, view the connection and authorization setup for custom clusters. Only successfully connected clusters can create custom resource groups. For detailed steps, see How Dataphin Connects to Data Sources in Alibaba Cloud VPC by Registering Scheduling Clusters. |
| Delete | Click the corresponding icon in the Actions column to delete a cluster that has no custom resource groups. Important: After deletion, the Agent application in the target cluster stops and cannot be recovered. **Contact the cluster owner** to delete the corresponding Pod. Deletion command: `sh uninstall.sh` |
Edit Registered Scheduling Clusters
| Section | Parameter | Description |
| --- | --- | --- |
| Cluster Basic Information | Cluster Name, Owner, Description | Same parameters as the creation operation. See Register Scheduling Clusters. |
| MaxCompute Connection Configuration | Custom Endpoint | Connection configuration for the current cluster to access MaxCompute compute sources. Defaults to the configuration in **Management Center** > **Compute Settings**. After enabling, a dedicated connection address is added for this cluster. If the cluster can reach MaxCompute's VPC Endpoint, prefer the VPC address. |
| | Cluster Region | Select the cluster's region. Options match those in **Management Center** > **Compute Settings** > **Region**. |
| | Network Connection Method | Select Alibaba Cloud VPC Network or Public Network Access. Note: Available only when the cluster's region is **Beijing**, **Shanghai**, **Shenzhen**, **Hangzhou**, or **Chengdu**. You must select an option **different** from the one in **Management Center** > **Compute Settings**. For example, if Compute Settings uses public network, only Alibaba Cloud VPC network is available here. |
| | Connection Endpoint | |
| Metric Collection Configuration | Metric Collection | Collects cluster metrics through Prometheus's HTTP API. Disabled by default. After enabling, view resource consumption trends in **O&M** > **Scheduling Resource Dashboard**. |
| | Cluster Type | Select Alibaba Cloud ACK or Other. |
| | Prometheus HTTP API | Enter the Prometheus HTTP API address. |
| | Authentication Type | For Alibaba Cloud ACK, choose No Authentication, Token Authentication, or AccessKey Authentication. For Other, choose No Authentication or Token Authentication. Token authentication requires a token. AccessKey authentication requires an AccessKey ID and AccessKey Secret. |
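
To make the Metric Collection settings more concrete, the following minimal sketch checks a cluster type and authentication type pair against the options listed above, then builds (without sending) an instant-query request for a Prometheus HTTP API address. The `/api/v1/query` path is the standard Prometheus instant-query endpoint. The `Bearer` header scheme, the example address, the `up` query, and the helper names `check_auth_type` and `build_query_request` are assumptions for illustration and are not taken from Dataphin. AccessKey authentication is left out because its signing scheme is not described here.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Authentication types offered per cluster type, as listed in the table above.
ALLOWED_AUTH = {
    "Alibaba Cloud ACK": {
        "No Authentication",
        "Token Authentication",
        "AccessKey Authentication",
    },
    "Other": {"No Authentication", "Token Authentication"},
}


def check_auth_type(cluster_type: str, auth_type: str) -> None:
    """Raise ValueError if the pair is not one of the documented options."""
    allowed = ALLOWED_AUTH.get(cluster_type)
    if allowed is None:
        raise ValueError(f"unknown cluster type: {cluster_type!r}")
    if auth_type not in allowed:
        raise ValueError(f"{auth_type!r} is not available for {cluster_type!r}")


def build_query_request(
    api_address: str, promql: str, token: Optional[str] = None
) -> Request:
    """Build a GET request for the Prometheus instant-query endpoint.

    The Bearer scheme is an assumption; the real scheme depends on how the
    Prometheus endpoint is secured.
    """
    query_string = urlencode({"query": promql})
    url = f"{api_address.rstrip('/')}/api/v1/query?{query_string}"
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    return Request(url, headers=headers, method="GET")


# Hypothetical address and token, for illustration only.
check_auth_type("Other", "Token Authentication")
req = build_query_request(
    "http://prometheus.example.internal:9090", "up", token="my-token"
)
print(req.full_url)
```

Building the request separately from sending it makes it possible to inspect the URL and headers before pointing the sketch at a real endpoint.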
Scheduling Cluster Status
Scheduling cluster statuses include **Waiting for Resource Reporting**, **Waiting Timeout**, **Normal**, and **Abnormal**.
Only **Normal** clusters can create custom resource groups. If a cluster changes to **Abnormal** after a resource group is created, that group becomes unusable.
| Status | Description |
| --- | --- |
| Waiting for Resource Reporting | The cluster has been registered but its connection configuration is not yet complete, or the configuration is complete but Dataphin has not received resource reports from the cluster. For more information, see How Dataphin Connects to Data Sources in Alibaba Cloud VPC Through Registered Scheduling Clusters. |
| Waiting Timeout | The cluster has not reported resources within **2 hours** after registration. Contact the cluster owner to confirm whether the Agent is deployed and whether the cluster has available machines. |
| Normal | The cluster is registered, its connection is configured, and Dataphin continuously receives resource reports from it. |
| Abnormal | A previously Normal cluster has been unresponsive for a period. Check whether the cluster's Agent is working, or contact the cluster owner to verify that machines are available. |
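
The status rules above can be read as a small decision procedure. The sketch below is one possible model of it, not Dataphin's implementation. The 2-hour timeout comes from the table. The `silence_threshold` parameter is hypothetical, because the documentation only says a Normal cluster becomes Abnormal after being unresponsive "for a period". The function and argument names are likewise illustrative.

```python
from datetime import datetime, timedelta
from typing import Optional

# From the table: no resource report within 2 hours after registration.
REPORT_TIMEOUT = timedelta(hours=2)


def cluster_status(
    registered_at: datetime,
    last_report_at: Optional[datetime],
    now: datetime,
    silence_threshold: timedelta,
) -> str:
    """Return the status name implied by the rules in the table.

    silence_threshold is a hypothetical value; the real period after which
    a silent cluster is marked Abnormal is not documented here.
    """
    if last_report_at is None:
        # The cluster has never reported resources.
        if now - registered_at > REPORT_TIMEOUT:
            return "Waiting Timeout"
        return "Waiting for Resource Reporting"
    if now - last_report_at > silence_threshold:
        return "Abnormal"
    return "Normal"


def can_create_resource_group(status: str) -> bool:
    # Only Normal clusters can create custom resource groups.
    return status == "Normal"


now = datetime(2024, 1, 1, 12, 0)
threshold = timedelta(minutes=10)  # assumed value for illustration
status = cluster_status(now - timedelta(hours=3), None, now, threshold)
print(status, can_create_resource_group(status))
```

Passing `now` explicitly keeps the function deterministic, which makes the rules easy to check against fixed timestamps.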