Manage Scheduling Clusters

Dataphin lets you connect multiple scheduling clusters, manage their resources centrally, and assign them to custom resource groups per tenant. This addresses cross-region data transfer and resource isolation challenges.

Background information

Each tenant has a **default scheduling cluster** (the **Dataphin cluster**). Metadata warehouse tenants can also register custom clusters for task scheduling and control which tenants access each cluster. Custom clusters avoid the low security, high bandwidth costs, and low transfer efficiency of cross-region public-network data transmission.

For example, suppose Dataphin is deployed on-premises and you need to move data between two cloud databases in the same region. Create a K8s cluster in Container Service for Kubernetes (ACK), register it with Dataphin, create a custom resource group under it, and select that group when you build an integration task. Data then flows within the same region without routing through the Dataphin cluster.

Limits

  • Only deployments on the latest architecture support scheduling cluster management. Contact the product O&M team for details.

  • You can register up to **5** custom clusters (excluding the default Dataphin cluster).

Permissions

Only global roles with **resource configuration management permission** can manage scheduling clusters. Roles with **system settings viewing permission** can view settings.

Manage Scheduling Clusters

  1. In the top menu bar, click Management Hub > System Settings.

  2. In the left navigation pane, click Tenant Settings > Resource Settings.

  3. On the Resource Settings page, click the Scheduling Cluster Management tab. The list shows the scheduling clusters available to the current tenant. Until custom clusters are registered, only the default cluster appears.

    The list shows each cluster's Scheduling Cluster Name/ID, Owner, Total Resources, Status, Description, Last Updated By/Time, and available operations.

    • Total Resources: Total available resources in the current cluster.

    • Status: The scheduling cluster status includes **Waiting for Resource Reporting**, **Waiting Timeout**, **Normal**, and **Abnormal**. For more information, see Scheduling Cluster Status.

  4. (Optional) Filter clusters by name, owner, or status.

  5. Perform the following operations on clusters in the list.

    Note

    The default cluster is a system cluster and supports viewing only.

    | Operation | Description |
    | --- | --- |
    | Edit | Click the edit icon in the Actions column. In the Edit Scheduling Cluster Basic Information dialog box, modify Cluster Basic Information, MaxCompute Connection Configuration, and Metric Collection Configuration. For parameter details, see Edit Registered Scheduling Clusters. |
    | Cluster Connection Configuration Guide | Click the guide icon in the Actions column. In the Cluster Connection Configuration Guide dialog box, view the connection and authorization setup for custom clusters. Custom resource groups can be created only on clusters that are successfully connected. For detailed steps, see How Dataphin Connects to Data Sources in Alibaba Cloud VPC by Registering Scheduling Clusters. |
    | Delete | Click the delete icon in the Actions column to delete a cluster that has no custom resource groups. **Important:** After deletion, the Agent application in the target cluster stops and cannot be recovered. **Contact the cluster owner** to delete the corresponding Pod. Deletion command: `sh uninstall.sh` |

Edit Registered Scheduling Clusters

| Group | Parameter | Description |
| --- | --- | --- |
| Cluster Basic Information | Cluster Name, Owner, Description | Same as the parameters used when registering a cluster. See Register Scheduling Clusters. |
| MaxCompute Connection Configuration | Custom Endpoint | The connection configuration that the current cluster uses to access MaxCompute compute sources. By default, the cluster uses the configuration in **Management Hub** > **Compute Settings**. After you enable this option, a dedicated connection address is added for this cluster. If the cluster can reach the MaxCompute VPC endpoint, prefer the VPC address. |
| | Cluster Region | Select the region of the cluster. The options match those in **Management Hub** > **Compute Settings** > **Region**. |
| | Network Connection Method | Select Alibaba Cloud VPC Network or Public Network Access. **Note:** This parameter is available only when Cluster Region is **Beijing**, **Shanghai**, **Shenzhen**, **Hangzhou**, or **Chengdu**. You must select an option **different** from the one in **Management Hub** > **Compute Settings**. For example, if Compute Settings uses the public network, only Alibaba Cloud VPC Network is available here. |
| | Connection Endpoint | When Cluster Region is Other, the address defaults to the Endpoint in **Management Hub** > **Compute Settings**, and **you must modify it manually**. When Cluster Region is Beijing, Shanghai, Shenzhen, Hangzhou, or Chengdu, the system generates the Endpoint automatically and it cannot be edited. |
| Metric Collection Configuration | Metric Collection | Collects cluster metrics through the Prometheus HTTP API. Disabled by default. After you enable it, you can view resource consumption trends in **O&M** > **Scheduling Resource Dashboard**. |
| | Cluster Type | Select Alibaba Cloud ACK or Other. |
| | Prometheus HTTP API | Enter the Prometheus HTTP API address. |
| | Authentication Type | For Alibaba Cloud ACK, choose No Authentication, Token Authentication, or AccessKey Authentication. For Other, choose No Authentication or Token Authentication. Token Authentication requires a token. AccessKey Authentication requires an AccessKey ID and an AccessKey Secret. |
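
Before entering the Prometheus HTTP API address, it can help to confirm that the address answers a query. The following is a minimal sketch and not Dataphin's own collection logic. It assumes that the configured address is the Prometheus base URL (without the `/api/v1` path) and that Token Authentication corresponds to a standard `Authorization: Bearer` header. The function names, host name, and token are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def build_query_request(base_url: str, promql: str,
                        token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for the Prometheus instant-query endpoint (/api/v1/query)."""
    query_string = urllib.parse.urlencode({"query": promql})
    url = base_url.rstrip("/") + "/api/v1/query?" + query_string
    request = urllib.request.Request(url, method="GET")
    if token:
        # Assumption: the token is sent as a Bearer token.
        request.add_header("Authorization", f"Bearer {token}")
    return request


def run_query(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `run_query(build_query_request("http://prometheus.example.internal:9090", "up", token="example-token"))` sends the query `up` to a hypothetical address. A response whose `status` field is `success` indicates that the address and token are accepted.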

Scheduling Cluster Status

Scheduling cluster statuses include **Waiting for Resource Reporting**, **Waiting Timeout**, **Normal**, and **Abnormal**.

Note

Custom resource groups can be created only on clusters in the **Normal** status. If a cluster changes to **Abnormal** after a resource group is created, that group becomes unusable.

| Status | Description |
| --- | --- |
| Waiting for Resource Reporting | The cluster is registered but its connection is not yet configured, or the configuration is complete but Dataphin has not received resource reports from the cluster. See How Dataphin Connects to Data Sources in Alibaba Cloud VPC Through Registered Scheduling Clusters. |
| Waiting Timeout | The cluster has not reported resources within **2 hours** after registration. Contact the cluster owner to confirm whether the Agent is deployed and whether the cluster has available machines. |
| Normal | The cluster is registered, its connection is configured, and Dataphin continuously receives resource reports from it. |
| Abnormal | A previously Normal cluster has been unresponsive for a period of time. Check whether the cluster's Agent is working, or contact the cluster owner to verify that machines are available. |
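
The status rules above can be sketched as a small function. This is an illustration of the described transitions, not Dataphin's implementation. The 2-hour timeout comes from the table. The `STALE_AFTER` threshold is an assumed placeholder, because the source says only "for a period", and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

REPORT_TIMEOUT = timedelta(hours=2)   # from the status table
STALE_AFTER = timedelta(minutes=10)   # assumption: the real period is not documented


def cluster_status(registered_at: datetime,
                   last_report_at: Optional[datetime],
                   now: datetime) -> str:
    """Derive a scheduling cluster status from registration and report times."""
    if last_report_at is None:
        # No resource report has ever arrived.
        if now - registered_at >= REPORT_TIMEOUT:
            return "Waiting Timeout"
        return "Waiting for Resource Reporting"
    if now - last_report_at > STALE_AFTER:
        # The cluster reported before but has since gone quiet.
        return "Abnormal"
    return "Normal"


def can_create_resource_group(status: str) -> bool:
    """Custom resource groups are allowed only on Normal clusters."""
    return status == "Normal"
```

The sketch treats a cluster that has never reported separately from one that stopped reporting, which mirrors the difference between Waiting Timeout and Abnormal in the table.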