ACK One lets you connect external Kubernetes clusters, such as those in your data center or on a third-party public cloud, to Container Service for Kubernetes (ACK) as registered clusters for unified management, so that you can build and operate a hybrid cloud architecture. This topic outlines the key features and use cases of registered clusters.
Features
If you manage Kubernetes clusters across different environments, such as ACK clusters, self-managed Kubernetes clusters in your data center, or clusters on a third-party public cloud, you can use ACK One to register these clusters for unified management. Consider using registered clusters if your hybrid or multi-cloud architecture has the following requirements:
- Hybrid cloud elasticity: Elastically scale your self-managed Kubernetes clusters by adding cloud-based resources, such as Elastic Compute Service (ECS) instances, physical servers, or serverless resources like Elastic Container Instance (ECI). The ack-co-scheduler provides flexible scaling policies for managing resources across your data center and the cloud. With these policies, you can set scale-out priorities, scale in on demand, distribute pod replicas proportionally, and elastically scale GPU-based node pools.
- Consistent operational experience with ACK: Manage all your Kubernetes clusters, whether on Alibaba Cloud or in your data center, from a single console with unified security governance. You can manage clusters and applications, centralize logs, monitoring, and alerts, and apply consistent authorization policies using Alibaba Cloud accounts, RAM users, and RAM roles.
- AI and big data capabilities: Improve computing efficiency by 30% to 40% with topology-aware CPU scheduling and NUMA awareness for mainstream servers. Increase GPU resource utilization by up to 300% through GPU sharing and scheduling. Scale heterogeneous resources flexibly with unified management across cloud and on-premises environments. Accelerate data access by up to 10x and reduce bandwidth usage by 90% by using Fluid to unify storage access through a distributed cache in the hybrid cloud.
- Backup and disaster recovery: An integrated cloud solution for backup, recovery, and migration provides disaster recovery for both data and applications, significantly improving business continuity.
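The proportional distribution of pod replicas mentioned under hybrid cloud elasticity can be pictured with a small sketch. The following Python example is illustrative only and does not show how ack-co-scheduler is implemented or configured. The function name `split_replicas`, the pool names, and the integer weights are all hypothetical. It splits a replica count across resource pools according to a ratio and uses the largest-remainder method so that the parts always add up to the requested total.

```python
def split_replicas(total, weights):
    """Split `total` replicas across pools in proportion to integer `weights`.

    Illustrative sketch only (hypothetical helper, not an ACK API).
    Uses the largest-remainder method so the parts always sum to `total`.
    """
    weight_sum = sum(weights.values())
    if total < 0 or weight_sum <= 0:
        raise ValueError("total must be >= 0 and weights must sum to > 0")

    # Integer quotient and remainder of each pool's exact share.
    shares = {pool: divmod(total * w, weight_sum) for pool, w in weights.items()}
    counts = {pool: quotient for pool, (quotient, _) in shares.items()}

    # Hand the replicas lost to rounding to the pools with the largest remainders.
    leftover = total - sum(counts.values())
    by_remainder = sorted(shares, key=lambda pool: shares[pool][1], reverse=True)
    for pool in by_remainder[:leftover]:
        counts[pool] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical 2:1 ratio between a data center pool and a cloud pool.
    print(split_replicas(10, {"data-center": 2, "cloud-ecs": 1}))
```

With a 2:1 ratio and 10 replicas, the exact shares are 6.67 and 3.33, so the data center pool receives the one replica left over after rounding down.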
Use cases
Use case 1: Hybrid cloud with registered clusters
Description
- Self-managed clusters in data centers: Connect cluster networks to share resources between your on-premises and cloud environments.
- On-demand scaling of cloud resources and applications: During peak business hours, rapidly scale out resources in the cloud and direct a portion of your traffic to the cloud.
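The peak-hour pattern in this use case can be sketched as priority-based placement: replicas fill on-premises capacity first, and only the overflow goes to the cloud. The Python example below is a minimal sketch under assumptions. The function `place_replicas`, the pool names, and the capacities are hypothetical and do not represent an ACK One API or the actual scheduling logic.

```python
def place_replicas(desired, pools):
    """Place `desired` replicas on pools in priority order.

    `pools` is a list of (name, capacity) tuples, highest priority first.
    A capacity of None means the pool is treated as unbounded (for example,
    serverless cloud resources). Illustrative sketch, not an ACK API.
    """
    placement = {}
    remaining = desired
    for name, capacity in pools:
        take = remaining if capacity is None else min(remaining, capacity)
        placement[name] = take
        remaining -= take
    if remaining > 0:
        raise ValueError("not enough capacity for the desired replicas")
    return placement


if __name__ == "__main__":
    # Hypothetical pools: 8 replicas fit on-premises, the cloud absorbs the rest.
    pools = [("data-center", 8), ("cloud", None)]
    print(place_replicas(12, pools))  # peak hours: overflow goes to the cloud
    print(place_replicas(5, pools))   # off-peak: cloud replicas drop to zero
```

Recomputing the placement with a lower replica count removes the cloud replicas first, which corresponds to scaling in cloud resources on demand after the peak.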
Use case 2: Consistent experience for on-premises clusters
Description
- Consistent operational experience: Extend the unified operational capabilities of ACK to clusters in your data centers and on third-party public clouds.
- Enhanced observability: Collect logs, monitoring data, and events in the same way as for clusters in the cloud.
- Improved security: Enable auditing, security inspection, node risk detection, and policy governance with a single click.
- Microservice governance: Use Microservices Engine (MSE) and Service Mesh (ASM) to govern your microservices.
Use case 3: Data disaster recovery with registered clusters
Description
- Application migration to the cloud: Provides consistent application backups and recovery in seconds across regions and data centers to help you quickly migrate your business applications to the cloud.
- Data disaster recovery: Provides stateful application backups across regions and data centers with support for configurable backup and recovery policies. Continuously back up data to the cloud for disaster recovery to improve protection against ransomware.
- Business disaster recovery: Provides geo-redundant and scheduled backup capabilities for applications and data across regions and data centers.
- Active geo-redundancy: Provides a Kubernetes-compatible solution to quickly build a high-availability disaster recovery system with three data centers across two regions.
Use case 4: Co-scheduling for AI and big data
Description
- AI algorithm development: Provides comprehensive management of tasks, quotas, and observability.
- AI training: Supports topology-aware scheduling and a rich set of task scheduling policies to improve training efficiency. The compute-storage separation architecture significantly speeds up distributed training. Cross-cluster job scheduling is also supported, with optimized multi-cluster distribution and scheduling for jobs such as TensorFlow, Spark, and CronJob.
- AI inference: Provides GPU sharing, which can increase resource utilization by approximately 300%. Heterogeneous resources can be scaled elastically, with unified elastic scheduling management for both cloud and on-premises environments.
- Intelligent CPU scheduling: Provides intelligent CPU scheduling and NUMA awareness for bare metal servers.
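To make the GPU sharing figure concrete, the following Python sketch works through the arithmetic behind a 300% increase. The numbers are hypothetical and chosen only for illustration: each inference pod is assumed to use a quarter of one GPU, and sharing is assumed to pack four such pods onto the same GPU. The function names are not part of any ACK API.

```python
def gpu_utilization(pods_per_gpu, fraction_per_pod):
    """Utilization of one GPU, capped at 100% (1.0). Hypothetical model."""
    return min(1.0, pods_per_gpu * fraction_per_pod)


def increase_pct(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


if __name__ == "__main__":
    exclusive = gpu_utilization(pods_per_gpu=1, fraction_per_pod=0.25)  # one pod per GPU
    shared = gpu_utilization(pods_per_gpu=4, fraction_per_pod=0.25)     # four pods share a GPU
    print(f"exclusive: {exclusive:.0%}, shared: {shared:.0%}")
    print(f"increase: {increase_pct(exclusive, shared):.0f}%")
```

Under these assumptions, utilization rises from 25% to 100%, which is four times the original value and therefore a 300% increase. Actual gains depend on how much of a GPU each workload really uses.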