Distributed Cloud Container Platform for Kubernetes (ACK One) manages Kubernetes clusters across on-premises data centers, third-party clouds, and Alibaba Cloud regions from a unified control plane.
Cluster types
ACK One provides three cluster types:
| Cluster type | What it is |
|---|---|
| Registered clusters | Any external Kubernetes cluster, whether on-premises or in a third-party cloud, connected to the ACK console for centralized management and Alibaba Cloud integration |
| Multi-cluster Fleet instances | Unified control plane that groups multiple Kubernetes clusters for coordinated application distribution, traffic management, and monitoring |
| Kubernetes clusters for distributed Argo workflows | Serverless clusters built on Elastic Container Instance (ECI) for running Argo Workflows at scale, with cost-optimized and event-driven execution |
Capabilities
Manage clusters from one place
- Connect clusters from any provider or location to a single console and API surface.
- Centrally enforce security policies, access controls, and configuration inspections across all clusters.
- View health and cost metrics for all clusters in one global monitoring dashboard.
Scale resources on demand
- Burst workloads from on-premises clusters to Alibaba Cloud by adding Elastic Compute Service (ECS) instances or ECI to external Kubernetes clusters.
- Use the ACK scheduler for advanced scheduling: gang scheduling, topology-aware CPU scheduling, and ECI-based scheduling.
- Accelerate data access and reduce bandwidth usage with ACK Fluid distributed cache in compute-storage decoupled environments.
- Scale cloud resources automatically to handle traffic fluctuations, or set scheduled scaling to improve cost-effectiveness.
Protect and recover applications
- Back up and restore applications and data across regions or from data centers to the cloud with the Backup center. No additional setup is required.
- Set automated backup and restoration policies to keep applications protected.
- Build active geo-redundancy with three data centers across two zones for business continuity.
Distribute applications across multiple clusters
- Host open source ArgoCD in ACK One and distribute multi-cluster applications through GitOps.
- Apply different configurations per cluster while deploying from the same Git repository.
- Run jobs across multiple clusters on a schedule.
Manage traffic at the fleet level
- Route north-south traffic across clusters with MSE cloud-native gateways.
- Create multi-cluster Services to manage east-west traffic.
- Configure Global Ingresses with Layer 7 routing rules based on weights and pod replica counts, with automatic fallback.
Run AI and big data workloads
- Deploy Alibaba Cloud-verified enterprise components in Kubernetes clusters to enhance security, scheduling efficiency, and AI and big data computing.
- Manage AI training jobs, resource quotas, and observability from a unified interface.
- Improve GPU utilization by approximately 300% with GPU sharing.
- Accelerate distributed training with compute-storage decoupling and cross-cluster scheduling for Spark, Kubernetes, and TensorFlow jobs.
- Enable intelligent CPU scheduling and non-uniform memory access (NUMA) awareness on ECS Bare Metal instances.
Run large-scale workflows cost-effectively
- Pay only for data plane usage. Argo Workflows control planes are free of charge.
- Use preemptible instances to reduce compute costs further.
- Automatically adjust resource specifications through load-aware resource prediction.
- Handle thousands of concurrent workflows and tens of thousands of computing tasks.
- Trigger workflows automatically from Git, Message Service (MNS), or Object Storage Service (OSS) events.
- Achieve more than 20 GB/s aggregated read bandwidth with distributed cache across regions.
Use cases
Connect on-premises clusters and scale to the cloud
Register on-premises clusters to connect data centers to Alibaba Cloud and burst workloads to the cloud during traffic peaks.
Extend on-premises clusters with Alibaba Cloud services
Add Alibaba Cloud observability, security, and microservice governance to clusters in data centers or third-party clouds:
- Observability: Collect logs, metrics, and events with consistent O&M across environments.
- Security: Enable auditing, security inspection, node risk detection, and policy governance.
- Microservice governance: Use Service Mesh (ASM) and Microservices Engine (MSE) for traffic control and service governance.
Implement disaster recovery across hybrid cloud, regions, or zones
- Back up stateful applications and data across regions or from on-premises to the cloud.
- Schedule automated backups and define restoration policies to meet recovery objectives.
- Build active geo-redundancy with three data centers across two zones for Kubernetes-native business continuity.
Accelerate AI and big data workloads
- AI algorithm development: Manage AI jobs, quotas, and observability from one console.
- AI training: Use topology-aware scheduling, compute-storage decoupling, and cross-cluster scheduling for Spark, Kubernetes, and TensorFlow jobs.
- AI inference: Improve GPU utilization by approximately 300% with GPU sharing, with autoscaling across cloud and on-premises resources.
- Intelligent CPU scheduling: Run NUMA-aware workloads on ECS Bare Metal instances for latency-sensitive jobs.
Distribute applications to multiple clusters through GitOps
Deploy applications from Git repositories to multiple clusters with a Fleet instance and hosted ArgoCD:
- Developers need only Git repository permissions. No direct Kubernetes cluster access is required.
- Apply version control, change approval, code rollback, and audit logs to every deployment.
- Keep applications in clusters continuously synchronized with the state declared in Git.
- Deploy the same application with different configurations to different clusters.
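
The per-cluster configuration pattern above can be sketched as follows. This is a minimal illustration rather than ACK One's own tooling: it builds one upstream Argo CD `Application` manifest per cluster from a single repository and points each at a different overlay directory. The repository URL, cluster names, overlay layout, target namespace, and the `argocd` namespace are all assumptions made for the example.

```python
import json

# Hypothetical values for illustration only.
REPO_URL = "https://example.com/org/demo-app.git"
CLUSTERS = ["cluster-beijing", "cluster-hangzhou"]


def build_application(cluster: str) -> dict:
    """Return an Argo CD Application that deploys one overlay to one cluster."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": f"demo-app-{cluster}", "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": REPO_URL,
                "targetRevision": "main",
                # Same repository, different directory per cluster.
                "path": f"overlays/{cluster}",
            },
            # `name` refers to a cluster already registered with Argo CD.
            "destination": {"name": cluster, "namespace": "demo"},
            # Automated sync keeps the cluster aligned with the Git state.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


applications = [build_application(c) for c in CLUSTERS]

if __name__ == "__main__":
    # Print the generated manifests for inspection.
    print(json.dumps(applications, indent=2))
```

Keeping the differences in per-cluster directories means a change to one cluster's configuration goes through the same Git review and audit trail as any other change.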
Implement zone-disaster recovery with multi-cluster gateways
Route traffic intelligently across clusters to reduce costs and improve resilience:
- Use multi-cluster gateways to schedule north-south traffic based on availability and cost.
- Create Global Ingresses with Layer 7 routing rules controlled by weight and pod replica count, with automatic fallback when a cluster becomes unavailable.
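
To make the routing behavior described above concrete, here is a small conceptual model of weight-based and replica-based selection with fallback. It is a sketch under stated assumptions, not the gateway's actual algorithm: the `ClusterBackend` type, the function names, and the rule that a zero-weight standby cluster receives traffic only when every weighted cluster is down are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class ClusterBackend:
    name: str
    weight: int          # configured routing weight
    ready_replicas: int  # pods currently ready in this cluster


def effective_weights(backends, by_replicas=False):
    """Return {cluster: weight} for clusters that can serve traffic."""
    live = [b for b in backends if b.ready_replicas > 0]
    if by_replicas:
        # Replica-based mode: traffic share follows the ready pod count.
        return {b.name: b.ready_replicas for b in live}
    weighted = {b.name: b.weight for b in live if b.weight > 0}
    if weighted:
        return weighted
    # Fallback: every weighted cluster is down, so use any live standby.
    return {b.name: 1 for b in live}


def pick_cluster(backends, rng=random, by_replicas=False):
    """Choose a cluster for one request according to the effective weights."""
    weights = effective_weights(backends, by_replicas)
    if not weights:
        raise RuntimeError("no cluster has ready pods")
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

In this model, a cluster with no ready pods drops out of the candidate set and its share is redistributed among the remaining clusters, which is the effect the automatic fallback is meant to achieve.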
Orchestrate large-scale jobs and complex workflows with Argo Workflows
Run simulation, scientific computing, data processing, and continuous integration workloads on a managed serverless Argo Workflows control plane:
- Use resources across multiple regions and zones.
- Reduce costs with preemptible instances and pay-per-use data plane billing.
- Decouple computing and storage with distributed cache to accelerate job execution.
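
As an illustration of the kind of job such a control plane runs, the sketch below builds a minimal fan-out `Workflow` manifest using the upstream Argo Workflows schema. It is an assumption-laden example, not an ACK One template: the `build_fanout_workflow` helper, the container image, and the item names are hypothetical, and any ACK One-specific settings such as preemptible instances or cache mounts are omitted.

```python
import json


def build_fanout_workflow(items: list[str], image: str = "alpine:3.19") -> dict:
    """Return an Argo Workflow that runs one container per input item in parallel."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "fanout-"},
        "spec": {
            "entrypoint": "main",
            "templates": [
                {
                    "name": "main",
                    # One step group; withItems expands it into parallel tasks.
                    "steps": [[
                        {
                            "name": "process",
                            "template": "process-item",
                            "arguments": {
                                "parameters": [{"name": "item", "value": "{{item}}"}]
                            },
                            "withItems": items,
                        }
                    ]],
                },
                {
                    "name": "process-item",
                    "inputs": {"parameters": [{"name": "item"}]},
                    "container": {
                        "image": image,
                        "command": ["sh", "-c"],
                        "args": ["echo processing {{inputs.parameters.item}}"],
                    },
                },
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical shard names; print the manifest for inspection.
    print(json.dumps(build_fanout_workflow(["shard-0", "shard-1", "shard-2"]), indent=2))
```

Each item becomes an independent pod, so the number of concurrent tasks scales with the length of the input list.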
Contact us
If you have questions about ACK One, join the DingTalk group 35688562.