Overview of ACK One - Container Service for Kubernetes (ACK)

Distributed Cloud Container Platform for Kubernetes (ACK One) manages Kubernetes clusters across on-premises data centers, third-party clouds, and Alibaba Cloud regions from a unified control plane.

Open the ACK One console

Cluster types

ACK One provides three cluster types:

Cluster type	What it is
Registered clusters	Any external Kubernetes cluster—on-premises or third-party cloud—connected to the ACK console for centralized management and Alibaba Cloud integration
Multi-cluster Fleet instances	Unified control plane that groups multiple Kubernetes clusters for coordinated application distribution, traffic management, and monitoring
Kubernetes clusters for distributed Argo workflows	Serverless clusters built on Elastic Container Instance (ECI) for running Argo Workflows at scale, with cost-optimized and event-driven execution

Capabilities

Manage clusters from one place

Connect clusters from any provider or location to a single console and API surface.
Centrally enforce security policies, access controls, and configuration inspections across all clusters.
View health and cost metrics for all clusters in one global monitoring dashboard.

Scale resources on demand

Burst workloads from on-premises clusters to Alibaba Cloud by adding Elastic Compute Service (ECS) instances or ECI to external Kubernetes clusters.
Use the ACK scheduler for advanced scheduling: gang scheduling, topology-aware CPU scheduling, and ECI-based scheduling.
Accelerate data access and reduce bandwidth usage with ACK Fluid distributed cache in compute-storage decoupled environments.
Scale cloud resources automatically to handle traffic fluctuations, or set scheduled scaling to improve cost-effectiveness.

Protect and recover applications

Back up and restore applications and data across regions or from data centers to the cloud with the Backup center—no additional setup required.
Set automated backup and restoration policies to keep applications protected.
Build active geo-redundancy across three data centers in two zones for business continuity.

Distribute applications across multiple clusters

Host open source ArgoCD in ACK One and distribute multi-cluster applications through GitOps.
Apply different configurations per cluster while deploying from the same Git repository.
Run jobs across multiple clusters on a schedule.
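
As a rough illustration of applying different configurations per cluster from a single repository, the sketch below deep-merges a shared base with cluster-specific overrides. It is plain Python, not ACK One or ArgoCD code. The cluster names, keys, image reference, and the render_for_cluster helper are all hypothetical. In practice this layering is usually expressed with tools such as Kustomize overlays or Helm values files.

```python
from copy import deepcopy

# Hypothetical shared settings, stored once in the Git repository.
BASE = {
    "image": "registry.example.com/shop:1.4.2",
    "replicas": 2,
    "resources": {"cpu": "500m", "memory": "512Mi"},
}

# Hypothetical per-cluster overrides; only the differences are listed.
OVERRIDES = {
    "cluster-beijing": {"replicas": 6, "resources": {"cpu": "1"}},
    "cluster-onprem": {},
}


def deep_merge(base, override):
    """Return a new dict in which override values win; nested dicts merge key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def render_for_cluster(cluster):
    """Build the effective configuration for one cluster (illustrative helper)."""
    return deep_merge(BASE, OVERRIDES.get(cluster, {}))
```

Calling render_for_cluster("cluster-beijing") keeps the shared image and memory setting while raising the replica count and CPU request. A cluster with no overrides receives the base configuration unchanged.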

Manage traffic at the fleet level

Route north-south traffic across clusters with MSE cloud-native gateways.
Create multi-cluster Services to manage east-west traffic.
Configure Global Ingresses with Layer 7 routing rules based on weights and pod replica counts, with automatic fallback.

Run AI and big data workloads

Deploy Alibaba Cloud-verified enterprise components in Kubernetes clusters to enhance security, scheduling efficiency, and AI and big data computing.
Manage AI training jobs, resource quotas, and observability from a unified interface.
Improve GPU utilization by approximately 300% with GPU sharing.
Accelerate distributed training with compute-storage decoupling and cross-cluster scheduling for Spark, Kubernetes, and TensorFlow jobs.
Enable intelligent CPU scheduling and non-uniform memory access (NUMA) awareness on ECS Bare Metal instances.

Run large-scale workflows cost-effectively

Pay only for data plane usage—Argo Workflows control planes are free of charge.
Use preemptible instances to reduce compute costs further.
Automatically adjust resource specifications through load-aware resource prediction.
Handle thousands of concurrent workflows and tens of thousands of computing tasks.
Trigger workflows automatically from Git, Message Service (MNS), or Object Storage Service (OSS) events.
Achieve more than 20 GB/s aggregated read bandwidth with distributed cache across regions.

Use cases

Connect on-premises clusters and scale to the cloud

Register on-premises clusters to connect data centers to Alibaba Cloud and burst workloads to the cloud during traffic peaks.

Extend on-premises clusters with Alibaba Cloud services

Add Alibaba Cloud observability, security, and microservice governance to clusters in data centers or third-party clouds:

Observability: Collect logs, metrics, and events with consistent O&M across environments.
Security: Enable auditing, security inspection, node risk detection, and policy governance.
Microservice governance: Use Service Mesh (ASM) and Microservices Engine (MSE) for traffic control and service governance.

Implement disaster recovery across hybrid cloud, regions, or zones

Back up stateful applications and data across regions or from on-premises to the cloud.
Schedule automated backups and define restoration policies to meet recovery objectives.
Build active geo-redundancy with three data centers across two zones for Kubernetes-native business continuity.

Accelerate AI and big data workloads

AI algorithm development: Manage AI jobs, quotas, and observability from one console.
AI training: Use topology-aware scheduling, compute-storage decoupling, and cross-cluster scheduling for Spark, Kubernetes, and TensorFlow jobs.
AI inference: Improve GPU utilization by approximately 300% with GPU sharing, with autoscaling across cloud and on-premises resources.
Intelligent CPU scheduling: Run NUMA-aware workloads on ECS Bare Metal instances for latency-sensitive jobs.

Distribute applications to multiple clusters through GitOps

Deploy applications from Git repositories to multiple clusters with a Fleet instance and hosted ArgoCD:

Developers need only Git repository permissions—no direct Kubernetes cluster access required.
Apply version control, change approval, code rollback, and audit logs to every deployment.
Keep applications in clusters continuously synchronized with the state declared in Git.
Deploy the same application with different configurations to different clusters.
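
The idea of keeping clusters synchronized with the state declared in Git can be sketched as a comparison between a desired state and a live state. The code below is a simplified illustration, not ArgoCD's actual diffing logic. The resource names and the diff_state and in_sync helpers are hypothetical, and resources are reduced to plain dictionaries keyed by name.

```python
def diff_state(desired, live):
    """Compare two {name: spec} mappings and list what a sync would have to change."""
    to_create = sorted(desired.keys() - live.keys())
    to_delete = sorted(live.keys() - desired.keys())
    to_update = sorted(
        name for name in desired.keys() & live.keys() if desired[name] != live[name]
    )
    return {"create": to_create, "update": to_update, "delete": to_delete}


def in_sync(desired, live):
    """True when the live state already matches what Git declares."""
    return not any(diff_state(desired, live).values())


# Hypothetical example: Git declares two workloads, the cluster has drifted.
desired = {"web": {"replicas": 3}, "worker": {"replicas": 2}}
live = {"web": {"replicas": 1}, "legacy": {"replicas": 1}}
plan = diff_state(desired, live)
```

Here plan lists worker to create, web to update, and legacy to delete. A GitOps controller repeats this kind of comparison continuously and applies the resulting changes, which is why manual edits in a cluster are reverted to the declared state.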

Implement zone-disaster recovery with multi-cluster gateways

Route traffic intelligently across clusters to reduce costs and improve resilience:

Use multi-cluster gateways to schedule north-south traffic based on availability and cost.
Create Global Ingresses with Layer 7 routing rules controlled by weight and pod replica count, with automatic fallback when a cluster becomes unavailable.
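
To make the routing behavior above concrete, the following sketch computes how traffic might be split across clusters by configured weight or by pod replica count, with an unavailable cluster receiving no traffic. This document does not specify the algorithm ACK One uses, so treat this as an assumption-laden model. The field names (weight, ready_replicas, available) and the traffic_shares function are hypothetical.

```python
def traffic_shares(clusters, mode="weight"):
    """Return {cluster name: fraction of traffic}; unavailable clusters get 0."""

    def score(cluster):
        if not cluster["available"]:
            return 0  # fallback: an unavailable cluster is excluded from routing
        if mode == "weight":
            return cluster["weight"]
        return cluster["ready_replicas"]

    scores = {c["name"]: score(c) for c in clusters}
    total = sum(scores.values())
    if total == 0:
        # No cluster can serve traffic; report zero share everywhere.
        return {name: 0.0 for name in scores}
    return {name: s / total for name, s in scores.items()}


# Hypothetical two-cluster setup.
clusters = [
    {"name": "cluster-a", "weight": 80, "ready_replicas": 6, "available": True},
    {"name": "cluster-b", "weight": 20, "ready_replicas": 2, "available": True},
]
by_weight = traffic_shares(clusters, mode="weight")
by_replicas = traffic_shares(clusters, mode="replicas")
```

With both clusters available, weights split traffic 80/20 and replica counts split it 75/25. If cluster-a is marked unavailable, the same function sends all traffic to cluster-b, which models the automatic fallback.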

Orchestrate large-scale jobs and complex workflows with Argo Workflows

Run simulation, scientific computing, data processing, and continuous integration workloads on a managed serverless Argo Workflows control plane:

Use resources across multiple regions and zones.
Reduce costs with preemptible instances and pay-per-use data plane billing.
Decouple computing and storage with distributed cache to accelerate job execution.

Contact us

If you have questions about ACK One, join the DingTalk group 35688562.