This document covers the 2024 feature updates and technical changes for Alibaba Cloud Container Service for Kubernetes (ACK) and its sub-products.
December 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | OCI artifact signing and signature verification based on Notation and Ratify | The notation-alibabacloud-secret-manager component signs Open Container Initiative (OCI) artifacts stored in Container Registry using keys managed by Key Management Service (KMS). Install Ratify in your cluster to verify image signatures and block images with invalid signatures. | All regions | Use Notation and Ratify for OCI artifact signing and signature verification |
| | Storage monitoring | Monitor storage resources across your cluster, nodes, pods, and externally mounted volumes using Managed Service for Prometheus. After enabling Managed Service for Prometheus, out-of-the-box dashboards display real-time storage usage. | All regions | View storage monitoring information |
| | Workload stability and performance analysis in cost insights | Cost insights now identifies stability, performance, and cost risks in your workloads. It sorts pods by resource utilization and provides detailed resource configuration views for pods with Burstable and BestEffort Quality of Service (QoS) classes. | All regions | Use cost insights to identify risks for cluster workloads |
| | Multi-dimensional cost aggregation and idle cost policies in the cost API | The cost API supports new parameters for filtering and aggregating cost data by pod label or node name. Customize idle cost allocation policies and dimensions to manage and optimize costs more flexibly. | All regions | Call the Cost V2 API |
| | GPU fault alerting and solutions | ACK provides end-to-end GPU fault management: monitoring, diagnostics, alerting, and recovery mechanisms to resolve GPU faults in ACK clusters. | All regions | Configure GPU fault alerting and solutions |
| | Batch task orchestration with Argo Workflows | Argo Workflows is a Kubernetes-native workflow engine for orchestrating concurrent jobs using YAML or Python. It supports CI/CD pipelines, data processing, and machine learning workloads. Install the Argo Workflows component and use the Argo CLI or console to create and manage workflows. | All regions | Enable batch task orchestration |
| ACK One | Geo-disaster recovery based on ALB multi-cluster gateways | ACK One supports geo-disaster recovery using Application Load Balancer (ALB) multi-cluster gateways to protect against region-level disasters such as floods and earthquakes. Note that this may increase response latency, resource costs, and maintenance costs. | All regions | Use ALB multi-cluster gateways of ACK One to implement geo-disaster recovery |
| ACK Edge | Virtual nodes | ACK Edge clusters now support virtual nodes. Schedule pods directly to elastic container instances that act as virtual nodes—no need to reserve or maintain node pools. This improves elasticity and reduces resource costs compared to pre-provisioning Elastic Compute Service (ECS) instances. | All regions | |
| | P2P acceleration | ACK Edge clusters support P2P acceleration to speed up image pulls and reduce application deployment time. | All regions | Install a P2P acceleration agent in an ACK cluster |
| | Kubernetes 1.30 support | ACK Edge clusters now support Kubernetes 1.30. | All regions | Release notes for ACK Edge of Kubernetes 1.30 |
| ACK Lingjun | Image acceleration | The aliyun-acr-acceleration-suite component enables on-demand image loading in Lingjun clusters. It automatically converts source images to accelerated images and decompresses data on demand, so pods start without downloading or decompressing the full image. | All regions | aliyun-acr-acceleration-suite |
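
The batch task orchestration entry above mentions defining workflows in YAML or Python. As a minimal sketch, the following Python builds a Workflow manifest with three steps that run concurrently and prints it as JSON (which is also valid YAML). The `batch-demo-` name prefix, the `alpine:3.19` image, and the job names are hypothetical placeholders, and the field layout follows the upstream Argo Workflows `Workflow` resource rather than anything specific to the ACK component.

```python
import json


def build_parallel_workflow(job_names, image="alpine:3.19"):
    """Return an Argo Workflow manifest that runs one step per job in parallel."""
    # Steps placed in the same inner list are scheduled concurrently.
    parallel_steps = [
        {
            "name": name,
            "template": "echo",
            "arguments": {"parameters": [{"name": "msg", "value": name}]},
        }
        for name in job_names
    ]
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "batch-demo-"},  # hypothetical name prefix
        "spec": {
            "entrypoint": "main",
            "templates": [
                {"name": "main", "steps": [parallel_steps]},
                {
                    "name": "echo",
                    "inputs": {"parameters": [{"name": "msg"}]},
                    "container": {
                        "image": image,  # placeholder image
                        "command": ["echo"],
                        "args": ["{{inputs.parameters.msg}}"],
                    },
                },
            ],
        },
    }


workflow = build_parallel_workflow(["job-a", "job-b", "job-c"])
print(json.dumps(workflow, indent=2))
```

Saving the output to a file and passing it to `argo submit` is one way to try it in a cluster where the component is installed.
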
November 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | eRDMA support | ACK eRDMA Controller enables elastic Remote Direct Memory Access (eRDMA) in your clusters. It manages eRDMA interface (ERI) assignments and lets you specify eRDMA settings in pod configurations. | All regions | ACK eRDMA Controller |
| | New releases of ack-secret-manager and secrets-store-csi-driver-provider-alibaba-cloud | New versions of ack-secret-manager and secrets-store-csi-driver-provider-alibaba-cloud are available. Install them from the Marketplace page in the ACK console. | All regions | Use ack-secret-manager to import OOS encryption parameters and Use csi-secrets-store-provider-alibabacloud to import OOS encryption parameters |
| | Remote Shuffle Service for Spark jobs via Apache Celeborn | Apache Celeborn enables Remote Shuffle Service (RSS) for Spark jobs by processing intermediate data (shuffle data and spilled data) for big data compute engines, improving performance, stability, and flexibility. | All regions | Use Celeborn to enable RSS for Spark jobs |
| | Log management for Spark jobs | Use Simple Log Service to collect and manage logs from Spark jobs running in ACK clusters. | All regions | Use Simple Log Service to collect the logs of Spark jobs |
| | ossfs troubleshooting | Object Storage Service (OSS) volumes are Filesystem in Userspace (FUSE) file systems mounted via ossfs. Analyze debug logs or pod logs to troubleshoot ossfs exceptions based on the mode in which ossfs runs. | All regions | Troubleshoot OSSFS exceptions |
| ACK Serverless | Custom CoreDNS configurations | Customize managed CoreDNS settings for ACK Serverless clusters: specify an external DNS server to improve resolution speed, or add static IP mappings to the local hosts file for domain names with fixed addresses. | All regions | Configure custom parameters for managed CoreDNS |
| ACK One | Network policies for Elastic Container Instance-based pods in registered clusters | Kubernetes network policies are now supported for Elastic Container Instance (ECI)-based pods in registered clusters. Use network policies to control traffic to specific applications by IP address or port. | All regions | Use network policies on elastic container instances |
| | Migration from self-managed Argo CD to ACK One GitOps | Use onectl to migrate clusters, repositories, and applications from a self-managed Argo CD instance to ACK One GitOps in bulk, instead of migrating resources one by one. | All regions | Migrate data from self-managed Argo CD to ACK One GitOps |
| | Preemptible ECI creation in registered clusters | Registered clusters now support preemptible elastic container instances. Use them to run short-term jobs or stateless, fault-tolerant applications at reduced cost. | All regions | Create a preemptible elastic container instance |
| | Hybrid disaster recovery based on ALB multi-cluster gateways | ACK One supports hybrid disaster recovery using ALB multi-cluster gateways. Route traffic across clusters deployed in data centers or on third-party platforms and perform seamless failovers for active zone-redundancy. | All regions | Use MSE multi-cluster gateways to implement hybrid disaster recovery in ACK One |
| | Zone-disaster recovery based on ALB multi-cluster gateways | ACK One ALB multi-cluster gateways work with ACK One GitOps or the multi-cluster application distribution feature to implement zone-disaster recovery and automatically switch traffic when a fault occurs. | All regions | Zone-disaster recovery based on ALB multi-cluster gateways of ACK One |
| Cloud-native AI suite | FUSE client monitoring for Fluid JindoRuntime | Fluid now collects metrics from multiple JindoRuntime caching engines and FUSE clients, and displays them in out-of-the-box JindoRuntime monitoring dashboards. | All regions | Enable and use the Fluid JindoRuntime FUSE client for monitoring |
| | Performance analysis and troubleshooting for large models using PyTorch Profiler | This practice describes how to use PyTorch Profiler with TensorBoard to analyze model performance and optimize training across data loading, data transfer, GPU computing, and model compilation. | All regions | Performance analysis and troubleshooting for large models using PyTorch Profiler |
| | Performance analysis and optimization for AI applications using Nsight Systems | In deep learning, Nsight Systems and Nsight Compute are commonly used for AI application performance analysis and optimization. This practice describes how to use Nsight Systems for these purposes. | All regions | Performance analysis and optimization for AI applications using Nsight Systems |
| ACK Edge | High-performance container networks | ACK Edge clusters now support Terway Edge as a Container Network Interface (CNI) plug-in to create high-performance underlay networks for intra-cluster communication. | All regions | Terway Edge |
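
The network policy entry for ECI-based pods above describes restricting traffic by IP address or port. A minimal sketch of such a policy, built in Python against the standard Kubernetes `networking.k8s.io/v1` NetworkPolicy schema, might look like the following. The policy name, namespace, `app` label, CIDR block, and port are hypothetical values chosen for illustration.

```python
import json


def allow_ingress_policy(name, namespace, app_label, cidr, port):
    """Build a NetworkPolicy that admits TCP traffic to one app from one CIDR."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods carrying this label are the ones the policy protects.
            "podSelector": {"matchLabels": {"app": app_label}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # Only sources inside this CIDR block are admitted...
                    "from": [{"ipBlock": {"cidr": cidr}}],
                    # ...and only on this TCP port.
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical values for illustration only.
policy = allow_ingress_policy("allow-office", "default", "web", "192.168.0.0/16", 8080)
print(json.dumps(policy, indent=2))
```

Because the selected pods are listed under `policyTypes: ["Ingress"]`, any inbound traffic that does not match the rule is dropped once the policy is enforced.
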
October 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | Cloud Controller Manager v2.10.0 | Cloud Controller Manager (CCM) v2.10.0 adds readiness gates support and allows the service.beta.kubernetes.io/alibaba-cloud-loadbalancer-additional-resource-tags annotation to modify tags on existing load balancer instances. | All regions | Cloud Controller Manager |
| | Elastic container instances for Spark jobs | Run Spark jobs on Elastic Container Instance (ECI)-based pods by configuring scheduling policies to target elastic container instances. Pay only for the resources the pods consume, reducing idle resource waste. | All regions | Use elastic container instances to run Spark jobs |
| ACK Serverless | Custom parameter configurations for managed CoreDNS | Configure DNS settings for managed CoreDNS by defining a CustomDNSConfig custom resource (CR). | All regions | Configure custom parameters for managed CoreDNS |
| ACK One | Serverless computing in self-managed Kubernetes clusters | ACK Virtual Node lets you create serverless pods in self-managed Kubernetes clusters and access elastic cloud compute resources, including both CPUs and GPUs. | All regions | Use ACK Virtual Node for serverless computing in self-managed Kubernetes clusters |
| | ALB multi-cluster gateways | ACK One ALB multi-cluster gateways extend ALB Ingress to multi-cluster mode. They work like single-cluster ALB Ingress with a few differences. | All regions | Overview of ALB multi-cluster gateways |
| Cloud-native AI suite | Model inference optimization with TensorRT | Compile PyTorch or TensorFlow models to TensorRT format and run them in the TensorRT inference engine to improve inference speed on NVIDIA GPUs. | All regions | |
| ACK Edge | RAM Roles for Service Accounts (RRSA) | Use RAM Roles for Service Accounts (RRSA) to enforce fine-grained API permission control at the pod level, reducing security risks from shared node permissions. | All regions | Configure RRSA for service accounts to isolate permissions among pods |
| | Managed node pools | ACK Edge managed node pools automate O&M tasks including OS CVE patching, kubelet updates, and node restarts, with custom O&M capabilities beyond what standard node pools offer. | All regions | Overview of managed node pools |
| | Alert configurations | Centrally manage alerts for ACK Edge clusters across multiple scenarios. Configure alert rules for key cluster resource metrics, core component metrics, and application metrics. | All regions | |
| | Kubernetes 1.28 | ACK Edge clusters now support Kubernetes 1.28. To upgrade from version 1.26 to 1.28, submit a ticket to contact the ACK technical team. Upgrades between other versions are not supported. | All regions | Update an ACK Edge cluster |
| ACK Lingjun | Network topology-aware scheduling | Topology-aware scheduling in Lingjun clusters assigns pods to the same Layer 1 or Layer 2 forwarding domain, reducing network latency and accelerating job completion. | All regions | Work with network topology-aware scheduling |
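
The Cloud Controller Manager v2.10.0 entry above names the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-additional-resource-tags` annotation. The sketch below shows where such an annotation sits on a LoadBalancer Service. The annotation key comes from the entry. The value format (comma-separated `key=value` pairs), the Service name, the selector label, and the ports are assumptions made for illustration, so check the Cloud Controller Manager documentation before relying on them.

```python
import json

# Annotation key taken from the CCM v2.10.0 entry.
TAGS_ANNOTATION = (
    "service.beta.kubernetes.io/alibaba-cloud-loadbalancer-additional-resource-tags"
)


def tagged_loadbalancer_service(name, app_label, tags):
    """Build a LoadBalancer Service that carries resource tags as an annotation."""
    # Assumption: the value is a comma-separated list of key=value pairs.
    tag_value = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "annotations": {TAGS_ANNOTATION: tag_value}},
        "spec": {
            "type": "LoadBalancer",
            "selector": {"app": app_label},
            "ports": [{"port": 80, "targetPort": 8080, "protocol": "TCP"}],
        },
    }


# Hypothetical Service name, label, and tags.
service = tagged_loadbalancer_service("web", "web", {"team": "platform", "env": "prod"})
print(json.dumps(service, indent=2))
```

Per the entry, changing the annotation on an existing Service is what lets v2.10.0 update the tags on a load balancer instance that already exists.
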
September 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | Kubernetes 1.31 support | Create new ACK clusters on Kubernetes 1.31 or upgrade existing clusters from earlier versions. | All regions | Kubernetes 1.31 |
| | Deletion protection for namespaces and Services | After you enable policy governance, turn on deletion protection for business-critical namespaces or Services to prevent accidental deletion. | All regions | Enable deletion protection for a namespace or a Service |
| | Tracing for the NGINX Ingress controller | Report NGINX Ingress controller trace data to Managed Service for OpenTelemetry for real-time trace details and topology visualization. Use the monitoring data to troubleshoot and diagnose issues. | All regions | Enable tracing for the NGINX Ingress controller |
| | Cost insights for Knative Services | Enable cost insights for a Knative Service to view its estimated cost in real time and support multi-dimensional cost analysis and allocation. | All regions | Enable the cost insights feature in Knative Service |
| | Risk identification with cost insights for cluster workloads | Use cost insights to quickly surface stability, performance, and cost risks in cluster workloads. The feature tracks resource utilization, provides detailed configuration data for Burstable pods, and identifies risks in BestEffort pods. | All regions | Use cost insights to identify risks for cluster workloads |
| | Spark Operator for Spark jobs | Run and manage Spark jobs in ACK clusters using Spark Operator, giving data engineers an efficient way to handle large-scale data processing workloads. | All regions | Use Spark Operator to run Spark jobs |
| ACK One | Argo CD alerting | Configure custom alert rules for Fleet instances using Managed Service for Prometheus metrics. Dashboards display monitoring information about Fleet instances and the GitOps system. | All regions | Configure ACK One Argo CD alerts |
| | Application distribution | Distribute applications from a Fleet instance to multiple associated clusters using configurable distribution policies. Unlike GitOps, this method requires no Git repositories. Use differentiated policies to meet varying deployment requirements across clusters and applications. | All regions | Application distribution overview |
| | Access to Alibaba Cloud DNS PrivateZone | On-premises networks connected via virtual border router (VBR), IPsec-VPN, or Cloud Connect Network (CCN) can access Alibaba Cloud DNS PrivateZone through a transit router for VPC-based private domain name resolution. | All regions | Manage access to Alibaba Cloud DNS PrivateZone |
| | Statically provisioned NAS volumes in registered clusters | Mount statically provisioned NAS (Network Attached Storage) volumes to registered clusters for persistent, shared data storage across pods. | All regions | Mount a statically provisioned NAS volume |
| Cloud-native AI suite | Auto recovery for FUSE mount targets | When a Filesystem in Userspace (FUSE) daemon crashes during a pod's lifecycle, auto recovery restores data access without restarting the application pod. | All regions | Enable the auto recovery feature for FUSE mount targets |
| | Cross-namespace dataset sharing | Fluid supports data access and cache sharing across namespaces. Cache a dataset once and share it across multiple teams, improving data utilization and enabling collaboration between R&D teams. | All regions | Share datasets across namespaces |
| ACK Edge | Edge Node Service (ENS) management | Manage Edge Node Service (ENS) instances deployed across multiple regions and ISPs in a unified, containerized manner. Create ENS disks and Edge Load Balancer instances for cloud-native storage and networking at the edge. | All regions | ENS management |
| | Service topology management for node pools | Expose an application on an edge node only to the current node or nodes within the same edge node pool, preventing cross-node-group routing failures and improving response reliability. | All regions | Configure a Service topology |
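
The Spark Operator entry above covers running Spark jobs as Kubernetes resources. As a rough sketch, the following Python builds a `SparkApplication` manifest for the stock SparkPi example. The field names follow the upstream Spark Operator `sparkoperator.k8s.io/v1beta2` resource as commonly documented. The image tag, JAR path, service account, and resource sizes are placeholder assumptions, and the ACK component may expect different values.

```python
import json


def spark_pi_application(name, namespace, executors=2):
    """Build a SparkApplication manifest for the stock SparkPi example."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Scala",
            "mode": "cluster",
            "image": "spark:3.5.0",  # placeholder image
            "mainClass": "org.apache.spark.examples.SparkPi",
            # Placeholder path; it must match where the JAR lives in the image.
            "mainApplicationFile": (
                "local:///opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar"
            ),
            "sparkVersion": "3.5.0",
            # Assumed service account name; it needs permission to create pods.
            "driver": {"cores": 1, "memory": "512m", "serviceAccount": "spark"},
            "executor": {"instances": executors, "cores": 1, "memory": "512m"},
        },
    }


app = spark_pi_application("spark-pi", "default", executors=3)
print(json.dumps(app, indent=2))
```

The operator watches for resources of this kind and creates the driver and executor pods on your behalf, which is what makes the job manageable with ordinary Kubernetes tooling.
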
August 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | Inventory health status monitoring for node instant scaling | Node instant scaling now monitors ECS instance inventory health. Check the ConfigMap for inventory health status to assess the health of instance types configured for a node pool and proactively adjust instance type selections. | All regions | View the health status of node instant scaling |
| | Multiple update frequencies for auto cluster update | Auto cluster update now supports three update frequency options: Latest Patch Version (patch), Second-Latest Minor Version (stable), and Latest Minor Version (rapid). | All regions | Automatically update a cluster |
| | GPU sharing and memory isolation with MPS | Multi-Process Service (MPS) enables GPU sharing and memory isolation for AI applications running Compute Unified Device Architecture (CUDA) workloads. Add specific labels to node pools in the ACK console to enable MPS mode. | All regions | Use MPS for GPU sharing and memory isolation |
| | Knative 1.12.5 | Knative 1.12.5-aliyun.7 is now supported. This version is compatible with Kourier 1.12 and adds support for Container Registry Enterprise Edition and the dashboard for preemptible ECS instances. | All regions | Knative release notes |
| ACK One | Multi-cluster applications | Use Argo CD ApplicationSets to automatically generate one or more applications from a single orchestration template and deploy them across multiple clusters. | All regions | Create a multi-cluster application |
| | Elastic node pools with custom images in registered clusters | Use custom images pre-installed with required software packages to reduce the time for on-cloud nodes to reach the Ready state and accelerate system startup. | All regions | Build an elastic node pool with a custom image |
| | Large-scale workflow creation with Argo Workflows SDK for Python (Hera) | Hera is a Python SDK for Argo Workflows that provides an alternative to YAML. Use Hera to orchestrate and test complex workflows in Python, leveraging seamless integration with the Python ecosystem. | All regions | Use Argo Workflows SDK for Python to create large-scale workflows |
| | Event-driven CI pipelines based on EventBridge | Build event-driven continuous integration (CI) pipelines using EventBridge and distributed Argo Workflows. This approach simplifies and accelerates application delivery with high elasticity and low cost. | All regions | Event-driven CI pipelines based on EventBridge |
| Cloud-native AI suite | AI-powered Q&A assistants with Dify | Dify integrates enterprise or individual knowledge bases with large language model (LLM) applications. Use it to design customized AI-assisted Q&A solutions for your business. | All regions | Use Dify to create a customized AI-powered Q&A assistant for a website |
| | Flowise installation and management | Install the Flowise component in ACK clusters. Flowise provides a drag-and-drop UI for building LLM applications in a low-code manner, enabling rapid iteration from testing to production. | All regions | |
| | Qwen2 model inference deployment with TensorRT-LLM | Deploy Qwen2 models as inference services using Triton and TensorRT-LLM. Fluid Dataflow handles data preparation during deployment, and Fluid accelerates model loading. The documented example uses the Qwen2-1.5B-Instruct model on A10 GPUs. | All regions | Use TensorRT-LLM to deploy a Qwen2 model as an inference service |
| ACK Edge | Cloud-native AI suite support | ACK Edge clusters now support the cloud-native AI suite, including AI Dashboard and AI Developer Console for monitoring cluster status and submitting training jobs. | All regions | Deploy the cloud-native AI suite |
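
The multi-cluster applications entry above describes generating one application per cluster from a single template. The toy Python below illustrates that idea only: it substitutes `{{name}}` and `{{server}}` placeholders for each cluster in a list. It is not Argo CD's implementation, and the template fields, repository URL, and cluster endpoints are hypothetical.

```python
def render(value, params):
    """Recursively substitute {{key}} placeholders in strings."""
    if isinstance(value, str):
        for key, replacement in params.items():
            value = value.replace("{{" + key + "}}", replacement)
        return value
    if isinstance(value, dict):
        return {k: render(v, params) for k, v in value.items()}
    if isinstance(value, list):
        return [render(v, params) for v in value]
    return value


def expand(template, clusters):
    """Produce one rendered application per cluster parameter set."""
    return [render(template, cluster) for cluster in clusters]


# Hypothetical template and clusters for illustration.
template = {
    "metadata": {"name": "{{name}}-guestbook"},
    "spec": {
        "destination": {"server": "{{server}}", "namespace": "guestbook"},
        "source": {"repoURL": "https://example.com/repo.git", "path": "manifests"},
    },
}
clusters = [
    {"name": "cluster-a", "server": "https://10.0.0.1:6443"},
    {"name": "cluster-b", "server": "https://10.0.0.2:6443"},
]

apps = expand(template, clusters)
for app in apps:
    print(app["metadata"]["name"], "->", app["spec"]["destination"]["server"])
```

Adding a third cluster to the list yields a third application without touching the template, which is the property that makes the approach attractive for fleets.
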
July 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | Tracing support in NGINX Ingress controller v1.10.2-aliyun.1 | NGINX Ingress controller v1.10.2-aliyun.1 adds tracing support via Managed Service for OpenTelemetry. | All regions | Enable tracing for the NGINX Ingress controller |
| | Global network policies with Poseidon v0.5.0 | Poseidon v0.5.0 introduces cluster-level global network policies, enabling network connectivity management across namespaces in ACK clusters. | All regions | Use ACK GlobalNetworkPolicy |
| | ContainerOS 3.3 | ContainerOS 3.3 updates the kernel to version 5.10.134-17.0.2.lifsea8, enables cgroup v2 by default for container resource isolation, and fixes vulnerabilities and defects. | All regions | ContainerOS image release record |
| | Custom worker RAM roles for node pools | Assign a custom Resource Access Management (RAM) worker role to a node pool at creation time. This isolates permissions per node pool and avoids all nodes in the cluster sharing the same default RAM role. | All regions | Use custom worker RAM roles |
| | ACKBlockVolumeTypes security policy | The ACKBlockVolumeTypes policy is added to the security policy library. Use it to restrict which volume types pods in specified namespaces can use. | All regions | ACKBlockVolumeTypes |
| | NVIDIA GPU driver 550.90.07 | NVIDIA GPU driver version 550.90.07 is now supported in ACK clusters. | All regions | NVIDIA driver versions supported by ACK |
| | Qwen model inference deployment with LMDeploy | Deploy Qwen models as inference services using the LMDeploy framework. The documented example uses the Qwen1.5-4B-Chat model on A10 GPUs. | All regions | Use LMDeploy to deploy the Qwen model inference service |
| | GPU-sharing inference services with KServe | Deploy inference services that share a GPU using KServe to improve GPU utilization. The documented example uses the Qwen1.5-0.5B-Chat model on a V100 GPU. | All regions | Deploy inference services that share a GPU |
| ACK One | Event-driven CI pipelines based on EventBridge | Build event-driven CI pipelines by combining EventBridge with distributed Argo Workflows to accelerate application delivery with minimal overhead. | All regions | Event-driven CI pipelines based on EventBridge |
| | Multi-cluster application orchestration through GitOps | Orchestrate multi-cluster applications in the GitOps console using Git repositories as application sources. Supports YAML manifests, Helm charts, and Kustomize for version management, multi-cluster distribution, and continuous deployment (CD). | All regions | Use an ApplicationSet to create multiple applications |
| | Elastic node pools with custom images in registered clusters | Use custom images pre-installed with required software packages to reduce node startup time and accelerate the path to Ready state. | All regions | Build an elastic node pool with a custom image |
| Cloud-native AI suite | FUSE mount target auto repair | Fluid performs periodic polling checks and automatic repairs of FUSE mount targets, improving data access stability for business workloads. | All regions | |
| ACK Edge | Kubernetes 1.28 support | Create ACK Edge clusters running Kubernetes 1.28.9-aliyun.1. | All regions | Release notes for ACK Edge of Kubernetes 1.28 |
| | Container Storage Interface (CSI) plug-in support | ACK Edge clusters support the Container Storage Interface (CSI) plug-in. Storage medium types and limitations vary by node type and integration method. | All regions | Storage overview |
| | Cloud-native AI suite support | ACK Edge clusters support all cloud-native AI suite features in on-cloud environments. Feature availability and limits in on-premises environments vary by node type and network type. | All regions | Cloud-native AI suite |
| | Ingress best practices for edge node pools | Deploy Ingress controllers in edge node pools. Note the differences in behavior compared to Ingress controllers deployed in on-cloud node pools. | All regions | Ingress overview and Use the NGINX Ingress |
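
The ContainerOS 3.3 entry above notes that cgroup v2 is enabled by default. One common way to check which cgroup version a Linux node is running is to look for the `cgroup.controllers` file at the root of the cgroup mount, which exists only on the v2 unified hierarchy. The sketch below applies that check. The default mount path `/sys/fs/cgroup` is the conventional location and is an assumption about the node's layout.

```python
from pathlib import Path


def cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 for a cgroup v2 unified hierarchy, 1 otherwise, None if absent."""
    base = Path(root)
    if not base.is_dir():
        # No cgroup mount at this path (for example, on a non-Linux host).
        return None
    # The v2 unified hierarchy exposes cgroup.controllers at its root.
    return 2 if (base / "cgroup.controllers").is_file() else 1


print("cgroup version:", cgroup_version())
```

Workloads that read cgroup v1 paths directly (some older monitoring agents and runtimes do) are worth testing before moving them to nodes that default to v2.
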
June 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | Kubernetes 1.30 support | Create new ACK clusters on Kubernetes 1.30 or upgrade existing clusters from earlier versions. | All regions | Kubernetes 1.30 and Manually update ACK clusters |
| | Node pool OS parameter customization | Customize Linux OS parameters for node pools to improve OS performance when the defaults don't meet your business requirements. | All regions | Customize the OS parameters of a node pool |
| | Ubuntu 22.04 support | Use Ubuntu 22.04 as the node OS for ACK clusters running Kubernetes 1.30 or later. | All regions | OS images |
| | Enhanced descheduling | The Koordinator Descheduler module in the ack-koordinator component now has enhanced descheduling policies, pod eviction methods, and eviction traffic control to address imbalanced node utilization, overloaded nodes, and changing scheduling requirements. | All regions | Descheduling and Enable descheduling |
| | Network Load Balancer (NLB) configuration via Services in the ACK console | Create and manage Services in the ACK console to configure Network Load Balancer (NLB) instances. NLB is a Layer 4 load balancing service supporting up to 100 million concurrent connections with auto-scaling. | All regions | Use an existing SLB instance to expose an application and Use an automatically created SLB instance to expose an application |
| | New release of csi-provisioner | The updated csi-provisioner includes a managed version that consumes no node resources, TLS-based NAS mounting on Alibaba Cloud Linux 3, and Ubuntu node support. | All regions | csi-provisioner |
| ACK One | Enhanced Fleet monitoring | ACK One Fleet monitoring now provides global monitoring across all associated clusters. A unified dashboard displays key component metrics, GitOps system metrics, and cost insights data for Fleet instances. | All regions | Fleet monitoring |
| Cloud-native AI suite | Cloud-native AI suite now free of charge | All cloud-native AI suite features are now free. Use them to build customized AI production systems on ACK with full-stack optimizations for AI and machine learning (ML) applications. | All regions | [Free component notice] Cloud-native AI suite is free of charge |
| ACK Edge | Disk storage for on-cloud node pools | On-cloud node pools in ACK Edge clusters now use the same Container Storage Interface (CSI) as ACK managed clusters. Mount disks using persistent volumes (PVs) and persistent volume claims (PVCs). | All regions | |
| | Access to data center workloads via Express Connect circuits | The API server of an ACK Edge cluster can access pods and Services deployed at the edge using Express Connect circuits. The edge controller manager (ECM) automates routing configuration from VPCs to edge pods. | All regions | Network management |
May 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | cloud-controller-manager v2.9.1 | cloud-controller-manager v2.9.1 supports cross-VPC NLB instance reuse, NLB server group weights, and mixed ECS-plus-pod server groups. This version also improves NLB IPv6 support. | All regions | Cloud Controller Manager |
| | Custom routing rules for ALB Ingresses | Create custom routing rules for ALB Ingresses using a visual interface. Route requests based on paths, domain names, or request headers, and configure actions to forward to specific Services or return fixed responses. | All regions | Customize the routing rules of an ALB Ingress |
| | NVMe disk multi-attach and reservation | Attach an NVMe disk to up to 16 instances simultaneously with multi-attach, and use the NVMe reservation feature to keep data consistent for applications such as databases and to enable faster failovers. | All regions | Use the multi-attach and NVMe reservation features of NVMe disks |
| | ossfs version switching via feature gate | In CSI 1.30.1 and later, enable a feature gate to switch to ossfs 1.91 or later for higher file system performance. | All regions | ossfs versions and Features of ossfs 1.91 and later and ossfs performance benchmarking |
| ACK One | CI pipelines for Golang projects in workflow clusters | ACK One workflow clusters—built on hosted Argo Workflows—provide high elasticity, auto-scaling, and zero O&M overhead. Use them to create CI pipelines for Golang projects at low cost. | All regions | Create CI pipelines for Golang projects in workflow clusters |
| Cloud-native AI suite | Dynamic dataset mount targets with Fluid | Fluid now dynamically mounts and updates dataset mount targets, including the corresponding PVs and PVCs, inside running containers without requiring a pod restart. | All regions | |
April 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | Anomaly diagnostics with ACK AI Assistant | ACK AI Assistant can now analyze and diagnose failed tasks, error logs, and component update failures in ACK clusters, reducing manual O&M effort. | All regions | Use ACK AI Assistant to help troubleshoot issues and find answers to your questions |
| | RRSA authentication for OSS volumes | Configure RAM Roles for Service Accounts (RRSA) authentication on persistent volumes to restrict API access to specific OSS volumes, enabling fine-grained access control and improving cluster security. | All regions | Use RRSA authentication to mount a statically provisioned OSS volume |
| | EIPs with Anti-DDoS (Enhanced Edition) for pods | ACK Extend Network Controller v0.9.0 creates and manages NAT gateways and elastic IP addresses (EIPs), and can bind EIPs with Anti-DDoS (Enhanced Edition) to pods exposed to the internet. | All regions | Associate an exclusive EIP with a pod |
| | New predefined security policies | Three new predefined security policies are added to the policy governance module: ACKServicesDeleteProtection, ACKPVSizeConstraint, and ACKPVCConstraint. | All regions | Predefined security policies of ACK |
| ACK Edge | Offline O&M tool for edge nodes | Perform O&M operations—such as business updates and configuration changes—on edge nodes that are offline due to network instability, using the ACK Edge offline O&M tool. | All regions | Offline O&M tool for edge nodes |
| ACK One | Multi-cluster gateway management | Microservices Engine (MSE) cloud-native gateways serve as multi-cluster gateways via the MSE Ingress controller hosted in ACK One. Manage north-south traffic visually, and implement active zone-redundancy, multi-cluster load balancing, and header-based traffic routing. | All regions | Manage gateways |
| | OSS access optimization for distributed Argo Workflows | ACK One Argo Workflows now supports multipart upload for large files, artifact auto garbage collection, and streaming artifact transmission for more efficient and secure OSS access. | All regions | Configure artifacts |
| Cloud-native AI suite | MLflow deployment in ACK clusters | Deploy MLflow in ACK clusters with a few clicks to track model training and manage the full ML model lifecycle, including models in MLflow Model Registry. | All regions | Configure MLflow Model Registry and Manage models in MLflow Model Registry |
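
The OSS access optimization entry above mentions multipart upload for large artifacts. The general idea behind multipart upload is to split a file into fixed-size byte ranges that can be sent, and retried, independently. The sketch below computes those ranges. It illustrates the concept only and is not how ACK One Argo Workflows implements it; the 10 MiB part size is an arbitrary assumption.

```python
def part_ranges(total_size, part_size):
    """Return (part_number, start, end) byte ranges; end is exclusive."""
    if total_size < 0 or part_size <= 0:
        raise ValueError("total_size must be >= 0 and part_size must be > 0")
    ranges = []
    start = 0
    number = 1  # part numbers conventionally start at 1
    while start < total_size:
        # The final part is allowed to be shorter than part_size.
        end = min(start + part_size, total_size)
        ranges.append((number, start, end))
        start = end
        number += 1
    return ranges


MIB = 1024 * 1024
# Hypothetical 25 MiB artifact split into 10 MiB parts.
parts = part_ranges(total_size=25 * MIB, part_size=10 * MIB)
for number, start, end in parts:
    print(f"part {number}: bytes {start}..{end - 1}")
```

Because each part is independent, a failed part can be re-sent on its own instead of restarting the whole transfer, which is the main benefit for large files.
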
March 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | Kubeconfig file management and recycle bin | View and manage issued kubeconfig files using Alibaba Cloud accounts, RAM users, or RAM roles with the required permissions. Delete or revoke permissions for kubeconfig files that pose security risks, and restore deleted files from the recycle bin within 30 days. | All regions | Use the kubeconfig recycle bin, Delete kubeconfig files, and Use ack-ram-tool to revoke the permissions of specified users on ACK clusters |
| | GPU device isolation | In exclusive GPU scheduling scenarios, isolate a faulty GPU device on a node to prevent new workloads from being scheduled to it. | All regions | GPU Device Plugin-related operations |
| | Metrics collection for a specific virtual node | In clusters with multiple virtual nodes, specify a single virtual node for metrics collection. This reduces the volume of data collected at once and lowers monitoring system load when many containers run on virtual nodes. | All regions | Collect the metrics of the specified virtual node |
February 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | ACK Virtual Node 2.11.0 | ACK Virtual Node 2.11.0 adds Windows instance support and Windows node scheduling semantics. It also enables the System Operations & Maintenance (SysOM) feature for kernel-level monitoring of ECI-based pods and improves certificate generation speed during pod creation. | All regions | ACK Virtual Node and Deploy the virtual node controller and use it to create Elastic Container Instance-based pods |
| ACK One | Knative support for registered clusters | Registered clusters now support Knative, the Kubernetes-based serverless framework. Knative integrates container creation, workload management, and event models to help you build enterprise-grade serverless platforms. | All regions | Knative overview |
| Zone-disaster recovery in hybrid cloud environments | Use ACK One to implement zone-disaster recovery for Kubernetes clusters running in data centers or third-party public clouds. ACK One manages traffic, applications, and clusters centrally, routes traffic across clusters, and supports millisecond-level failovers with Layer 7 routing via the managed MSE Ingress controller. | All regions | Use MSE multi-cluster gateways to implement hybrid disaster recovery in ACK One | |
| OSS object access acceleration with Fluid in registered clusters | Use Fluid, an open source, Kubernetes-native distributed dataset orchestrator, to accelerate access to OSS objects in registered clusters. | All regions | Use Fluid to accelerate access to OSS objects | |
| DingTalk chatbot notifications for GitOps application updates | Configure a DingTalk chatbot to receive notifications about GitOps application updates in multi-cluster continuous delivery scenarios. | All regions | Use a DingTalk chatbot to receive notifications about GitOps application updates | |
| Cloud-native AI suite | Ray cluster best practices | Create a Ray cluster in ACK and integrate it with Simple Log Service, Managed Service for Prometheus, and ApsaraDB for Redis for optimized logging, observability, and availability. The Ray autoscaler works with the ACK cluster autoscaler for efficient compute scaling. | All regions | Best practices for Ray clusters |
January 2024
| Product | Feature | Description | Region | References |
|---|---|---|---|---|
| Container Service for Kubernetes | ACK AI Assistant | ACK AI Assistant is built on a large language model (LLM) developed by the ACK team. It uses ACK team expertise, O&M system observability, and diagnostic experience to help you find answers and diagnose ACK and Kubernetes issues. | All regions | Use ACK AI Assistant to help troubleshoot issues and find answers to your questions |
| Tracing Analysis for ALB Ingresses based on Xtrace | Install the ALB Ingress controller and enable the Xtrace feature to collect tracing data. The Tracing Analysis service then provides trace mapping, request statistics, and trace topology for distributed applications. | All regions | Use AlbConfigs to enable Tracing Analysis based on Xtrace | |
| ACK Edge | Kubernetes 1.26 support | ACK Edge clusters now support Kubernetes 1.26, with improvements to edge node autonomy and edge node access. | All regions | Release notes for ACK Edge of Kubernetes 1.26 |
| Updated cloud-edge communication solution | ACK Edge clusters running Kubernetes 1.26 and later support network communication between on-cloud and edge node pools via Raven, which provides two modes: proxy mode for cross-domain HTTP communication between hosts, and tunnel mode for cross-domain container-to-container communication. | All regions | Cross-region O&M communication component Raven and raven-agent-ds | |
| ACK One | GitOps console access via custom domain name | Access the ACK One GitOps console through a custom domain name. Create a CNAME record mapping your custom domain to the default GitOps domain name, configure an SSL certificate, and log in with a CloudSSO account at https://<your-domain>. | All regions | Access the GitOps console through a custom domain name |
| Disaster recovery architectures for Kubernetes clusters | Design disaster recovery architectures combining ACK clusters (including third-party cloud clusters and on-premises clusters) with Alibaba Cloud networking, database, middleware, and observability services to build resilient business systems. | All regions | Disaster recovery architectures and solutions based on Kubernetes container clusters | |