Configure a cluster maintenance window to schedule planned operations, such as automatic Kubernetes version upgrades and CVE fixes, during off-peak hours. This helps keep your core services stable during peak times and minimizes the potential impact of changes on your business.
Applies to: ACK managed clusters
How cluster maintenance windows work
A cluster maintenance window is a preset, recurring time period for an ACK cluster. ACK performs automated operations and maintenance (O&M) during this window, such as automatic Kubernetes version upgrades and CVE vulnerability fixes.
ACK runs two types of maintenance:
- ACK-initiated maintenance: ACK automatically plans the execution time and sequence of O&M tasks based on task type and impact. No manual configuration is required.
- User-configured maintenance: Set a custom maintenance window to control when ACK performs O&M operations, for example to limit planned changes to off-peak hours.
A maintenance window defines when ACK is *allowed* to perform O&M operations. It does not guarantee that tasks run at the next available window. Final execution time depends on ACK's overall task scheduling and global grayscale orchestration rules.
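To make the idea of a recurring window concrete, the following is a minimal sketch of how one might test whether a given moment falls inside a weekly window. It is not ACK's scheduling logic. The function name `in_weekly_window`, its parameters, and the fixed UTC+8 offset are all illustrative assumptions, and the sketch assumes a window no longer than 24 hours.

```python
from datetime import datetime, time, timedelta, timezone

def in_weekly_window(now, weekdays, start, duration_hours, tz):
    """Return True if `now` falls inside a recurring weekly window.

    weekdays: set of ints, Monday=0 ... Sunday=6 (days on which the window opens)
    start: local wall-clock time at which the window opens
    duration_hours: window length, assumed to be at most 24 hours
    tz: tzinfo of the window's configured time zone
    """
    local = now.astimezone(tz)
    # A window that opened yesterday may still be open after midnight,
    # so check both today's and yesterday's opening.
    for days_back in (0, 1):
        day = (local - timedelta(days=days_back)).date()
        if day.weekday() not in weekdays:
            continue
        opened = datetime.combine(day, start, tzinfo=tz)
        if opened <= local < opened + timedelta(hours=duration_hours):
            return True
    return False

utc8 = timezone(timedelta(hours=8))  # example fixed offset, an assumption
monday_0100 = datetime(2024, 1, 1, 1, 0, tzinfo=utc8)  # 2024-01-01 is a Monday
print(in_weekly_window(monday_0100, {0}, time(0, 0), 4, utc8))  # True
```

Being inside the window only means an operation is allowed at that moment. Whether ACK actually runs a task then still depends on its own scheduling.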
[Figures: examples of a weekly maintenance window, showing the default behavior and a custom maintenance window]
Configure a maintenance window
Configure a maintenance window when you create a cluster, or modify the maintenance window of an existing cluster using any of the following methods:
- Console: Create an ACK managed cluster
- API: Use CreateCluster to set the window at cluster creation, or ModifyCluster to update an existing cluster.
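As a rough illustration of the API route, the sketch below builds a request-body fragment that could be passed to CreateCluster or ModifyCluster. The field names (`maintenance_window`, `enable`, `maintenance_time`, `duration`, `weekly_period`) and the value formats are assumptions and must be checked against the current ACK API reference. The helper `build_maintenance_window` is hypothetical and does not call any SDK.

```python
import json

VALID_DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
              "Friday", "Saturday", "Sunday")

def build_maintenance_window(start_time, duration_hours, weekdays):
    """Build a maintenance-window fragment for a cluster request body.

    All field names and formats below are assumptions, not confirmed by
    this document. `start_time` is passed through unchanged.
    """
    unknown = [d for d in weekdays if d not in VALID_DAYS]
    if unknown:
        raise ValueError(f"unknown weekday names: {unknown}")
    if duration_hours < 1:
        raise ValueError("duration must be at least 1 hour")
    return {
        "maintenance_window": {
            "enable": True,
            "maintenance_time": start_time,          # assumed format
            "duration": f"{duration_hours}h",        # assumed format
            "weekly_period": ",".join(weekdays),     # assumed format
        }
    }

# Example: a 4-hour window on weekends; the time string is a placeholder.
body = build_maintenance_window("16:00:00Z", 4, ["Saturday", "Sunday"])
print(json.dumps(body, indent=2))
```

Keeping the fragment as a plain dictionary makes it easy to merge into a larger request body, whichever SDK or HTTP client sends it.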
Best practices
| Setting | Recommendation |
|---|---|
| Period | You can choose a weekly or custom maintenance period. For production environments, use a fixed maintenance period and set the start time to off-peak hours, such as 00:00–04:00. |
| Duration | Set each window to at least 4 hours. Total available maintenance time per month must be at least 48 hours to prevent long-running tasks such as upgrades from failing due to an insufficient window. |
| Time zone | Select the time zone that matches your business location to make sure the window opens at the expected local time. |
| Application high availability | Use a multi-replica deployment and distribute workloads across multiple nodes. For critical services, configure a Pod Disruption Budget (PDB) to control the number of pods that can be disrupted simultaneously. |
| Multi-cluster setups | Stagger maintenance windows across production clusters to enable grayscale upgrades between clusters and improve overall service stability. |
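The duration recommendation in the table can be checked with simple arithmetic. The sketch below is one way to do that, assuming a month is counted conservatively as four weeks. How ACK itself counts monthly maintenance time is not stated here, and the function `check_window_budget` is hypothetical.

```python
def check_window_budget(duration_hours, windows_per_week, weeks_per_month=4):
    """Check a weekly schedule against the recommended minimums.

    Returns (monthly_hours, problems). The 4-hour and 48-hour thresholds
    come from the recommendations above; counting a month as 4 weeks is
    an assumption made for this sketch.
    """
    problems = []
    if duration_hours < 4:
        problems.append("each window should last at least 4 hours")
    monthly_hours = duration_hours * windows_per_week * weeks_per_month
    if monthly_hours < 48:
        problems.append("total monthly window time should be at least 48 hours")
    return monthly_hours, problems

# One 4-hour window per week gives 16 hours per month, which is too little.
print(check_window_budget(4, 1))
# Three 4-hour windows per week reach the 48-hour minimum.
print(check_window_budget(4, 3))
```

Under this assumption, a single weekly 4-hour window falls short of the monthly total, so either more days per week or a longer window is needed.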
Operations and maintenance windows
The following table lists ACK O&M operations and whether each one is restricted by the maintenance window.
| Operation | Follows maintenance window | Notes |
|---|---|---|
| Automatic Kubernetes version upgrades for the cluster control plane | Yes | — |
| Automatic scans and fixes for CVE vulnerabilities in the node operating system | Yes | — |
| Automatic upgrades of critical system components in Auto Mode clusters | Yes | — |
| Automatic updates of the node pool image ID in Auto Mode clusters | Yes | Only newly added nodes use the new image. Existing nodes are not directly upgraded. |
| Automatic responses to ECS system events (e.g., SystemMaintenance.Reboot) | Yes (with fallback) | If a maintenance window is available before the ECS scheduled execution time, ACK performs the response during that window. Otherwise, ACK acts one hour before the ECS scheduled time. |
| Control plane repairs | No | Self-healing is triggered immediately to maintain control plane stability. |
| Node auto-healing | No | Triggered immediately when a node in a managed node pool fails. |
| Node scaling | No | Driven by real-time workload demand (CPU, memory). Independent of scheduled windows. |
| Critical security vulnerability patches | No | ACK reserves the right to bypass maintenance windows for emergency fixes to protect cluster and service security. |
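The fallback rule for ECS system events in the table can be sketched as a small decision function. This is an illustration of the rule as described, not ACK's implementation. The function `plan_ecs_event_response` is hypothetical, and it simplifies each window to its opening time.

```python
from datetime import datetime, timedelta

def plan_ecs_event_response(now, scheduled, window_starts):
    """Pick when to respond to an ECS system event.

    now: current time
    scheduled: the ECS scheduled execution time of the event
    window_starts: opening times of upcoming maintenance windows
    Simplification (an assumption): a window counts as available if it
    opens at or after `now` and before `scheduled`.
    """
    usable = [w for w in window_starts if now <= w < scheduled]
    if usable:
        # Respond in the earliest window that opens before the event.
        return min(usable)
    # No window in time: fall back to one hour before the scheduled time.
    return scheduled - timedelta(hours=1)

now = datetime(2024, 1, 8, 0, 0)
scheduled = datetime(2024, 1, 10, 12, 0)
print(plan_ecs_event_response(now, scheduled, [datetime(2024, 1, 9, 0, 0)]))
# 2024-01-09 00:00:00
print(plan_ecs_event_response(now, scheduled, [datetime(2024, 1, 16, 0, 0)]))
# 2024-01-10 11:00:00
```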
FAQ
An O&M task failed. Will it be retried?
Yes. If an O&M task fails within the current maintenance window, ACK automatically retries it during the next available maintenance window.
An upgrade started during the maintenance window but didn't finish in time. What happens?
ACK handles unfinished O&M plans as follows:
- Unstarted batches: Automatically canceled and postponed to the next maintenance window.
- Started batches: Continue running until complete to maintain node status consistency. Once the current batch finishes, all remaining unstarted batches are canceled.
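The batch handling described above can be modeled as a small state mapping. The sketch below is only a model of the stated behavior. The state names (`done`, `running`, `pending`) and the function `settle_batches` are assumptions made for illustration.

```python
def settle_batches(batches):
    """Decide what happens to each upgrade batch when the window closes.

    batches: list of (name, state) pairs, where state is one of
    "done", "running", or "pending" (hypothetical state names).
    """
    outcome = {}
    for name, state in batches:
        if state == "pending":
            # Unstarted batches are canceled and wait for the next window.
            outcome[name] = "postponed to next window"
        elif state == "running":
            # Started batches finish so that nodes stay consistent.
            outcome[name] = "runs to completion"
        else:
            outcome[name] = "already complete"
    return outcome

plan = [("batch-1", "done"), ("batch-2", "running"), ("batch-3", "pending")]
for name, result in settle_batches(plan).items():
    print(name, "->", result)
```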