This topic answers frequently asked questions (FAQs) about node auto scaling and provides solutions to common issues.
Known limitations
Inability to precisely predict available node resources
The available resources on a new node may not match the instance type's specifications. This is because the underlying OS and system daemons on the ECS instance consume resources. For more information, see Why is the memory size of my ECS instance different from the size defined in its instance type? Due to this overhead, the resource estimates used by the cluster-autoscaler may be slightly higher than the actual allocatable resources on a new node. When you configure pod requests, note the following:
The total requested resources for a pod, including CPU, memory, and disk, must be less than the instance type's specifications. As a best practice, do not request more than 70% of a node's total capacity.
To determine if a node has sufficient resources, the cluster-autoscaler considers only the resource requests of Kubernetes pods, including pending pods and DaemonSet pods. If you run static pods that are not managed as DaemonSets, you must manually account for and reserve resources for them.
If a pod requests a large amount of resources, such as more than 70% of a node's resources, test in advance to confirm that the pod can be scheduled to a node of the same instance type.
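The 70% guideline above can be checked before a workload is deployed. The following is a minimal sketch, not part of the cluster-autoscaler: the `Resources` type, the `fits_with_headroom` helper, and the sample instance sizes are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """CPU in millicores and memory in MiB (hypothetical helper type)."""
    cpu_millicores: int
    memory_mib: int


def fits_with_headroom(request: Resources, instance: Resources,
                       max_fraction: float = 0.7) -> bool:
    """Return True if every requested dimension stays within
    max_fraction of the instance type's advertised specification."""
    return (
        request.cpu_millicores <= instance.cpu_millicores * max_fraction
        and request.memory_mib <= instance.memory_mib * max_fraction
    )


# Assumed example: an instance type advertised as 4 vCPU / 16 GiB.
instance = Resources(cpu_millicores=4000, memory_mib=16 * 1024)
print(fits_with_headroom(Resources(2000, 8 * 1024), instance))
print(fits_with_headroom(Resources(3500, 8 * 1024), instance))
```

A request that fails this check is not necessarily unschedulable, but it is the kind of pod that should be tested against a real node of the same instance type first.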
Limited support for scheduling policies
The cluster-autoscaler supports only a limited set of scheduling policies for determining if an unschedulable pod can be scheduled to a node pool with auto scaling enabled. For more information, see Supported scheduling policies.
Only resource-based ResourcePolicy is supported
When you use a ResourcePolicy to customize the priority of elastic resources, only policies of the resource type are supported. For more information, see Customize priority-based scheduling for elastic resources.
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: nginx
  namespace: default
spec:
  selector:
    app: nginx
  units:
  - resource: ecs
  - resource: eci
Cannot scale out a specific instance type
If your node pool is configured with multiple instance types, you cannot direct the cluster-autoscaler to provision a specific instance type during a scale-out. The cluster-autoscaler models the capacity of the entire node pool based on the minimum value for each resource dimension across all configured instance types. For more information, see Resource calculation in multi-type scaling groups.
Zone-dependent pods cannot trigger scale-out
If your node pool spans multiple availability zones, a pod with a dependency on a specific zone may not trigger a scale-out. This can occur if the pod requires a specific availability zone due to a PersistentVolumeClaim (PVC) bound to a volume in that zone, or a nodeSelector that targets the zone. In these cases, the cluster-autoscaler might fail to add a node in the required availability zone. For more scenarios in which the cluster-autoscaler fails to add nodes, see Reasons for scale-out failures.
Storage constraints
When making scaling decisions, the node auto scaling component does not consider a pod's specific storage constraints, such as a requirement for a specific availability zone or disk type (for example, ESSD) for its PersistentVolumes (PVs).
If your application has such storage dependencies, configure a dedicated node pool for it before enabling node auto scaling. Presetting the availability zone, instance type, and disk type in the node pool configuration ensures that any newly provisioned nodes meet the pod's storage mounting requirements. This prevents pod scheduling or startup failures caused by resource mismatches.
In addition, ensure that your pods do not reference a PVC that is stuck in a Terminating state. A pod whose PVC is stuck in a Terminating state will continuously fail to schedule. This can mislead the cluster-autoscaler into making incorrect scaling decisions, such as attempting to evict the pod.
Scale-out behavior
cluster-autoscaler supported scheduling policies
If your application requires a scheduling policy that is not listed below, submit a request on the Connect platform. To avoid incorrect scaling, wait for the policy to be supported before using this feature.
The cluster-autoscaler uses the following scheduling policies:
PodFitsResources
GeneralPredicates
PodToleratesNodeTaints
MaxGCEPDVolumeCount
NoDiskConflict
CheckNodeCondition
CheckNodeDiskPressure
CheckNodeMemoryPressure
CheckNodePIDPressure
CheckVolumeBinding
MaxAzureDiskVolumeCount
MaxEBSVolumeCount
ready
NoVolumeZoneConflict
Resources that the cluster-autoscaler can simulate for scheduling
The cluster-autoscaler can simulate and evaluate the following resources:
cpu
memory
sigma/eni
ephemeral-storage
aliyun.com/gpu-mem (shared GPUs only)
nvidia.com/gpu
To use other resource types, see Configure custom resources.
Reasons for scale-out failures
The cluster-autoscaler can fail to add nodes for several reasons. Review the following to diagnose the issue:
Node auto scaling works only on node pools where it is enabled. Make sure that the node auto scaling feature is enabled and that the scaling mode of the node pool is set to Auto. For more information, see Enable node auto scaling.
The instance types in the scaling group cannot meet the pod's resource requests. The advertised specifications of an ECS instance type represent its total capacity. However, ACK reserves a portion of these resources to ensure the stable operation of the OS kernel, system services, and essential Kubernetes daemons. This reservation creates a critical difference between the node's total capacity and its allocatable resources, that is, the amount of CPU, memory, and storage that is actually available for pods.
The standard cluster-autoscaler bases its scaling decisions on the resource reservation policy from Kubernetes v1.28 and earlier.
To apply the resource reservation policy for versions 1.28 and later, we recommend switching to node instant scaling (versions 1.28 and later include a new built-in resource reservation algorithm; see Enable node instant scaling), or manually configuring custom resources for the node pool by defining and maintaining resource reservation values in the node pool.
During instance creation, some resources are consumed by components such as virtualization and the operating system. For more information, see Why is the memory size of my instance different from the size defined by its instance type?.
Running components such as kubelet, kube-proxy, Terway, and the container runtime requires some node resources. For more information, see Node resource reservation policy.
System components are installed on nodes by default. A pod's resource requests must be less than the resources defined by the instance type.
A pod with constraints on availability zones cannot trigger the scale-out of a node pool configured with multiple availability zones.
The authorization steps were not completed. Authorization is a cluster-level operation and must be performed for each cluster. For more information about authorization, see the Usage notes section.
The node pool with auto scaling enabled experiences exceptions. The autoscaler includes a dampening mechanism to prevent repeated failures. If it provisions nodes that subsequently fail to join the cluster or remain in a NotReady state for an extended period, it temporarily pauses further scaling operations. Common exceptions include:
The instance fails to join the cluster and times out.
The node is NotReady and times out.
This dampening approach prevents further scaling until the faulty nodes are resolved, which keeps later scaling decisions accurate.
The cluster-autoscaler itself runs as a pod within the cluster. If there are no worker nodes, the autoscaler pod cannot run and therefore cannot provision new nodes. Configure your node pools with a minimum of two nodes to ensure high availability for core cluster components.
If your use case requires scaling out from zero nodes or scaling down to zero nodes, use the node instant scaling feature.
Resource calculation in multi-type scaling groups
For a scaling group with multiple instance types, the cluster-autoscaler models the group's capacity based on the minimum value for each resource (like CPU and memory) across all configured instance types.
For example, consider a scaling group configured with two instance types: one with 4 vCPU and 32 GiB of memory, and another with 8 vCPU and 16 GiB of memory. To determine the baseline for this group, the autoscaler calculates the minimums for each resource: minimum CPU is min(4, 8) = 4 vCPU, and minimum memory is min(32, 16) = 16 GiB. As a result, the autoscaler treats this entire scaling group as if it can only provision nodes with 4 vCPU and 16 GiB of memory. Therefore, if a pending pod's requests exceed 4 vCPU or 16 GiB of memory, it will not trigger a scale-out.
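The per-dimension minimum described above can be reproduced with a few lines of Python. This is an illustrative sketch only: the function names and dictionary keys are hypothetical, and the sketch ignores resource reservation, which lowers the usable values further.

```python
def modeled_capacity(instance_types):
    """Per-dimension minimum across all instance types in a scaling group."""
    return {
        "cpu": min(t["cpu"] for t in instance_types),
        "memory_gib": min(t["memory_gib"] for t in instance_types),
    }


def can_trigger_scale_out(pod_request, instance_types):
    """A pod can trigger a scale-out only if it fits the modeled capacity."""
    capacity = modeled_capacity(instance_types)
    return (
        pod_request["cpu"] <= capacity["cpu"]
        and pod_request["memory_gib"] <= capacity["memory_gib"]
    )


# The two instance types from the example above.
group = [
    {"cpu": 4, "memory_gib": 32},
    {"cpu": 8, "memory_gib": 16},
]
print(modeled_capacity(group))  # {'cpu': 4, 'memory_gib': 16}
print(can_trigger_scale_out({"cpu": 6, "memory_gib": 8}, group))  # False
```

The second call returns False even though the 8 vCPU instance type could host the pod, which is exactly the limitation described in this section.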
If you have configured multiple instance types and also need to account for resource reservation, see Reasons for scale-out failures.
Node pool selection for scale-out
When a pod is unschedulable, the cluster-autoscaler simulates which auto-scaled node pool could accommodate it. This simulation evaluates each node pool based on its configuration, including labels, taints, and available instance types. A node pool is considered a candidate for scale-out if the simulation shows that a new node from that pool would allow the pod to be scheduled. If multiple node pools meet this condition, the node auto scaling component defaults to the least-waste principle. This strategy selects the node pool that leaves the fewest unused resources after scheduling the pod.
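To make the least-waste principle concrete, the sketch below scores each candidate node pool by the share of CPU and memory that would remain unused after the pod is placed. The scoring formula, the function names, and the pool definitions are assumptions made for illustration. The text above does not specify the exact calculation, and a real simulation also evaluates labels, taints, and other scheduling constraints.

```python
def waste_score(pod, node):
    """Unused CPU fraction plus unused memory fraction after placing the pod
    (assumed scoring formula, for illustration only)."""
    cpu_waste = (node["cpu"] - pod["cpu"]) / node["cpu"]
    mem_waste = (node["memory_gib"] - pod["memory_gib"]) / node["memory_gib"]
    return cpu_waste + mem_waste


def pick_node_pool(pod, pools):
    """Return the name of the candidate pool with the lowest waste, or None."""
    candidates = {
        name: node
        for name, node in pools.items()
        if pod["cpu"] <= node["cpu"] and pod["memory_gib"] <= node["memory_gib"]
    }
    if not candidates:
        return None
    return min(candidates, key=lambda name: waste_score(pod, candidates[name]))


# Hypothetical node pools, each described by the size of one new node.
pools = {
    "pool-small": {"cpu": 4, "memory_gib": 8},
    "pool-large": {"cpu": 16, "memory_gib": 64},
}
print(pick_node_pool({"cpu": 2, "memory_gib": 4}, pools))   # pool-small
print(pick_node_pool({"cpu": 8, "memory_gib": 16}, pools))  # pool-large
```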
Configure custom resources
You can configure ECS tags with a specific prefix for a node pool with scaling enabled. This allows the cluster-autoscaler to identify the custom resources available in the node pool or the precise values of specified resources.
k8s.io/cluster-autoscaler/node-template/resource/{resource_name}:{resource_size}
Example:
k8s.io/cluster-autoscaler/node-template/resource/hugepages-1Gi:2Gi
Reasons for auto scaling enablement failures
Enabling auto scaling for a node pool may fail for the following reasons:
The node pool is the default node pool. The node auto scaling feature cannot be enabled on the default node pool.
You cannot enable auto scaling on a node pool that contains manually added nodes. To resolve this, remove the manually added nodes first or create a new node pool with auto scaling enabled.
The node pool contains subscription-based instances. The node auto scaling feature does not support nodes with a subscription billing method.
Scale-in behavior
Why does cluster-autoscaler fail to scale down nodes?
Several reasons can prevent the cluster-autoscaler from scaling in a node. Check for the following cases:
The cluster-autoscaler does not consider a node for scale-in if the total resource requests of the pods running on it exceed the configured scale-in utilization threshold.
By default, the cluster-autoscaler does not remove nodes that are running pods from the kube-system namespace.
The node cannot be scaled in if it is running a pod with a strict scheduling constraint (such as a nodeSelector or a required nodeAffinity) that prevents it from being rescheduled onto any other available node in the cluster.
The pods on the node are protected by a PodDisruptionBudget whose minimum has already been reached.
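The first condition in the list above, the utilization threshold, can be approximated as follows. This sketch rests on assumptions: it takes utilization to be the larger of the CPU and memory request ratios, and the default threshold of 0.5 is a placeholder rather than a value taken from this document. Use the threshold configured for your cluster.

```python
def node_utilization(pod_requests, allocatable):
    """Larger of the CPU and memory request ratios on a node
    (assumed definition of utilization)."""
    cpu = sum(p["cpu_m"] for p in pod_requests) / allocatable["cpu_m"]
    mem = sum(p["memory_mib"] for p in pod_requests) / allocatable["memory_mib"]
    return max(cpu, mem)


def is_scale_in_candidate(pod_requests, allocatable, threshold=0.5):
    """A node is considered for scale-in only if its requests do not
    exceed the threshold (0.5 is a placeholder default)."""
    return node_utilization(pod_requests, allocatable) <= threshold


allocatable = {"cpu_m": 4000, "memory_mib": 8192}
lightly_loaded = [
    {"cpu_m": 500, "memory_mib": 1024},
    {"cpu_m": 250, "memory_mib": 512},
]
busy = [{"cpu_m": 3000, "memory_mib": 1024}]
print(is_scale_in_candidate(lightly_loaded, allocatable))
print(is_scale_in_candidate(busy, allocatable))
```

Passing this check makes a node a candidate only. The other conditions in the list can still block its removal.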
For more frequently asked questions and answers from the open source community, see the cluster-autoscaler FAQ.
Control DaemonSet eviction
The cluster-autoscaler decides whether to evict DaemonSet pods based on the Evict DaemonSet Pods setting. This is a cluster-wide configuration that applies to all DaemonSet pods in the cluster. For more information, see Step 1: Enable node auto scaling for a cluster. You can override this global setting for individual DaemonSets by adding an annotation to their pods. To explicitly enable eviction for a specific DaemonSet's pods, add the "cluster-autoscaler.kubernetes.io/enable-ds-eviction": "true" annotation.
Similarly, to explicitly disable eviction for a specific DaemonSet's pods, add the annotation "cluster-autoscaler.kubernetes.io/enable-ds-eviction": "false" to the DaemonSet pod.
If Evict DaemonSet Pods is disabled, the enable-ds-eviction: "true" annotation takes effect only for DaemonSet pods on non-empty nodes. To enable eviction of DaemonSet pods from empty nodes, you must first enable the global DaemonSet eviction setting.
These annotations must be applied to the pod template in the DaemonSet manifest, not to the DaemonSet object itself.
These annotations have no effect on pods that are not part of a DaemonSet.
By default, the cluster-autoscaler evicts DaemonSet pods in a non-blocking manner. This means it proceeds to the next step without waiting for the DaemonSet pod's eviction to complete. If you need the cluster-autoscaler to wait for a specific DaemonSet pod to be fully evicted before proceeding with the scale-in, add the annotation "cluster-autoscaler.kubernetes.io/wait-until-evicted": "true" to the pod in addition to the enable-ds-eviction annotation.
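The interaction between the global Evict DaemonSet Pods setting and the per-pod annotation can be summarized as a small decision function. This is a simplified model of the rules stated above, not the autoscaler's own code. The function name and its parameters are hypothetical.

```python
ENABLE_KEY = "cluster-autoscaler.kubernetes.io/enable-ds-eviction"


def should_evict_daemonset_pod(global_evict: bool, annotations: dict,
                               node_is_empty: bool) -> bool:
    """Simplified model of the DaemonSet eviction rules described above."""
    override = annotations.get(ENABLE_KEY)
    if override == "false":
        # An explicit opt-out always wins.
        return False
    if override == "true":
        # With the global setting disabled, the opt-in takes effect
        # only on non-empty nodes.
        return global_evict or not node_is_empty
    # No annotation: fall back to the cluster-wide setting.
    return global_evict


opt_in = {ENABLE_KEY: "true"}
print(should_evict_daemonset_pod(False, opt_in, node_is_empty=False))  # True
print(should_evict_daemonset_pod(False, opt_in, node_is_empty=True))   # False
```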
What types of Pods can prevent the cluster-autoscaler from removing a node?
The cluster-autoscaler cannot remove a node if it contains pods not created by a native Kubernetes controller (such as a Deployment, ReplicaSet, Job, or StatefulSet), or if pods on the node cannot be safely terminated or migrated. For a comprehensive list of conditions that can block a node scale-in, see What types of pods can prevent CA from removing a node? in the Kubernetes documentation.
Extension support
Does the cluster-autoscaler support CustomResourceDefinitions (CRDs)?
No, the cluster-autoscaler currently supports only standard Kubernetes objects and does not support Kubernetes CRDs.
Control scaling behavior by using pods
How to delay cluster-autoscaler's scale-up reaction time for unschedulable Pods?
You can set a scale-out delay for each pod by using the cluster-autoscaler.kubernetes.io/pod-scale-up-delay annotation. The cluster-autoscaler will only consider the pod for a scale-out if it remains unschedulable after this delay period has passed. This gives the native Kubernetes scheduler extra time to try to place the pod on existing nodes before a scale-out is triggered. Example annotation: "cluster-autoscaler.kubernetes.io/pod-scale-up-delay": "600s".
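The effect of the delay annotation can be sketched as a simple time comparison. The code below is illustrative only: `parse_delay` handles just single-unit values such as "600s", "10m", or "1h", and the `unschedulable_since` parameter is an assumed stand-in for whatever timestamp the autoscaler actually measures the delay from.

```python
import re
from datetime import datetime, timedelta, timezone

DELAY_KEY = "cluster-autoscaler.kubernetes.io/pod-scale-up-delay"
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_delay(value: str) -> timedelta:
    """Parse a single-unit duration such as '600s' (simplified parser)."""
    match = re.fullmatch(r"(\d+)([smh])", value)
    if match is None:
        raise ValueError(f"unsupported delay format: {value!r}")
    return timedelta(seconds=int(match.group(1)) * _UNIT_SECONDS[match.group(2)])


def eligible_for_scale_out(annotations: dict, unschedulable_since: datetime,
                           now: datetime) -> bool:
    """A pod counts toward a scale-out only after its delay has elapsed."""
    delay = parse_delay(annotations.get(DELAY_KEY, "0s"))
    return now - unschedulable_since >= delay


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
annotations = {DELAY_KEY: "600s"}
print(eligible_for_scale_out(annotations, start, start + timedelta(minutes=5)))
print(eligible_for_scale_out(annotations, start, start + timedelta(minutes=11)))
```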
How to use Pod annotations to affect cluster-autoscaler node scale-in?
Use the pod annotation cluster-autoscaler.kubernetes.io/safe-to-evict to tell the cluster-autoscaler whether a pod can be safely evicted during a scale-in.
To prevent a node from being scaled in, add the "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" annotation to a pod running on the node. The presence of this pod will block the node from being terminated by the autoscaler.
To allow a node to be scaled in, add the "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" annotation to a pod. This explicitly marks the pod as safe to evict during a scale-in operation.
Control scaling behavior by using nodes
How to prevent cluster-autoscaler from scaling in a node?
To prevent the cluster-autoscaler from scaling in a specific node, add the "cluster-autoscaler.kubernetes.io/scale-down-disabled": "true" annotation to it. To add this annotation, run the following kubectl command, replacing <nodename> with the name of your target node:
kubectl annotate node <nodename> cluster-autoscaler.kubernetes.io/scale-down-disabled=true
cluster-autoscaler component
Upgrade the cluster-autoscaler
To upgrade the cluster-autoscaler on a cluster where auto scaling is enabled, follow these steps:
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, open the Node Pools page.
To the right of Node Scaling, click Edit. In the panel that appears, click OK to upgrade the component to the latest version.
What operations trigger cluster-autoscaler to update automatically?
To ensure that the cluster-autoscaler configuration is up-to-date and its version remains compatible with your cluster, the following operations trigger an automatic update or reconciliation of the cluster-autoscaler:
Updating the auto scaling configuration.
Creating, deleting, or updating a node pool where auto scaling is enabled.
Successfully upgrading the cluster's Kubernetes version.
Why do node scaling activities still fail in an ACK managed cluster that has been granted role permissions?
This issue occurs if the required token (addon.aliyuncsmanagedautoscalerrole.token) is missing from a Secret in the kube-system namespace. By default, ACK uses the cluster's WorkerRole to enable auto scaling capabilities, and this token is essential for authenticating those operations. To resolve this, re-apply the required policy to the cluster's WorkerRole by using the authorization wizard in the ACK console.
On the Clusters page of the ACK console, click the name of your cluster. In the left navigation pane, open the Node Pools page.
On the Node Pools page, to the right of Node Scaling, click Enable.
Follow the on-screen instructions to grant the required permissions to the KubernetesWorkerRole role and attach the AliyunCSManagedAutoScalerRolePolicy system policy.
Manually restart the cluster-autoscaler Deployment (for node auto scaling) or the ack-goatscaler Deployment (for node instant scaling) in the kube-system namespace to apply the changes immediately.