This topic answers common questions about the node instant scaling feature and provides solutions to related issues.
Known limitations
Feature limitations
Node instant scaling does not support the swift mode.
Each scale-out batch can add up to 180 nodes to a node pool.
Scale-in cannot be disabled for a specific cluster.
Note: To disable scale-in for a specific node, see Node instant scaling FAQ.
Node instant scaling cannot check the inventory of preemptible instances. If the Billing Method of the node pool is set to preemptible instances and Use Pay-as-you-go Instances When Spot Instances Are Insufficient is enabled for the node pool, pay-as-you-go instances can be scaled out even when the inventory of preemptible instances is sufficient.
Resource prediction limitations
The available resources on a newly provisioned node may not precisely match the instance type's specifications because the underlying ECS system consumes some resources. For more information, see Why is the memory size of a purchased instance different from the memory size defined in its instance type? As a result, the resource estimates from the node instant scaling add-on may be higher than the actual available resources on a node, making precise prediction difficult. When you configure pod requests, consider the following points.
-
When you configure pod requests, the total requested resources, including CPU, memory, and disk, must be less than the instance type's capacity. We recommend that the total requests do not exceed 70% of the node's resources.
-
When the node instant scaling add-on evaluates whether a node has sufficient resources, it only considers Kubernetes pods such as pending pods and DaemonSet pods. If the node runs Static Pods that are not part of a DaemonSet, you must manually reserve resources for these pods.
-
If a pod requests a large amount of resources, such as more than 70% of a node's resources, you must test in advance to confirm that the pod can be scheduled on a node of the same instance type.
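The 70% guideline above can be checked with simple integer arithmetic before you deploy. The following is a minimal sketch, not part of the add-on: the function name is hypothetical, and the capacity value is assumed to come from the instance type you plan to use.

```shell
# Hypothetical helper: succeeds (exit code 0) when a request stays within
# 70% of a node's capacity. Both values must use the same unit, for example
# CPU millicores or memory MiB.
within_recommended_requests() {
  local request="$1" capacity="$2"
  # Integer form of: request <= 0.7 * capacity
  [ $((request * 100)) -le $((capacity * 70)) ]
}

# Example: 2500m of CPU requests on a 4-core (4000m) instance type.
if within_recommended_requests 2500 4000; then
  echo "within the recommended 70%"
else
  echo "above the recommended 70%: test scheduling in advance"
fi
```

Because actual allocatable resources are lower than the advertised specification, a request that passes this check can still fail to schedule when it is close to the limit.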
Limited simulatable resource types
The node instant scaling add-on supports only a limited number of resource types for scaling simulations. For more information, see Which resource types can node instant scaling evaluate?
Storage constraints
The node auto scaling component does not consider a pod's specific storage constraints when making decisions, such as requiring a specific availability zone or disk type (for example, ESSD) for its PersistentVolumes (PVs).
If your application has such storage dependencies, configure a dedicated node pool for it before enabling node auto scaling. Presetting the availability zone, instance type, and disk type in the node pool configuration ensures that any newly provisioned nodes meet the pod's storage mounting requirements. This prevents pod scheduling or startup failures caused by resource mismatches.
Scale-out behavior
Which resource types can node instant scaling evaluate?
Scaling simulations use the following resource types:
cpu
memory
ephemeral-storage
aliyun.com/gpu-mem # Only shared GPUs are supported.
nvidia.com/gpu
Does node instant scaling support scaling out appropriate instance types in a node pool based on pod resource requests?
Yes. The autoscaler intelligently selects the most resource-efficient instance type that can satisfy the pod's requirements. For example, consider a node pool configured with two instance types: a smaller 4-core, 8 GB type and a larger 12-core, 48 GB type. If a pod requests 2 cores, node instant scaling prioritizes creating a 4-core, 8 GB node for the pod. If you later replace the 4-core, 8 GB type with an 8-core, 16 GB type, node instant scaling automatically adapts and places new pods with similar requests on 8-core, 16 GB nodes.
How does node instant scaling select an instance type by default when a node pool contains multiple instance types?
Based on the instance types configured for the node pool, node instant scaling periodically excludes instance types with insufficient inventory, then sorts the remaining instance types by the number of CPU cores, and checks them one by one to determine if they can satisfy the resource requests of unschedulable pods. Once a suitable instance type is found, node instant scaling selects this instance type and stops checking the remaining ones.
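The selection order described above can be sketched as a small filter, sort, and first-match pipeline. This is an illustration only, not the add-on's implementation: the function name, the input format, and the instance type names are hypothetical, ascending CPU order is an assumption, and the real evaluation also accounts for resource reservations and other resource types.

```shell
# Hypothetical sketch of the default selection order. Candidates arrive on
# stdin, one per line: "<name> <cpu_cores> <memory_gib> <in_stock: yes|no>".
# Step 1: drop types with insufficient inventory.
# Step 2: sort the rest by CPU cores (ascending order is assumed).
# Step 3: print the first type that satisfies the request, then stop.
select_instance_type() {
  local want_cpu="$1" want_mem="$2"
  awk '$4 == "yes"' |
    sort -k2,2n |
    awk -v c="$want_cpu" -v m="$want_mem" \
      '$2 >= c && $3 >= m { print $1; exit }'
}

# The instance type names below are placeholders, not real ECS types.
printf '%s\n' 'type-large 12 48 yes' 'type-small 4 8 yes' |
  select_instance_type 2 4
```

With both types in stock, a 2-core, 4 GiB request resolves to the smaller type. If the smaller type is out of stock, the same request falls through to the larger one.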
How can I track real-time changes to the instance type inventory in a node pool when using Node Instant Scaling?
Node instant scaling provides health metrics that are periodically updated to reflect inventory changes for instance types in an auto-scaling node pool. When the inventory status of an instance type changes, node instant scaling sends a Kubernetes event named InstanceInventoryStatusChanged. You can subscribe to these events to monitor the inventory health of your node pool, assess whether the current inventory is sufficient, and adjust your instance type configuration in advance. For more information, see View the health status of node instant scaling.
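One way to inspect these events from the command line is to filter cluster events with kubectl. The sketch below assumes that the event name InstanceInventoryStatusChanged appears in the event's reason field, which this topic does not confirm. The function name is hypothetical.

```shell
# Hypothetical helper: lists inventory status events in all namespaces.
# Assumes the event name is stored in the event's "reason" field.
# Extra arguments (for example --watch) are passed through to kubectl.
list_inventory_events() {
  kubectl get events --all-namespaces \
    --field-selector reason=InstanceInventoryStatusChanged "$@"
}

# Usage (requires access to the cluster):
#   list_inventory_events            # one-off listing
#   list_inventory_events --watch    # stream new events as they arrive
```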
Preventing scale-out failures
We recommend that you use the following configurations to expand the range of available instance types:
-
Configure multiple instance types for the node pool or use a generic configuration.
-
Configure multiple zones for the node pool.
Why does node instant scaling fail to launch a node?
Check for the following issues:
-
The instance types configured for the node pool have insufficient inventory.
-
The instance types configured for the node pool cannot meet the pod's resource requests. The advertised specifications for an ECS instance type differ from the actual resources available after provisioning. The following resource reservations must be considered at runtime:
-
During instance creation, some resources are consumed by virtualization and the operating system. For more information, see Why is the memory size of a purchased instance different from the memory size defined in its instance type?
-
ACK requires a certain amount of node resources to run Kubernetes add-ons and system processes, such as kubelet, kube-proxy, Terway, and the container runtime. For more information about the reservation policy, see Node resource reservation policy.
-
By default, system add-ons are installed on nodes. The resources requested by a pod must be less than the instance specification.
-
You have not completed the authorization as described in Enable instant elasticity for nodes.
-
The node pool fails to scale out new instances due to underlying infrastructure issues.
To ensure the accuracy of subsequent scaling operations and system stability, the node instant scaling add-on pauses scaling operations until you resolve the issues with abnormal nodes.
How do I configure custom resources for node pools that have node instant scaling enabled?
For a node pool that has node instant scaling enabled, you can add an ECS tag with a fixed prefix. The tag lets the scaling add-on identify the custom resources available in the node pool, or the exact values of specified resources.
The node instant scaling add-on, ACK GOATScaler, must be v0.2.18 or later. To upgrade the add-on, see Manage add-ons.
goatscaler.io/node-template/resource/{resource-name}:{resource-size}
Example:
goatscaler.io/node-template/resource/hugepages-1Gi:2Gi
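To make the tag format concrete, the following sketch splits a tag of this form into its resource name and size. It is only an illustration of the format shown above: the function name is hypothetical, and how the string is divided between the ECS tag key and tag value is not covered here.

```shell
# Hypothetical helper: splits a node-template tag of the form
# goatscaler.io/node-template/resource/{resource-name}:{resource-size}
# into its resource name and size.
parse_node_template_tag() {
  local tag="$1" prefix="goatscaler.io/node-template/resource/"
  case "$tag" in
    "$prefix"*:*) ;;
    *) echo "not a node-template resource tag: $tag" >&2; return 1 ;;
  esac
  local rest="${tag#$prefix}"
  # Everything before the first colon is the name; the remainder is the size.
  echo "resource=${rest%%:*} size=${rest#*:}"
}

parse_node_template_tag "goatscaler.io/node-template/resource/hugepages-1Gi:2Gi"
```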
Scale-in behavior
Why does node instant scaling fail to scale in nodes?
Check for the following issues:
-
The option to scale in only empty nodes is enabled, but the target node is not empty.
-
The total resource requests of pods on the node exceed the configured scale-in utilization threshold.
-
The node is running pods from the kube-system namespace.
-
A pod on the node has strict scheduling requirements that prevent it from being rescheduled to other nodes.
-
A pod on the node is protected by a PodDisruptionBudget (PDB), and removing the node would violate the PDB.
-
If a node was recently added, node instant scaling will not scale it in for the first 10 minutes.
-
An offline node exists. An offline node is a running instance that does not have a corresponding node object in the cluster. Starting from v0.5.3, the node instant scaling add-on can automatically clean up these instances. For earlier versions, you must remove them manually.
Version 0.5.3 is in canary release. To request access, submit a ticket. For information about how to upgrade the add-on, see Manage add-ons.
You can check for offline nodes on the Node Pools page. Click Sync Node Pool, then click Details for the relevant node pool, and view the Nodes tab.
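Several of the causes above can be inspected with standard kubectl queries. The helpers below are a minimal sketch with hypothetical function names. They only list information and do not change the cluster.

```shell
# Hypothetical helpers for inspecting a node that is not being scaled in.

# Lists every pod scheduled on the given node, including kube-system pods.
pods_on_node() {
  kubectl get pods --all-namespaces --field-selector "spec.nodeName=$1" -o wide
}

# Lists PodDisruptionBudgets, which can block the eviction of pods.
list_pdbs() {
  kubectl get poddisruptionbudgets --all-namespaces
}

# Usage (requires access to the cluster):
#   pods_on_node <nodename>
#   list_pdbs
```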
What types of Pods can prevent Node Instant Scaling from removing a node?
If a pod is not created by a native Kubernetes controller, such as a Deployment, ReplicaSet, Job, or StatefulSet, or if pods on a node cannot be safely terminated or migrated, node instant scaling may be unable to remove the node.
Controlling scaling with pods
How do I use a pod to control whether node instant scaling scales in a node?
You can use the goatscaler.io/safe-to-evict pod annotation to control whether node instant scaling can scale in the node where the pod is running.
-
To prevent a node from being scaled in, add the annotation "goatscaler.io/safe-to-evict": "false" to a pod on that node.
-
To allow a node to be scaled in, add the annotation "goatscaler.io/safe-to-evict": "true" to a pod on that node.
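For a pod that is already running, the annotation can be set with kubectl annotate. The wrapper below is a sketch with a hypothetical function name. The pod name and namespace in the usage line are placeholders.

```shell
# Hypothetical helper: sets the safe-to-evict annotation on a running pod.
# $1: pod name, $2: namespace, $3: "true" or "false"
set_safe_to_evict() {
  local pod="$1" namespace="$2" value="$3"
  kubectl annotate pod "$pod" --namespace "$namespace" --overwrite \
    "goatscaler.io/safe-to-evict=$value"
}

# Usage (requires access to the cluster):
#   set_safe_to_evict my-app-0 default false
```

An annotation set this way is lost when the pod is recreated. For pods managed by a controller such as a Deployment, set the annotation in the pod template metadata instead.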
Controlling scaling with nodes
How do I specify which nodes node instant scaling deletes during a scale-in?
To force the deletion of a node, add the goatscaler.io/force-to-delete:true:NoSchedule taint to the node. Node instant scaling then immediately deletes the node without checking its status or draining its pods. Use this feature with caution because it can cause service disruptions or data loss.
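In kubectl syntax, a taint is written as key=value:effect, so the taint above can be applied as sketched below. The function name is hypothetical.

```shell
# Hypothetical helper: marks a node for forced deletion by adding the taint
# goatscaler.io/force-to-delete with value "true" and effect NoSchedule.
# The node is deleted without draining, so running pods are lost.
force_delete_node() {
  local node="$1"
  kubectl taint nodes "$node" goatscaler.io/force-to-delete=true:NoSchedule
}

# Usage (requires access to the cluster):
#   force_delete_node <nodename>
```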
How to prevent a node from being scaled in by node instant scaling?
You can prevent a node from being scaled in by the node instant scaling add-on by adding the "goatscaler.io/scale-down-disabled": "true" annotation to the node. Use the following command to add the annotation:
kubectl annotate node <nodename> goatscaler.io/scale-down-disabled=true
Can node instant scaling be configured to scale in only empty nodes?
Yes. You can configure this behavior at either the node level or the cluster level. If settings are configured at both levels, the node-level setting takes precedence.
-
Node-level configuration: Add the goatscaler.io/scale-down-only-empty:true label to a node to enable this behavior, or the goatscaler.io/scale-down-only-empty:false label to disable it.
-
Cluster-level configuration: In the ACK console, go to the Add-ons page. Find the ACK GOATScaler add-on and set the ScaleDownOnlyEmptyNodes parameter to true or false.
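The precedence between the two levels can be illustrated as follows. This is a sketch of the rule as stated above, not the add-on's code. The function name is hypothetical, and an empty first argument stands for a node without the label.

```shell
# Hypothetical sketch of the precedence rule: a node-level label, when
# present, overrides the cluster-level ScaleDownOnlyEmptyNodes parameter.
# $1: value of the node label ("" if the label is not set)
# $2: value of the cluster-level parameter
effective_scale_down_only_empty() {
  local node_label="$1" cluster_param="$2"
  if [ -n "$node_label" ]; then
    echo "$node_label"
  else
    echo "$cluster_param"
  fi
}

effective_scale_down_only_empty "false" "true"   # node label set: prints false
effective_scale_down_only_empty "" "true"        # no node label: prints true
```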
Node instant scaling add-on
Are there any operations that will trigger an automatic update of the Node Instant Scaling add-on?
No. Except for system maintenance and platform upgrades, ACK does not automatically update the ACK GOATScaler add-on. You must manually upgrade it from the Add-ons page in the ACK console.
Why do node scaling activities still fail after an ACK managed cluster has been granted role permissions?
This issue occurs if the required token (addon.aliyuncsmanagedautoscalerrole.token) is missing from a Secret in the kube-system namespace. By default, ACK uses the cluster's WorkerRole to enable auto scaling capabilities, and this token is essential for authenticating those operations. To resolve this, re-apply the required policy to the cluster's WorkerRole by using the authorization wizard in the ACK console.
On the ACK Clusters page, click the name of your cluster. In the left navigation pane, click .
On the Node Pools page, to the right of Node Scaling, click Enable.
Follow the on-screen instructions to grant the required permissions to the KubernetesWorkerRole role and attach the AliyunCSManagedAutoScalerRolePolicy system policy.
Manually restart the cluster-autoscaler Deployment (for node auto scaling) or the ack-goatscaler Deployment (for node instant scaling) in the kube-system namespace to apply the changes immediately.
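The restart in the last step can be performed with kubectl rollout restart. The wrapper below is a minimal sketch with a hypothetical function name. It uses the Deployment names and namespace given in this topic.

```shell
# Hypothetical helper: restarts a scaling add-on Deployment in kube-system
# and waits for the rollout to finish.
# $1: Deployment name, either cluster-autoscaler or ack-goatscaler
restart_scaler() {
  local deployment="$1"
  kubectl rollout restart "deployment/$deployment" --namespace kube-system
  kubectl rollout status "deployment/$deployment" --namespace kube-system
}

# Usage (requires access to the cluster):
#   restart_scaler ack-goatscaler
```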