Node instant scaling FAQ and solutions - Container Service for Kubernetes (ACK) - Alibaba Cloud Help Center

This topic answers common questions about the node instant scaling feature and provides solutions to related issues.

Index

Category	Subcategory	Link
Node instant scaling behavior	Known limitations
	Scale-out behavior	What resource types does node instant scaling use for scaling simulations?
		Does node instant scaling select an appropriate instance type from a node pool based on pod resource requests?
		How does node instant scaling select an instance type from a node pool with multiple types?
		How do I monitor the inventory of instance types in a node pool when using node instant scaling?
		How can I optimize the node pool configuration to avoid scale-out failures from insufficient inventory?
		Why does node instant scaling fail to scale out nodes?
		How do I configure custom resources for a node pool with node instant scaling enabled?
	Scale-in behavior	Why does node instant scaling fail to scale in nodes?
		What types of pods can prevent node instant scaling from removing a node?
Custom scaling behavior	Controlling scaling with pods	How can I control node scale-in by using pods?
	Controlling scaling with nodes	How do I specify which node to delete during a scale-in?
		How do I prevent a specific node from being scaled in?
		Can node instant scaling be configured to scale in only empty nodes?
Node instant scaling add-on		Does the node instant scaling add-on update automatically?
		Why does node scaling fail on my ACK managed cluster after I granted the required permissions?

Known limitations

Feature limitations

Node instant scaling does not support the swift mode.
A node pool can scale out at most 180 nodes per batch.
Scale-in cannot be disabled at the cluster level.

Note
To disable scale-in for a specific node, see How to prevent a node from being scaled in by node instant scaling?
Node instant scaling does not check the inventory of preemptible instances. If the node pool's Billing Method is set to preemptible instances and Use Pay-as-you-go Instances When Spot Instances Are Insufficient is enabled, pay-as-you-go instances may be provisioned even when sufficient preemptible instances are available.

Resource prediction limitations

The available resources on a newly provisioned node may not precisely match the instance type's specifications because the underlying ECS system consumes some resources. For more information, see Why is the memory size of a purchased instance different from the memory size defined in its instance type? As a result, the resource estimates from the node instant scaling add-on may be higher than the actual available resources on a node, making precise prediction difficult. When you configure pod requests, consider the following points.

When you configure pod requests, the total requested resources, including CPU, memory, and disk, must be less than the instance type's capacity. We recommend that the total requests do not exceed 70% of the node's resources.
When the node instant scaling add-on evaluates whether a node has sufficient resources, it considers only Kubernetes pods such as pending pods and DaemonSet pods. If the node runs static pods that are not part of a DaemonSet, you must manually reserve resources for them.
If a pod requests a large amount of resources, such as more than 70% of a node's resources, you must test in advance to confirm that the pod can be scheduled on a node of the same instance type.
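The 70% recommendation above can be checked before you deploy a workload. The following is a minimal sketch, not part of the add-on: the `Resources` type, the function name, and the example instance sizes are all hypothetical, and the check compares requests against the advertised instance specification only.

```python
from dataclasses import dataclass

# Recommended ceiling from the guidance above: total pod requests should not
# exceed 70% of the instance type's advertised resources.
RECOMMENDED_PERCENT = 70


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for the three resource kinds named in the text."""
    cpu_millicores: int
    memory_mib: int
    disk_mib: int


def within_recommended_headroom(requests: Resources, instance: Resources,
                                percent: int = RECOMMENDED_PERCENT) -> bool:
    """Return True if every request is at most `percent`% of the instance spec."""
    pairs = [
        (requests.cpu_millicores, instance.cpu_millicores),
        (requests.memory_mib, instance.memory_mib),
        (requests.disk_mib, instance.disk_mib),
    ]
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return all(req * 100 <= cap * percent for req, cap in pairs)


# Hypothetical 4-core, 8 GiB instance type with a 100 GiB disk.
instance = Resources(cpu_millicores=4000, memory_mib=8192, disk_mib=102400)
small_pod = Resources(cpu_millicores=2000, memory_mib=4096, disk_mib=20480)
large_pod = Resources(cpu_millicores=3500, memory_mib=4096, disk_mib=20480)

print(within_recommended_headroom(small_pod, instance))  # True
print(within_recommended_headroom(large_pod, instance))  # False
```

A pod that fails this check, such as `large_pod`, is not necessarily unschedulable, but it is a case to test in advance on a real node of the same instance type.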

Limited simulatable resource types

The node instant scaling add-on supports only a limited number of resource types for scaling simulations. For more information, see What resource types does node instant scaling use for scaling simulations?.

Storage constraints are invisible to the autoscaler

The autoscaler has no awareness of pod-level storage constraints, such as:

Needing to run in a specific availability zone to access a persistent volume (PV).
Requiring a node that supports a specific disk type (such as ESSD).

Solution: Configure a dedicated node pool for applications with storage dependencies before enabling Auto Scaling. Preset the availability zone, instance type, and disk type in the node pool configuration to ensure newly provisioned nodes meet the pod's storage requirements.

Also make sure your pods do not reference a PVC in a Terminating state. A pod that cannot schedule because its PVC is terminating will fail continuously, which can cause the cluster-autoscaler to make incorrect scale-out or scale-in decisions (for example, evicting the pod).

Scale-out behavior

What resource types does node instant scaling use for scaling simulations?

Scaling simulations use the following resource types:

cpu
memory
ephemeral-storage 
aliyun.com/gpu-mem # Only shared GPUs are supported.
nvidia.com/gpu

Does node instant scaling select an appropriate instance type from a node pool based on pod resource requests?

Yes. The autoscaler intelligently selects the most resource-efficient instance type that can satisfy the pod's requirements. For example, consider a node pool configured with two instance types: a smaller 4-core, 8 GB type and a larger 12-core, 48 GB type. If a pod requests 2 cores, node instant scaling prioritizes creating a 4-core, 8 GB node for the pod. If you later replace the 4-core, 8 GB type with an 8-core, 16 GB type, node instant scaling automatically adapts and places new pods with similar requests on 8-core, 16 GB nodes.

How does node instant scaling select an instance type by default when a node pool contains multiple instance types?

Based on the instance types configured for the node pool, node instant scaling periodically excludes instance types with insufficient inventory, then sorts the remaining instance types by the number of CPU cores, and checks them one by one to determine if they can satisfy the resource requests of unschedulable pods. Once a suitable instance type is found, node instant scaling selects this instance type and stops checking the remaining ones.
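The selection order described above can be sketched as follows. This is an illustration only, not the add-on's implementation: the type names, the `in_stock` flag (standing in for the periodic inventory check), and the ascending sort order are assumptions, and the fit test compares raw specifications without accounting for node resource reservations.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceType:
    name: str         # hypothetical name, for illustration only
    cpu_cores: int
    memory_gib: int
    in_stock: bool    # stands in for the periodic inventory check


@dataclass(frozen=True)
class PodRequest:
    cpu_cores: float
    memory_gib: float


def select_instance_type(types: Sequence[InstanceType],
                         pod: PodRequest) -> Optional[InstanceType]:
    """Return the first in-stock type, ordered by CPU cores, that fits the pod."""
    # Step 1: exclude instance types with insufficient inventory.
    candidates = [t for t in types if t.in_stock]
    # Step 2: sort by CPU core count (ascending order is an assumption).
    candidates.sort(key=lambda t: t.cpu_cores)
    # Step 3: check one by one and stop at the first type that fits.
    for t in candidates:
        if t.cpu_cores >= pod.cpu_cores and t.memory_gib >= pod.memory_gib:
            return t
    return None


pool = [
    InstanceType("large-12c48g", cpu_cores=12, memory_gib=48, in_stock=True),
    InstanceType("small-4c8g", cpu_cores=4, memory_gib=8, in_stock=True),
]
chosen = select_instance_type(pool, PodRequest(cpu_cores=2, memory_gib=4))
print(chosen.name if chosen else None)  # small-4c8g
```

If the smaller type were marked out of stock, the same call would fall through to the 12-core type; a request that no configured type can satisfy returns `None`.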

How can I track real-time changes to the instance type inventory in a node pool when using node instant scaling?

Node instant scaling provides health metrics that are periodically updated to reflect inventory changes for instance types in an auto-scaling node pool. When the inventory status of an instance type changes, node instant scaling sends a Kubernetes event named InstanceInventoryStatusChanged. You can subscribe to these events to monitor the inventory health of your node pool, assess whether the current inventory is sufficient, and adjust your instance type configuration in advance. For more information, see View the health status of node instant scaling.

Preventing scale-out failures

We recommend that you use the following configurations to expand the range of available instance types:

Configure multiple instance types for the node pool or use a generic configuration.
Configure multiple zones for the node pool.

Why does node instant scaling fail to launch a node?

Check for the following issues:

The instance types configured for the node pool have insufficient inventory.
The instance types configured for the node pool cannot meet the pod's resource requests. The advertised specifications for an ECS instance type differ from the actual resources available after provisioning. The following resource reservations must be considered at runtime:
- During instance creation, some resources are consumed by virtualization and the operating system. For more information, see Why is the memory size of a purchased instance different from the memory size defined in its instance type?
- ACK requires a certain amount of node resources to run Kubernetes add-ons and system processes, such as kubelet, kube-proxy, Terway, and the container runtime. For more information about the reservation policy, see Node resource reservation policy.
- By default, system add-ons are installed on nodes. The resources requested by a pod must be less than the instance specification.
You have not completed the authorization as described in Enable instant elasticity for nodes.
The node pool fails to scale out new instances due to underlying infrastructure issues.

To ensure the accuracy of subsequent scaling operations and system stability, the node instant scaling add-on pauses scaling operations until you resolve the issues with abnormal nodes.

How do I configure custom resources for a node pool that has node instant scaling enabled?

You can add an ECS tag with a fixed prefix to a node pool that has node instant scaling enabled. The tag lets the scaling add-on identify the custom resources available in the node pool, or the exact values of specified resources.

Note

The node instant scaling add-on, ACK GOATScaler, must be v0.2.18 or later. To upgrade the add-on, see Manage add-ons.

goatscaler.io/node-template/resource/{resource-name}:{resource-size}

Example:

goatscaler.io/node-template/resource/hugepages-1Gi:2Gi
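To check a tag against this format before applying it, a small parser can help. The sketch below is a hypothetical helper, not part of ACK GOATScaler; it assumes the tag is written exactly as shown above, with the resource size following the last colon.

```python
from typing import Optional, Tuple

# Fixed prefix taken from the tag format above.
TAG_PREFIX = "goatscaler.io/node-template/resource/"


def parse_resource_tag(tag: str) -> Optional[Tuple[str, str]]:
    """Split '<prefix>{resource-name}:{resource-size}' into (name, size).

    Returns None if the prefix is missing or either part is empty.
    """
    if not tag.startswith(TAG_PREFIX):
        return None
    # Assumption: the size never contains a colon, so split on the last one.
    name, sep, size = tag[len(TAG_PREFIX):].rpartition(":")
    if not sep or not name or not size:
        return None
    return name, size


print(parse_resource_tag("goatscaler.io/node-template/resource/hugepages-1Gi:2Gi"))
# ('hugepages-1Gi', '2Gi')
```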

Scale-in behavior

Why does node instant scaling fail to scale in nodes?

Check for the following issues:

The option to scale in only empty nodes is enabled, but the target node is not empty.
The total resource requests of pods on the node exceed the configured scale-in utilization threshold.
The node is running pods from the kube-system namespace.
A pod on the node has strict scheduling requirements that prevent it from being rescheduled to other nodes.
A pod on the node is protected by a PodDisruptionBudget (PDB), and removing the node would violate the PDB.
The node was added recently. Node instant scaling does not scale in a node during the first 10 minutes after it is added.
An offline node exists. An offline node is a running instance that does not have a corresponding node object in the cluster. Starting from v0.5.3, the node instant scaling add-on can automatically clean up these instances. For earlier versions, you must remove them manually.
Version 0.5.3 is in canary release. To request access, submit a ticket. For information about how to upgrade the add-on, see Manage add-ons.
You can check for offline nodes on the Node Pools page. Click Sync Node Pool, then click Details for the relevant node pool, and view the Nodes tab.

What types of pods can prevent node instant scaling from removing a node?

If a pod is not created by a native Kubernetes controller, such as a Deployment, ReplicaSet, Job, or StatefulSet, or if pods on a node cannot be safely terminated or migrated, node instant scaling may be unable to remove the node.

Controlling scaling with pods

How can I control node scale-in by using pods?

You can use the goatscaler.io/safe-to-evict pod annotation to control whether node instant scaling can scale in the node where the pod is running.

To prevent a node from being scaled in, add the annotation "goatscaler.io/safe-to-evict": "false" to a pod on that node.
To allow a node to be scaled in, add the annotation "goatscaler.io/safe-to-evict": "true" to a pod on that node.
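The blocking case can be illustrated with a short check over the pods on a node. This is a sketch under stated assumptions, not the add-on's logic: pods are represented as plain dictionaries of annotations, the function name is hypothetical, and only the "false" value is modeled because the text does not specify how "true" interacts with other scale-in conditions.

```python
from typing import Iterable, Mapping

# Annotation key taken from the text above.
SAFE_TO_EVICT = "goatscaler.io/safe-to-evict"


def annotation_blocks_scale_in(pod_annotations: Iterable[Mapping[str, str]]) -> bool:
    """Return True if any pod on the node is annotated safe-to-evict=false."""
    return any(a.get(SAFE_TO_EVICT) == "false" for a in pod_annotations)


# Hypothetical node with three pods: one protected, one evictable, one unannotated.
pods_on_node = [
    {SAFE_TO_EVICT: "false"},
    {SAFE_TO_EVICT: "true"},
    {},
]
print(annotation_blocks_scale_in(pods_on_node))  # True
```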

Controlling scaling with nodes

How do I specify which node to delete during a scale-in?

To force the deletion of a node, add the goatscaler.io/force-to-delete:true:NoSchedule taint to the node. Node instant scaling then deletes the node immediately, without checking its status or draining its pods. Use this feature with caution because it can cause service disruptions or data loss.

How to prevent a node from being scaled in by node instant scaling?

You can prevent a node from being scaled in by the node instant scaling add-on by adding the "goatscaler.io/scale-down-disabled": "true" annotation to the node. Use the following command to add the annotation:

kubectl annotate node <nodename> goatscaler.io/scale-down-disabled=true

Can node instant scaling be configured to scale in only empty nodes?

You can configure this behavior at either the node level or the cluster level. If settings are configured at both levels, the node-level setting takes precedence.

Node-level configuration: Add the goatscaler.io/scale-down-only-empty:true label to a node to enable this behavior, or the goatscaler.io/scale-down-only-empty:false label to disable it.
Cluster-level configuration: In the ACK console, go to the Add-ons page. Find the ACK GOATScaler add-on and configure the ScaleDownOnlyEmptyNodes parameter to true or false.
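The precedence rule can be expressed as a small resolution function. This is a sketch of the rule as described here, not the add-on's code: the function name is hypothetical, and treating a missing or unrecognized label value as "fall back to the cluster setting" is an assumption.

```python
from typing import Mapping

# Label key taken from the node-level configuration above.
LABEL_KEY = "goatscaler.io/scale-down-only-empty"


def scale_down_only_empty(node_labels: Mapping[str, str],
                          cluster_setting: bool) -> bool:
    """Resolve the effective setting for one node.

    `cluster_setting` stands for the ScaleDownOnlyEmptyNodes add-on parameter.
    """
    value = node_labels.get(LABEL_KEY)
    if value == "true":
        return True    # node-level label takes precedence
    if value == "false":
        return False   # node-level label takes precedence
    # Label absent or unrecognized (assumption): use the cluster-level value.
    return cluster_setting


print(scale_down_only_empty({LABEL_KEY: "false"}, cluster_setting=True))  # False
print(scale_down_only_empty({}, cluster_setting=True))                    # True
```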

Node instant scaling add-on

Does the node instant scaling add-on update automatically?

No. Except for system maintenance and platform upgrades, ACK does not automatically update the ACK GOATScaler add-on. You must manually upgrade it from the Add-ons page in the ACK console.

Why does node scaling fail on my ACK managed cluster after I granted the required permissions?

This usually means the token addon.aliyuncsmanagedautoscalerrole.token is missing from a Secret in the kube-system namespace. ACK uses the cluster's Worker Role to enable auto scaling, and this token is required for authentication.

Re-apply the required policy to the Worker Role using the ACK console:

On the ACK Clusters page, click the name of your cluster. In the left navigation pane, click Nodes > Node Pools.
On the Node Pools page, click Enable to the right of Node Scaling.
Follow the on-screen instructions to authorize the KubernetesWorkerRole and attach the AliyunCSManagedAutoScalerRolePolicy system policy.
In the Node Scaling Configuration dialog box, after the pre-check passes, click the role name link in the prompt area to go to Resource Access Management (RAM) to complete the authorization.
Manually restart the cluster-autoscaler Deployment (node auto scaling) or the ack-goatscaler Deployment (node instant scaling) in the kube-system namespace for the permissions to take effect immediately.