For large clusters or latency-sensitive workloads where node auto scaling is insufficient, enable node instant scaling for faster node provisioning and lower costs.
Before you begin
Review node scaling for these concepts:
- How node instant scaling works
- Benefits and supported use cases
- Usage notes

To avoid unexpected charges, use pay-as-you-go instances. During scale-in, subscription instances are removed from the cluster but are not released from your account.
Prerequisites and limitations
- An ACK managed cluster or ACK dedicated cluster running Kubernetes 1.24 or later. See Manually upgrade a cluster.
- Auto Scaling is activated.
- The vSwitch for the node instant scaling node pool must have sufficient IP addresses. Call DescribeVSwitchAttributes to check available IPs. If IP addresses are insufficient, see Expand cluster IP capacity by adding a secondary CIDR block.
- Node instant scaling works only with Standard Scaling Mode node pools. Swift mode is not supported.
- With ACK GOATScaler v0.5.2 or earlier, manually remove offline nodes. See FAQ.
- In an ACK dedicated cluster, nodes must have sufficient resources to deploy or update ACK GOATScaler; otherwise, scaling may fail.
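The version and IP address prerequisites above can be checked before you enable the feature. The following is a minimal sketch, not part of any ACK tooling: the function names and the min_free_ips parameter are hypothetical, and the sketch assumes you have already obtained the Kubernetes version, the ACK GOATScaler version, and the available IP count (for example, from a DescribeVSwitchAttributes call) by your own means.

```python
import re


def parse_version(text):
    """Extract the leading numeric components, e.g. 'v0.5.2' -> (0, 5, 2)."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def preflight(k8s_version, goatscaler_version, available_ips, min_free_ips=1):
    """Return a list of warnings; an empty list means no issue was found.

    min_free_ips is a hypothetical safety margin chosen by the caller.
    """
    warnings = []
    # Only the major and minor components matter for the 1.24 requirement.
    if parse_version(k8s_version)[:2] < (1, 24):
        warnings.append("Kubernetes 1.24 or later is required; upgrade the cluster.")
    if parse_version(goatscaler_version) <= (0, 5, 2):
        warnings.append("ACK GOATScaler v0.5.2 or earlier: remove offline nodes manually.")
    if available_ips < min_free_ips:
        warnings.append("The vSwitch has too few free IP addresses; add a secondary CIDR block.")
    return warnings


if __name__ == "__main__":
    for line in preflight("1.22.3", "v0.5.2", available_ips=0):
        print(line)
```

The sketch ignores build suffixes such as -aliyun when comparing versions, which is a simplification.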
Step 1: Enable node instant scaling
To use node instant scaling, enable cluster auto scaling on the Node Pools page.
- Log on to the ACK console. In the left navigation pane, click Clusters.
- On the Clusters page, click the name of your cluster. In the left navigation pane, click .
- On the Node Pools page, click Enable next to Node Scaling.
- If prompted, activate Auto Scaling and grant the required permissions.
  - ACK managed cluster: Authorize the AliyunCSManagedAutoScalerRole role.
  - ACK dedicated cluster: Authorize the KubernetesWorkerRole role and attach the AliyunCSManagedAutoScalerRolePolicy. In the Node Scaling Configuration dialog box, after the precheck passes, click the RAM role link (such as KubernetesWorkerRole-xxxx) to complete authorization in the RAM console.
- On the Node Scaling Configuration page, set Node Scaling Plan to Instant Scaling, configure the scaling parameters, and click OK.
The scaling component automatically triggers node scale-outs based on scheduling conditions.
You can switch the Node Scaling Plan to Auto Scaling at any time by following the on-screen prompts. This feature is in beta. To participate, submit a ticket.
| Parameter | Description |
| --- | --- |
| Scale-in Threshold | The ratio of resource requests to the total resource capacity of a node in a node pool with node auto scaling enabled. A node is eligible for scale-in only if its CPU and memory utilization are both below the Scale-in Threshold. |
| GPU Scale-in Threshold | The scale-in threshold for GPU instances. A GPU instance is eligible for scale-in only when its CPU, memory, and GPU utilization all fall below the configured GPU Scale-in Threshold. |
| Scale-in Trigger Delay | The delay between when a node becomes eligible for scale-in and when the scale-in operation is performed. Unit: minutes. Default value: 10 minutes. Important: The scaling component can perform a node scale-in only after the Scale-in Threshold condition is met and the Scale-in Trigger Delay duration has passed. |
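The way the three parameters combine can be illustrated with a short sketch. This is an illustration of the rules described above, not the ACK GOATScaler implementation: the NodeUsage type and function names are hypothetical, and thresholds are assumed to be fractions between 0 and 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeUsage:
    """Requests divided by capacity for one node (hypothetical type)."""
    cpu_ratio: float
    memory_ratio: float
    gpu_ratio: Optional[float] = None  # None for nodes without GPUs


def below_threshold(node, scale_in_threshold, gpu_scale_in_threshold):
    """Apply the GPU threshold to GPU nodes and the regular one otherwise."""
    if node.gpu_ratio is not None:
        limit = gpu_scale_in_threshold
        return (node.cpu_ratio < limit
                and node.memory_ratio < limit
                and node.gpu_ratio < limit)
    return (node.cpu_ratio < scale_in_threshold
            and node.memory_ratio < scale_in_threshold)


def can_scale_in(node, minutes_below, scale_in_threshold,
                 gpu_scale_in_threshold, trigger_delay_minutes=10):
    """A node can be removed only when it is below the threshold AND
    has stayed eligible for at least the trigger delay."""
    return (below_threshold(node, scale_in_threshold, gpu_scale_in_threshold)
            and minutes_below >= trigger_delay_minutes)


if __name__ == "__main__":
    idle = NodeUsage(cpu_ratio=0.2, memory_ratio=0.3)
    print(can_scale_in(idle, minutes_below=12,
                       scale_in_threshold=0.5, gpu_scale_in_threshold=0.3))
```

In this sketch, a node with low CPU requests but high memory requests is never removed, because both ratios must fall below the threshold.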
Step 2: Configure a node pool for auto scaling
Node instant scaling applies only to node pools with auto scaling enabled.
- Create a node pool and set Scaling Mode to Auto.
- Edit an existing node pool to change its Scaling Mode to Auto.
Recommended: Configure multiple instance types and availability zones to ensure sufficient capacity during scale-outs.
(Optional) Step 3: Verify the configuration
Verify that auto scaling is active and ACK GOATScaler is installed.
Verify node pool status
On the Node Pools page, ensure your node pool displays the Auto Scaling tag.
Verify add-on installation
- On the Clusters page, click the name of your cluster. In the left navigation pane, click Components and Add-ons.
- On the Add-ons page, ensure the ACK GOATScaler add-on status is Installed.
Node instant scaling key events
The node instant scaling feature generates these events:
| Event name | Object | Description |
| --- | --- | --- |
| ProvisionNode | | Node scale-out triggered successfully. |
| ProvisionNodeFailed | | Node scale-out failed to trigger. |
| ResetPod | | An unschedulable pod that triggered a scale-out is requeued for retry. |
| InstanceInventoryStatusChanged | | Emitted when available inventory changes for configured instance types. |
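To get a quick picture of scaling activity, you can count these events in a list of cluster events that you have already exported. The sketch below assumes each event is a dictionary whose reason field holds the event name, as in standard Kubernetes Event objects; whether ACK GOATScaler reports the names in that field is an assumption, and summarize_events is a hypothetical helper.

```python
from collections import Counter

# Event names listed in the table above.
INSTANT_SCALING_EVENTS = {
    "ProvisionNode",
    "ProvisionNodeFailed",
    "ResetPod",
    "InstanceInventoryStatusChanged",
}


def summarize_events(events):
    """Count node instant scaling events by name, ignoring all others.

    Assumes each event is a dict with a 'reason' key (an assumption
    based on the standard Kubernetes Event shape).
    """
    return Counter(
        event["reason"]
        for event in events
        if event.get("reason") in INSTANT_SCALING_EVENTS
    )


if __name__ == "__main__":
    sample = [
        {"reason": "ProvisionNode"},
        {"reason": "ProvisionNodeFailed"},
        {"reason": "ProvisionNode"},
        {"reason": "Scheduled"},  # unrelated event, not counted
    ]
    for name, count in summarize_events(sample).items():
        print(name, count)
```

A rising ProvisionNodeFailed count relative to ProvisionNode may be a signal to review instance type and zone configuration.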
Node instant scaling identifiers
These identifiers are system-managed. Modifying them may cause unexpected scaling behavior.
Node labels
- goatscaler.io/managed:true or k8s.aliyun.com:true: Identifies nodes managed by node instant scaling. Used to evaluate scale-in conditions.
- goatscaler.io/provision-task-id:{task-id}: The scale-out task ID. Used for tracing.
Node taints
- goatscaler.io/node-terminating: Nodes with this taint are marked for scale-in.
Pod annotations
- goatscaler.io/provision-task-id: The scale-out task ID for this pod. The system waits for the node to start before triggering more scale-outs.
- goatscaler.io/reschedule-deadline: Pod scheduling timeout. If still unschedulable past this deadline, the pod is requeued and can trigger another scale-out.
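These identifiers can be read (but should not be modified) to see which nodes are under node instant scaling and which are being removed. The sketch below works on a node object already loaded as a dictionary in the standard Kubernetes shape (metadata.labels and spec.taints); the function names and the returned status strings are hypothetical.

```python
MANAGED_LABELS = ("goatscaler.io/managed", "k8s.aliyun.com")
TERMINATING_TAINT = "goatscaler.io/node-terminating"


def is_managed(labels):
    """True if either managed label from the list above is set to 'true'."""
    return any(labels.get(key) == "true" for key in MANAGED_LABELS)


def is_terminating(taints):
    """True if the node carries the scale-in taint."""
    return any(taint.get("key") == TERMINATING_TAINT for taint in taints)


def classify_node(node):
    """Return 'unmanaged', 'terminating', or 'managed' for a node dict."""
    labels = node.get("metadata", {}).get("labels") or {}
    taints = node.get("spec", {}).get("taints") or []
    if not is_managed(labels):
        return "unmanaged"
    return "terminating" if is_terminating(taints) else "managed"


if __name__ == "__main__":
    node = {
        "metadata": {"labels": {"goatscaler.io/managed": "true"}},
        "spec": {"taints": [{"key": TERMINATING_TAINT, "effect": "NoSchedule"}]},
    }
    print(classify_node(node))
```

The taint effect shown in the usage example is illustrative only; the source does not state which effect the taint uses.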
FAQ
Related operations
View health status
The node instant scaling feature selects instance types and availability zones based on ECS inventory. Check the node pool ConfigMap to monitor inventory health and get instance type recommendations.
Enable log collection
For ACK managed clusters, collect ACK GOATScaler logs from the Control Plane Component Logs page.
- Log on to the ACK console. In the left navigation pane, click Clusters.
- On the Clusters page, click the name of your cluster. In the left navigation pane, click .
- On the Control Plane Component Logs tab, click Update Component and select ACK GOATScaler.
- In the Update Component dialog box, select ack-goatscaler and click OK. After the update, select ACK GOATScaler from the drop-down list to view logs.
Upgrade ACK GOATScaler
Keep ACK GOATScaler updated for the latest features. See Manage components.
Skip inventory checks for private pools
If you use a private pool for guaranteed capacity, enable SkipInventoryCheck to let ACK GOATScaler bypass inventory checks and use private pool resources directly.
- On the Clusters page, click the name of your cluster. In the left navigation pane, click Components and Add-ons.
- On the Core Component page, find ACK GOATScaler and click Configuration. Requires ACK GOATScaler v0.3.0-582e405-aliyun or later.
- Set SkipInventoryCheck to true.