Enable node instant scaling

For large clusters or latency-sensitive workloads where node auto scaling is insufficient, enable node instant scaling for faster node provisioning and lower costs.

Before you begin

Review node scaling for these concepts:

  • How node instant scaling works

  • Benefits and supported use cases

  • Usage notes

  • To avoid unexpected charges, use pay-as-you-go instances. During scale-in, subscription instances are removed from the cluster but are not released from your account.

Prerequisites and limitations

  • An ACK managed cluster or ACK dedicated cluster running Kubernetes 1.24 or later. See Manually upgrade a cluster.

  • Auto Scaling is activated.

  • The vSwitch for the node instant scaling node pool must have sufficient IP addresses. Call DescribeVSwitchAttributes to check available IPs.

    If IP addresses are insufficient, see Expand cluster IP capacity by adding a secondary CIDR block.

  • Node instant scaling works only with Standard Scaling Mode node pools. Swift mode is not supported.

  • If you use ACK GOATScaler v0.5.2 or earlier, you must manually remove offline nodes. See FAQ.

  • In an ACK dedicated cluster, nodes must have sufficient resources to deploy or update ACK GOATScaler; otherwise, scaling may fail.
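As a rough illustration of the vSwitch IP check mentioned above, the following sketch compares the free address count of a vSwitch against a planned scale-out. It assumes you have already called DescribeVSwitchAttributes and parsed the JSON response into a dict. The `AvailableIpAddressCount` field is read from that response; `has_enough_ips`, `planned_nodes`, and `ips_per_node` are hypothetical names used only for this example. How many addresses each node consumes depends on your network plug-in, so `ips_per_node` is an assumption you need to adjust.

```python
def has_enough_ips(vswitch_attrs: dict, planned_nodes: int, ips_per_node: int = 1) -> bool:
    """Return True if the vSwitch has enough free IPs for the planned scale-out.

    vswitch_attrs: parsed DescribeVSwitchAttributes response (assumed shape).
    ips_per_node:  assumed number of addresses one node consumes.
    """
    available = int(vswitch_attrs["AvailableIpAddressCount"])
    required = planned_nodes * ips_per_node
    return available >= required


# Hand-written response fragment for illustration, not real API output.
sample = {"VSwitchId": "vsw-example", "AvailableIpAddressCount": 120}
print(has_enough_ips(sample, planned_nodes=100))
print(has_enough_ips(sample, planned_nodes=50, ips_per_node=3))
```

If the check fails, expand the IP capacity as described above before you enable scaling for the node pool.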

Step 1: Enable node instant scaling

To use node instant scaling, enable cluster auto scaling on the Node Pools page.

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Nodes > Node Pools.

  3. On the Node Pools page, click Enable next to Node Scaling.

  4. If prompted, activate Auto Scaling and grant the required permissions.

    • ACK managed cluster: Authorize the AliyunCSManagedAutoScalerRole role.

    • ACK dedicated cluster: Authorize the KubernetesWorkerRole role and attach the AliyunCSManagedAutoScalerRolePolicy.

      In the Node Scaling Configuration dialog box, after the precheck passes, click the RAM role link (such as KubernetesWorkerRole-xxxx) to complete authorization in the RAM console.

  5. On the Node Scaling Configuration page, set Node Scaling Plan to Instant Scaling, configure the scaling parameters, and click OK.

    The scaling component automatically triggers node scale-outs based on scheduling conditions.

    You can switch the Node Scaling Plan to Auto Scaling at any time by following the on-screen prompts. This feature is in beta. To participate, submit a ticket.

    The scaling parameters are described below.

    • Scale-in Threshold: The ratio of resource requests to the total resource capacity of a node in a node pool with node auto scaling enabled. A node is eligible for scale-in only if its CPU and memory resource utilization are both below the Scale-in Threshold.

    • GPU Scale-in Threshold: The scale-in threshold for GPU instances. A GPU instance is eligible for scale-in only when its CPU, memory, and GPU utilization all fall below the configured GPU Scale-in Threshold.

    • Scale-in Trigger Delay: The delay between when a node becomes eligible for scale-in and when the scale-in operation is performed. Unit: minutes. Default value: 10 minutes.

      Important: The scaling component can perform a node scale-in only after the Scale-in Threshold condition is met and the Scale-in Trigger Delay duration has passed.

    Advanced configurations

    • Pod Termination Timeout: Maximum wait time for pod termination during scale-in. Unit: seconds. If a pod is not evicted before the timeout, the node is not released.

    • Minimum Pod Replicas: Scale-in protection threshold. Nodes with ReplicationController or ReplicaSet pods are not scaled in if the replica count falls below this value. Applies only to ReplicationController and ReplicaSet pods, not StatefulSet or DaemonSet.

    • Enable DaemonSet Pod Eviction: When enabled, DaemonSet pods are evicted when their node is scaled in.

    • Skip nodes with pods in the kube-system namespace: When enabled, nodes with kube-system namespace pods are excluded from scale-in.

      Note: This does not apply to DaemonSet pods or mirror pods.
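The way the Scale-in Threshold, GPU Scale-in Threshold, and Scale-in Trigger Delay parameters in step 5 combine can be sketched as follows. This is a simplified model of the rules described above, not the ACK GOATScaler implementation. The `NodeRequests` class and both function names are hypothetical, and the threshold values in the example are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class NodeRequests:
    """Requested vs. total resources on one node (hypothetical shape)."""
    cpu_requested: float
    cpu_capacity: float
    mem_requested: float
    mem_capacity: float
    gpu_requested: float = 0.0
    gpu_capacity: float = 0.0  # 0 means the node has no GPUs


def below_threshold(node: NodeRequests, threshold: float, gpu_threshold: float) -> bool:
    """CPU and memory (and GPU, on GPU nodes) must all be below the limit."""
    ratios = [node.cpu_requested / node.cpu_capacity,
              node.mem_requested / node.mem_capacity]
    limit = threshold
    if node.gpu_capacity > 0:
        # GPU nodes are compared against the GPU Scale-in Threshold instead.
        ratios.append(node.gpu_requested / node.gpu_capacity)
        limit = gpu_threshold
    return all(r < limit for r in ratios)


def can_scale_in(node: NodeRequests, low_since: Optional[datetime], now: datetime,
                 threshold: float, gpu_threshold: float,
                 delay: timedelta = timedelta(minutes=10)) -> bool:
    """Both conditions must hold: below threshold AND the delay has elapsed."""
    if low_since is None or not below_threshold(node, threshold, gpu_threshold):
        return False
    return now - low_since >= delay


now = datetime(2024, 1, 1, 12, 0)
node = NodeRequests(cpu_requested=1.0, cpu_capacity=4.0,
                    mem_requested=2.0, mem_capacity=16.0)
# Below threshold for only 5 minutes: the 10-minute delay has not passed.
print(can_scale_in(node, now - timedelta(minutes=5), now, 0.5, 0.3))
# Below threshold for 15 minutes: eligible.
print(can_scale_in(node, now - timedelta(minutes=15), now, 0.5, 0.3))
```

The sketch ignores the advanced protections (Minimum Pod Replicas, kube-system pods, and Pod Termination Timeout), which can still prevent a node from being removed.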

Step 2: Configure a node pool for auto scaling

Node instant scaling applies only to node pools with auto scaling enabled.

Recommended: Configure multiple instance types and availability zones to ensure sufficient capacity during scale-outs.

(Optional) Step 3: Verify the configuration

Verify that auto scaling is active and ACK GOATScaler is installed.

Verify node pool status

On the Node Pools page, ensure your node pool displays the Auto Scaling tag.

Verify add-on installation

  1. On the Clusters page, click the name of your cluster. In the left navigation pane, click Components and Add-ons.

  2. On the Add-ons page, ensure the ACK GOATScaler add-on status is Installed.

Node instant scaling key events

The node instant scaling feature generates these events:

Event name                      Object       Description
ProvisionNode                   pod          Node scale-out triggered successfully.
ProvisionNodeFailed             pod          Node scale-out failed to trigger.
ResetPod                        pod          An unschedulable pod that triggered a scale-out is requeued for retry.
InstanceInventoryStatusChanged  ACKNodePool  Emitted when available inventory changes for configured instance types.

See View the health status of node instant scaling.
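One way to review these events is to export the cluster events as JSON (for example, with `kubectl get events -A -o json`) and group them by event name. The sketch below assumes that the event names listed above appear in the `reason` field of the Kubernetes Event objects; that mapping is an assumption, and `group_scaling_events` is a hypothetical helper, not part of any ACK tooling.

```python
from collections import defaultdict

# Event names taken from the list above.
INSTANT_SCALING_REASONS = {
    "ProvisionNode",
    "ProvisionNodeFailed",
    "ResetPod",
    "InstanceInventoryStatusChanged",
}


def group_scaling_events(event_list: dict) -> dict:
    """Group node instant scaling events by reason.

    event_list: parsed output of `kubectl get events -A -o json`.
    Returns {reason: ["Kind/name", ...]}.
    """
    grouped = defaultdict(list)
    for event in event_list.get("items", []):
        reason = event.get("reason")
        if reason in INSTANT_SCALING_REASONS:
            obj = event.get("involvedObject", {})
            grouped[reason].append(f"{obj.get('kind')}/{obj.get('name')}")
    return dict(grouped)


# Hand-written sample for illustration, not real cluster output.
sample_events = {
    "items": [
        {"reason": "ProvisionNode", "involvedObject": {"kind": "Pod", "name": "web-1"}},
        {"reason": "Scheduled", "involvedObject": {"kind": "Pod", "name": "web-2"}},
        {"reason": "ProvisionNodeFailed", "involvedObject": {"kind": "Pod", "name": "web-3"}},
    ]
}
print(group_scaling_events(sample_events))
```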

Node instant scaling identifiers

These identifiers are system-managed. Modifying them may cause unexpected scaling behavior.

Node labels

  • goatscaler.io/managed:true or k8s.aliyun.com: true: Identifies nodes managed by node instant scaling. Used to evaluate scale-in conditions.

  • goatscaler.io/provision-task-id:{task-id}: The scale-out task ID. Used for tracing.

Node taints

  • goatscaler.io/node-terminating: Nodes with this taint are marked for scale-in.

Pod annotations

  • goatscaler.io/provision-task-id: The scale-out task ID for this pod. The system waits for the node to start before triggering more scale-outs.

  • goatscaler.io/reschedule-deadline: Pod scheduling timeout. If still unschedulable past this deadline, the pod is requeued and can trigger another scale-out.
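For read-only inspection, the node label and taint above can be used to classify nodes from the output of `kubectl get nodes -o json`. The sketch below is a minimal example under that assumption. `classify_node` and its return values are hypothetical, and it only reads the identifiers; it never modifies them.

```python
MANAGED_LABEL = "goatscaler.io/managed"
TERMINATING_TAINT = "goatscaler.io/node-terminating"


def classify_node(node: dict) -> str:
    """Classify one Node object (parsed JSON) by its instant scaling identifiers.

    Returns "terminating" if the scale-in taint is present, "managed" if the
    node carries the managed label, and "unmanaged" otherwise.
    """
    labels = node.get("metadata", {}).get("labels") or {}
    taints = node.get("spec", {}).get("taints") or []
    if any(t.get("key") == TERMINATING_TAINT for t in taints):
        return "terminating"
    if labels.get(MANAGED_LABEL) == "true":
        return "managed"
    return "unmanaged"


# Hand-written Node fragments for illustration, not real cluster output.
managed = {"metadata": {"labels": {MANAGED_LABEL: "true"}}, "spec": {}}
leaving = {
    "metadata": {"labels": {MANAGED_LABEL: "true"}},
    "spec": {"taints": [{"key": TERMINATING_TAINT, "effect": "NoSchedule"}]},
}
print(classify_node(managed), classify_node(leaving), classify_node({}))
```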

FAQ

Category

Subcategory

Link

Node instant scaling behavior

Known limitations

Scale-out behavior

Scale-in behavior

Custom scaling behavior

Controlling scaling with pods

How can I control node scale-in by using pods?

Controlling scaling with nodes

Node instant scaling add-on

Related operations

View health status

The node instant scaling feature selects instance types and availability zones based on ECS inventory. Check the node pool ConfigMap to monitor inventory health and get instance type recommendations.

See View the health status of node instant scaling.

Enable log collection

For ACK managed clusters, collect ACK GOATScaler logs from the Control Plane Component Logs page.

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Operations > Log Center.

  3. On the Control Plane Component Logs tab, click Update Component and select ACK GOATScaler.

    In the Update Component dialog box, select ack-goatscaler and click OK.

    After the update, select ACK GOATScaler from the drop-down list to view logs.

Upgrade ACK GOATScaler

Keep ACK GOATScaler updated for the latest features. See Manage components.

Skip inventory checks for private pools

If you use a private pool for guaranteed capacity, enable SkipInventoryCheck to let ACK GOATScaler bypass inventory checks and use private pool resources directly.

  1. On the Clusters page, click the name of your cluster. In the left navigation pane, click Components and Add-ons.

  2. On the Core Component page, find ACK GOATScaler and click Configuration.

    Requires ACK GOATScaler v0.3.0-582e405-aliyun or later.

  3. Set SkipInventoryCheck to true.