ACK GOATScaler

ACK GOATScaler is ACK's next-generation elastic scaling component for real-time node elasticity. It focuses on improving scale-out success rates and providing customizable elasticity policies. This document describes the features, usage instructions, and release notes for ACK GOATScaler.

Introduction

The component continuously monitors for Pods that cannot be scheduled due to insufficient resources. It then intelligently scales out by selecting the optimal instance types across multiple availability zones based on your node pool configuration. It also identifies and scales in long-idle nodes, helping you reduce node resource costs while ensuring application elasticity.

Compared to the previous generation of real-time node elasticity, ACK GOATScaler delivers significant enhancements in scale-out success rates, customizable scale-out policies, and throughput for large-scale clusters. It is ideal for scenarios that involve node pools with multiple instance types, require low scale-out latency, or need to control the scale-out order based on business priorities.

Usage

To learn how to enable and configure real-time node elasticity in your cluster, see Enable instant node scaling.

Note

New features in this component are explicitly enabled using parameters, labels, or annotations. When not explicitly enabled, the component's behavior remains consistent with previous versions, ensuring smooth upgrades for existing clusters.

Release notes

May 2026

Each entry below lists the version, its changes, the release date, and the upgrade impact.

v0.6.1

New features

  • Adds the batch scaling (Batch Scale) feature. The component aggregates mergeable scaling requests into batches and issues them concurrently, which significantly improves overall scaling efficiency in large-scale scaling scenarios. This feature is controlled by the scale-by-batch switch and is enabled by default. When enabled, single-node scaling is also handled by this mechanism as a special case.

Optimizations and enhancements

  • Enhanced node image matching: During scale-out, the component matches the node pool image precisely by using the image ID in the node label alibabacloud.com/os-image-id. It also recognizes system image families such as Alibaba Cloud Linux 4. Together, these changes make the assessment of inventory and instance type availability more accurate. The label is maintained by the system. Do not modify or delete it manually.

  • Periodic auto-sync of node pool configurations: The component periodically syncs changes to node pool configurations, such as availability zone, instance type, image, and preemption policy. These changes take effect automatically without restarting the component, reducing the risk of configuration drift.

  • Concurrent scale-out conflict handling: The component automatically recognizes concurrent conflict error codes from Auto Scaling, such as Operation.Conflict, and performs a brief backoff on the affected node pool before retrying. This process avoids invalid requests and improves stability in high-concurrency scale-out scenarios.
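
The conflict handling described above can be pictured as a short retry loop. The following is a minimal sketch, not the component's actual implementation: ScalingError, scale_out_with_backoff, the attempt limit, and the backoff duration are all hypothetical names and values chosen for illustration. Only the Operation.Conflict error code comes from the release note.

```python
import time

# The only conflict code named in the release note; others may exist.
CONFLICT_CODES = {"Operation.Conflict"}


class ScalingError(Exception):
    """Hypothetical error type that carries an Auto Scaling error code."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def scale_out_with_backoff(request_scale_out, max_attempts=3,
                           backoff_seconds=2.0, sleep=time.sleep):
    """Call request_scale_out(), pausing briefly after each conflict error.

    The attempt limit and backoff duration are assumed values.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request_scale_out()
        except ScalingError as err:
            # Non-conflict errors and the final failed attempt are re-raised.
            if err.code not in CONFLICT_CODES or attempt == max_attempts:
                raise
            sleep(backoff_seconds)
```

Passing sleep as a parameter keeps the sketch easy to exercise without real waiting. In the component itself, the backoff applies per affected node pool.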

Security and maintenance

  • Upgrades the Go version used by the component to 1.25 to improve stability.

New parameters

  • scale-by-batch: The batch scaling switch. The value can be true or false. true indicates that batch scaling is used, and false indicates that node-by-node scale-out is used. The default value is true.

  • max-nodes-per-batch: The maximum number of nodes per batch. A value of 0 indicates no limit. The value must be a non-negative integer. The default value is 500.
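
To make the max-nodes-per-batch semantics concrete, here is a minimal sketch of how a scale-out of a given size might be divided into batches. It assumes that a request larger than the limit is split into several batches rather than capped, which the release note does not state. The function name split_into_batches is hypothetical.

```python
def split_into_batches(node_count, max_nodes_per_batch=500):
    """Return the batch sizes for a scale-out of node_count nodes.

    max_nodes_per_batch mirrors the documented parameter: a non-negative
    integer where 0 means no limit. Splitting oversized requests into
    several batches is an assumption of this sketch.
    """
    if node_count < 0 or max_nodes_per_batch < 0:
        raise ValueError("node_count and max_nodes_per_batch must be non-negative")
    if node_count == 0:
        return []
    if max_nodes_per_batch == 0:
        # 0 means "no limit": everything goes into one batch.
        return [node_count]
    full, rest = divmod(node_count, max_nodes_per_batch)
    return [max_nodes_per_batch] * full + ([rest] if rest else [])
```

Under these assumptions, a single-node scale-out is simply a batch of size 1, which matches the note that single-node scaling is handled as a special case of batch scaling.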

Release date: May 29, 2026

Impact: Batch scaling is enabled by default. Single-node scale-out is treated as a special case of batch scaling. The results of regular scaling operations remain consistent with v0.6.0. To revert to node-by-node scale-out, you can set scale-by-batch=false. Internal optimizations, such as automatic synchronization of node pool configurations, enhanced image matching, and conflict handling for concurrent scale-out operations, do not affect the existing scaling semantics.

v0.6.0

New features

  • Support for multiple instance types in a single scaling activity: Starting from v0.6.0, each scaling activity can include multiple candidate instance types (up to 20, limited by Auto Scaling). Auto Scaling then selects the optimal type based on real-time availability to create the instance. This significantly improves the scale-out success rate and reduces Pod waiting time for node pools with multiple instance types or insufficient inventory. This feature is automatically enabled for existing node pools and requires no extra configuration.

  • Customizable node pool scale-out strategy (Expander): Adds the expander parameter to configure the selection strategy among multiple candidate node pools for scale-out operations. The parameter supports two values: default (overall scoring strategy) and priority (scale-out based on user-defined node pool priorities).

    In priority mode, node pool priorities are configured in the ConfigMap named cluster-autoscaler-priority-expander in the kube-system namespace. A higher value indicates a higher priority. Matching entries must be node pool IDs (strings that start with np). Configuration example:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-autoscaler-priority-expander
      namespace: kube-system
    data:
      priorities: |
        # Replace with your actual node pool IDs (strings that start with np)
        10:
          - np42fa9597**********
        50:
          - npaf8e834f**********
    
  • Instance Type Priority: You can configure instance type priority for a node pool. This feature prioritizes specified instance types for scale-out when multiple instance types are used. To enable it, add the label goatscaler.io/instance-type-priority-enabled: "true" to the node pool.
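
For the priority expander described above, the selection rule "a higher value indicates a higher priority" can be sketched as follows. This is an illustration only: pick_node_pool is a hypothetical function, the priorities are passed as an already parsed mapping, and the node pool IDs are made up. How the component treats candidates that are missing from the ConfigMap, or ties within one priority level, is not documented here, so the sketch returns None in the first case and keeps the first match in the second.

```python
def pick_node_pool(priorities, candidates):
    """Return the candidate node pool ID with the highest priority, or None.

    priorities maps an integer priority to a list of node pool IDs,
    mirroring the structure of the ConfigMap's priorities field.
    """
    best_pool, best_priority = None, None
    for priority, pool_ids in priorities.items():
        for pool_id in pool_ids:
            if pool_id not in candidates:
                continue
            # A higher value wins; on a tie the first match is kept.
            if best_priority is None or priority > best_priority:
                best_pool, best_priority = pool_id, priority
    return best_pool


# Made-up node pool IDs (real IDs are strings that start with np).
example_priorities = {10: ["nplow0001"], 50: ["nphigh0001"]}
```

With example_priorities, a scale-out that could use either pool would select nphigh0001, because 50 is greater than 10.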

Optimizations and enhancements

  • Optimized and configurable scale-out candidate sorting strategy: The default sorting logic for candidate instance types and availability zones during scale-out has been optimized. The new logic is more effective in terms of balance between zones, inventory, and resource capacity, resulting in a higher overall scale-out success rate and better resource distribution. This logic is also made configurable through the score-policies parameter, which you can adjust based on your business requirements. The dimensions are arranged in ascending order of priority: balance between zones (balance-between-zones), inventory (inventory), resource capacity (resource-capacity), and custom priority (custom-priority). Dimensions listed later have higher priority. A lower-priority dimension is considered only when the scores for all higher-priority dimensions are the same.

    Additionally, when a Pod matches a ResourcePolicy and triggers the Custom Elastic Resource Priority Scheduling policy, the system automatically enables an internal policy named resource-policy. This policy has a higher priority than all other dimensions and acts as the highest-priority sorting criterion to ensure that the scale-out selection is consistent with the resource order defined in the ResourcePolicy. This policy works in conjunction with the scheduler, takes effect automatically, and cannot be configured by using score-policies.

  • Optimized resources and throughput for large-scale clusters: This version reduces the controller's memory footprint and improves concurrent scale-out throughput to better support large-scale clusters.

  • Enhanced handling of inventory and VSwitch IP addresses: The component now checks the inventory of each instance type independently per availability zone, preventing an entire type from being mistakenly marked as unavailable due to a stockout in a single zone. When a VSwitch runs out of available IP addresses, it enters a cooldown state faster to reduce invalid retries.

  • More detailed scale-out results: The scale-out API response now includes more detailed execution information for easier troubleshooting and verification.
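
The per-zone inventory check mentioned above can be illustrated by tracking stockouts per (instance type, zone) pair instead of per instance type. The class below is a hypothetical sketch with made-up type and zone names. It does not model the VSwitch cooldown or how stockout records expire.

```python
class InventoryTracker:
    """Tracks stockouts per (instance type, zone) pair, not per type."""

    def __init__(self):
        self._out_of_stock = set()

    def mark_out_of_stock(self, instance_type, zone):
        # Record the stockout for this zone only.
        self._out_of_stock.add((instance_type, zone))

    def available_zones(self, instance_type, zones):
        """Return the zones where instance_type is not marked out of stock."""
        return [z for z in zones if (instance_type, z) not in self._out_of_stock]
```

A stockout of one type in one zone therefore leaves the same type usable in the other zones.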

New parameters

  • expander: The policy for selecting a node pool from multiple candidate node pools during a scale-out. The value can be default or priority. default selects a node pool based on a comprehensive scoring policy. priority selects a node pool based on user-defined node pool priorities. The default value is default.

  • score-policies: The sorting policy for scale-out candidates, which controls the sorting dimensions and their priorities. The later a dimension appears in the list, the higher its priority. The value is a comma-separated combination of the following dimensions: balance-between-zones (balance between zones), inventory (inventory), resource-capacity (resource capacity), and custom-priority (custom priority). If this parameter is not set, a default combination is used.
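
The tie-breaking order that score-policies describes can be sketched as a lexicographic sort in which the last dimension in the list is compared first. This is a minimal illustration under stated assumptions: rank_candidates is a hypothetical function, the candidate names and per-dimension scores are invented, and a higher score is assumed to be better. The internal resource-policy dimension is not modeled.

```python
def rank_candidates(candidates, score_policies):
    """Sort candidates so that later dimensions in score_policies win first.

    candidates is a list of dicts with a "scores" mapping per dimension.
    A lower-priority dimension only matters when all higher-priority
    dimensions are tied, which tuple comparison gives directly.
    """
    dimensions = [d.strip() for d in score_policies.split(",") if d.strip()]
    # Later dimensions have higher priority, so compare them first.
    order = list(reversed(dimensions))
    return sorted(
        candidates,
        key=lambda c: tuple(c["scores"][d] for d in order),
        reverse=True,  # assumption: a higher score is better
    )


# Example value using the dimension order listed in the release note.
example_policies = "balance-between-zones,inventory,resource-capacity,custom-priority"
example_candidates = [
    {"name": "type-a/zone-1", "scores": {"balance-between-zones": 3, "inventory": 1,
                                         "resource-capacity": 5, "custom-priority": 1}},
    {"name": "type-b/zone-1", "scores": {"balance-between-zones": 1, "inventory": 1,
                                         "resource-capacity": 7, "custom-priority": 1}},
    {"name": "type-c/zone-2", "scores": {"balance-between-zones": 0, "inventory": 0,
                                         "resource-capacity": 0, "custom-priority": 2}},
]
```

In this example, type-c/zone-2 ranks first because custom-priority is the last dimension in the list. The other two candidates tie on custom-priority, so resource-capacity decides between them.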

Release date: May 27, 2026

Impact: Multi-instance type scale-out is an internal enhancement that only improves the scale-out success rate without changing the instance type range or configuration of the node pool. The default sorting of scale-out candidates is optimized for general scenarios. Existing clusters automatically benefit from this improvement after an upgrade, with no configuration required. If you have specific sorting requirements, you can use score-policies to make adjustments as needed.