Serverless elasticity in virtual nodes

When you need to create many pods quickly, scaling out ECS nodes can be too slow. Reserving extra ECS nodes can also waste resources. Virtual nodes allow you to schedule pods to run directly as Elastic Container Instance (ECI) pods. This eliminates the need to reserve or maintain a fixed resource pool, which provides elasticity and saves costs.

Why use virtual nodes

What is a virtual node?

Nodes are the basic units that provide compute and storage resources to run workloads in ACK clusters. Most ACK clusters have at least one ECS node pool. After a pod is created, the scheduler assigns the pod to an ECS node in the node pool. This scheduling mode is suitable for applications with stable traffic. However, it cannot handle traffic spikes effectively because creating and starting ECS instances takes time, even with the auto scaling capabilities of ACK. Virtual nodes allow you to schedule pods to run directly on Alibaba Cloud Elastic Container Instance (ECI). This simplifies node operations and maintenance (O&M), eliminates idle nodes, and reduces costs.

Important

Unlike ECS nodes, virtual nodes do not support custom labels, annotations, or taints.

Virtual nodes use the ack-virtual-node component to abstract computing resources. You can deploy workloads directly without managing the underlying infrastructure. The ack-virtual-node component automatically schedules application pods to run on ECI. ECI is a serverless container service where one ECI instance is equivalent to one pod. To deploy container applications on ECI, you only need to provide a container image. You pay only for the resources that your containers consume.

Benefits

Virtual nodes provide the following benefits.

  • O&M-free: You do not need to manage the underlying resource pool, which reduces your O&M workload. As managed resources, virtual nodes eliminate routine Kubernetes node O&M tasks, such as system upgrades and security patching.

  • Ultra-large capacity: Scale out to a maximum of 50,000 pods without capacity planning in advance.

    Important

    If many pods are associated with services, we recommend that you keep the number of pods below 20,000.

  • Elasticity in seconds: Create thousands of pods in seconds. This ensures that sudden traffic spikes do not affect your services due to pod creation delays.

  • Security isolation: Pods are created based on ECI. Each container instance is strongly isolated from others using lightweight virtualization and sandboxed container technology. Container instances do not affect each other.

  • Cost-effective: Applications are created on demand and use the pay-as-you-go billing method. You are not charged for idle resources, which eliminates waste. The serverless architecture also leads to lower O&M costs.

Scenarios

Based on their features and benefits, virtual nodes are ideal for the following scenarios.

  • Online services

    For online services that often experience traffic spikes, such as online education and e-commerce, virtual nodes support scaling in seconds. This prevents system failures caused by delayed scaling during traffic surges and avoids resource waste.

  • Data processing

    For highly concurrent online data processing with engines such as Spark and Presto, the cost of underlying resources no longer limits the concurrency of your tasks. You can launch thousands of pods in a short time to meet the demands of online big data processing.

  • AI tasks

    For AI tasks that do not need to run continuously and require a large amount of computing resources, such as model training and model inference, you do not need to reserve resources. You can use resources on demand and pay by the second to reduce the cost of these tasks. Virtual nodes also support elasticity in seconds to respond quickly to sudden task demands.

  • CI/CD test environments

    For batch testing tasks in your CI/CD process, such as CI packaging, stress tests, and simulation tests, you can use virtual nodes to create and release container instances at any time. This supports on-demand usage and per-second billing, which provides low-cost, large-scale resource provisioning.

  • Jobs and CronJobs

    Jobs and CronJobs do not need to run continuously. After a Job is complete, it terminates automatically and the corresponding pod is deleted. With virtual nodes, billing stops and resources are released automatically after a task is complete, which avoids resource waste.

Limits

You can use virtual nodes in ACK Edge clusters of version 1.28 and later. Before you start, understand the following limits.

  • DaemonSets are not supported. You can replace DaemonSets with sidecar containers.

  • You cannot specify HostPath or HostNetwork in pod manifests.

  • Privileged containers are not supported. You can use a security context to add capabilities to a pod.

    Note

    The privileged container feature is in internal preview. To use this feature, submit a ticket.

  • NodePort Services and the Session Affinity feature are not supported.

  • The China South Finance and Alibaba Gov Cloud regions are not supported.
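The manifest-level limits above can be checked before you deploy. The following is a minimal sketch, not an official tool: the function name `find_limit_violations`, the sample pod, and the image name are hypothetical, and the check relies only on standard Kubernetes manifest fields (`hostNetwork`, `hostPath` volumes, `securityContext.privileged`, Service `type` and `sessionAffinity`). It inspects top-level Pod, Service, and DaemonSet manifests only and does not cover the region limit.

```python
from typing import List


def find_limit_violations(manifest: dict) -> List[str]:
    """Return a list of violations; an empty list means none were found."""
    kind = manifest.get("kind")
    spec = manifest.get("spec", {})
    violations = []

    if kind == "DaemonSet":
        violations.append("DaemonSets are not supported; consider a sidecar container")
    elif kind == "Service":
        if spec.get("type") == "NodePort":
            violations.append("NodePort Services are not supported")
        if spec.get("sessionAffinity", "None") != "None":
            violations.append("Session Affinity is not supported")
    elif kind == "Pod":
        if spec.get("hostNetwork"):
            violations.append("spec.hostNetwork is set")
        for volume in spec.get("volumes", []):
            if "hostPath" in volume:
                violations.append(f"volume {volume.get('name')!r} uses hostPath")
        containers = spec.get("initContainers", []) + spec.get("containers", [])
        for container in containers:
            security = container.get("securityContext") or {}
            if security.get("privileged"):
                violations.append(f"container {container.get('name')!r} is privileged")

    return violations


# Hypothetical pod: instead of running privileged, it adds a single
# capability through the security context, as the limits suggest.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo"},
    "spec": {
        "containers": [{
            "name": "app",
            "image": "registry.example.com/demo/app:latest",  # hypothetical image
            "securityContext": {"capabilities": {"add": ["NET_ADMIN"]}},
        }],
    },
}

print(find_limit_violations(pod))
```

A check like this could run in a CI step so that unsupported manifests are caught before they reach the cluster.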

Billing

Virtual nodes are free of charge. You are charged for the ECI pods that run on virtual nodes based on the billing rules of ECI. For more information, see ECI billing overview.

Note

ECI pods use the pay-as-you-go billing method. Billing starts when a pod enters the Pending state and stops when it enters the Succeeded or Failed state. For more information, see ECI pod lifecycle.
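To make the billing window concrete, the sketch below computes the billable duration of one pod from its phase history, using the rule stated in the note (from Pending to Succeeded or Failed). The function name `billable_seconds` and the timestamps are illustrative assumptions. Unit prices and any rounding rules are not covered here; see the ECI billing documentation for those.

```python
from datetime import datetime
from typing import List, Tuple

TERMINAL_PHASES = {"Succeeded", "Failed"}


def billable_seconds(transitions: List[Tuple[datetime, str]]) -> int:
    """Whole seconds between the first Pending phase and the first terminal phase.

    `transitions` is a chronologically ordered list of (timestamp, phase).
    Raises ValueError if the pod never entered Pending or has not finished.
    """
    start = next((ts for ts, phase in transitions if phase == "Pending"), None)
    if start is None:
        raise ValueError("pod never entered Pending")
    end = next(
        (ts for ts, phase in transitions if phase in TERMINAL_PHASES and ts >= start),
        None,
    )
    if end is None:
        raise ValueError("pod has not reached Succeeded or Failed")
    return int((end - start).total_seconds())


# Illustrative phase history for a short-lived Job pod.
history = [
    (datetime(2024, 1, 1, 12, 0, 0), "Pending"),
    (datetime(2024, 1, 1, 12, 0, 20), "Running"),
    (datetime(2024, 1, 1, 12, 5, 0), "Succeeded"),
]
print(billable_seconds(history))  # 300
```

Note that the 20 seconds spent in Pending count toward the total, which is why faster image pulls can lower the cost of short tasks.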

Quick start

For more information about the basic usage of virtual nodes, see Schedule a pod to run on an ECI instance.

Related operations

When you upgrade a cluster, the system automatically checks whether the ECI Platform Version is compatible with the target Kubernetes version. If an ECI pod uses an incompatible platform version, you must manually delete and recreate the pod before you can upgrade the cluster. To avoid this interruption, confirm compatibility before you start the upgrade. For more information, see Upgrade the ECI Platform Version.

| Supported operations | Description | References |
| --- | --- | --- |
| Flexibly configure pods | You can configure ECI pods in batches at the cluster level by creating an ECI Profile configuration file. This file is a ConfigMap named eci-profile. For example, you can specify a security group and a vSwitch. The vSwitch determines the zone where the ECI pod is located. After the configuration is updated, the changes take effect on new ECI pods immediately without a restart. For existing ECI pods, the changes take effect after a rolling deployment. | Configure eci-profile |
| | You can use pod annotations to configure some ECI features. For example, you can specify an ECI instance type, enable image caching to accelerate pod creation, assign an IPv6 address to an ECI pod, or increase the temporary storage space. | ECI pod annotations |
| Schedule pods to virtual nodes | ACK provides multiple scheduling solutions. You can specify that application pods are scheduled only to virtual nodes. You can also specify that pods are preferentially scheduled to ECS nodes that use the subscription or pay-as-you-go billing methods. If ECS node resources are insufficient, the pods are then scheduled to virtual nodes. This also allows for scale-in in reverse order. For more information about how to select a scheduling policy, see Schedule pods to virtual nodes. | |
| Schedule pods to a specified OS or architecture | By default, an ACK cluster schedules workload pods to virtual nodes with the x86 architecture. If x86 node resources are insufficient, the pods wait for available x86 node resources. You can also schedule workload pods to virtual nodes that use the Arm architecture. | Schedule pods to Arm-based virtual nodes |
| | If your containers need to run in a Windows environment, you can add a Windows virtual node to the cluster and schedule pods to that node. | (Invitational Preview) Schedule pods to Windows virtual nodes |
| Virtual node best practices | You can run Job tasks on virtual nodes to handle peak pressure on cluster compute resources with minimal O&M costs. You do not need to adjust the number of nodes. | Run Job tasks based on ECI |
| | You can run Spark jobs on Elastic Container Instance (ECI) in an ACK cluster. Using elastic ECI resources and configuring appropriate scheduling policies, you can create ECI pods on demand and pay only for the resources that you use. This reduces costs from idle resources and lets you run Spark jobs more cost-effectively. | Use elastic ECI resources to run Spark jobs |
| | You can use the ACK Virtual Node component to automatically inject sidecar containers only into pods that are scheduled to virtual nodes. This decouples the sidecar containers of virtual node pods from application containers. | Inject a sidecar container into a pod on a virtual node |
| | You can modify the Prometheus monitoring configuration to collect metrics from specified virtual nodes. | Collect metrics from specified virtual nodes |
| | Virtual nodes support service discovery. Intranet services, headless services, and ClusterIP services are supported. | Service discovery for virtual nodes based on Alibaba Cloud DNS PrivateZone |
| Virtual node FAQ | Frequently asked questions about using virtual nodes. | Virtual node FAQ |
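
As an illustration of the pod-level configuration and scheduling operations listed above, the following sketch builds a pod manifest that requests scheduling to a virtual node and sets two ECI features through annotations. Everything specific in it is an assumption that this page does not confirm: the label `alibabacloud.com/eci`, the annotation keys `k8s.aliyun.com/eci-use-specs` and `k8s.aliyun.com/eci-image-cache`, their values, the pod name, and the image. Verify the exact keys against ECI pod annotations and Schedule pods to virtual nodes before you use them.

```python
import json
from typing import Dict, Optional


def build_pod(
    name: str,
    image: str,
    labels: Optional[Dict[str, str]] = None,
    annotations: Optional[Dict[str, str]] = None,
) -> dict:
    """Build a single-container Pod manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "labels": dict(labels or {}),
            "annotations": dict(annotations or {}),
        },
        "spec": {"containers": [{"name": name, "image": image}]},
    }


pod = build_pod(
    name="eci-demo",                               # hypothetical name
    image="registry.example.com/demo/app:latest",  # hypothetical image
    # ASSUMPTION: label that asks ACK to run the pod on a virtual node.
    labels={"alibabacloud.com/eci": "true"},
    annotations={
        # ASSUMPTION: annotation key and value format for the instance spec.
        "k8s.aliyun.com/eci-use-specs": "2-4Gi",
        # ASSUMPTION: annotation key that enables image cache matching.
        "k8s.aliyun.com/eci-image-cache": "true",
    },
)

# kubectl accepts JSON manifests as well as YAML.
print(json.dumps(pod, indent=2))
```

If the script is saved as `build_pod.py` (a hypothetical file name), its output can be piped to `kubectl apply -f -`. Settings that should apply to every ECI pod in the cluster, such as the security group and vSwitch, belong in the eci-profile ConfigMap instead of per-pod annotations.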