When you need to create many pods quickly, scaling out ECS nodes can be too slow. Reserving extra ECS nodes can also waste resources. Virtual nodes allow you to schedule pods to run directly as Elastic Container Instance (ECI) pods. This eliminates the need to reserve or maintain a fixed resource pool, which provides elasticity and saves costs.
Why use virtual nodes
What is a virtual node?
Nodes are the basic units that provide compute and storage resources to run workloads in ACK clusters. Most ACK clusters have at least one ECS node pool. After a pod is created, the Kubernetes scheduler assigns the pod to an ECS node in the node pool. This scheduling mode is suitable for applications with stable traffic. However, it cannot handle traffic spikes effectively because creating and starting ECS instances is time-consuming, even with ACK's auto scaling capabilities. Virtual nodes allow you to directly schedule pods to run on Alibaba Cloud Elastic Container Instance (ECI). This simplifies node operations and maintenance (O&M), eliminates idle nodes, and reduces costs.
Unlike ECS nodes, virtual nodes do not support custom labels, annotations, or taints.
Virtual nodes use the ack-virtual-node component to abstract computing resources. You can deploy workloads directly without managing the underlying infrastructure. The ack-virtual-node component automatically schedules application pods to run on ECI. ECI is a serverless container service where one ECI instance is equivalent to one pod. To deploy container applications on ECI, you only need to provide a container image. You pay only for the resources that your containers consume.
Benefits
Virtual nodes provide the following benefits.
O&M-free: You do not need to manage the underlying resource pool, which reduces your O&M workload. As managed resources, virtual nodes eliminate routine Kubernetes node O&M tasks, such as system upgrades and security patching.
Ultra-large capacity: Scale out to a maximum of 50,000 pods without capacity planning in advance.
Important: If many pods are associated with services, we recommend that you keep the number of pods below 20,000.
Elasticity in seconds: Create thousands of pods in seconds. This ensures that sudden traffic spikes do not affect your services due to pod creation delays.
Security isolation: Pods are created based on ECI. Each container instance is strongly isolated from others using lightweight virtualization and sandboxed container technology. Container instances do not affect each other.
Cost-effective: Applications are created on demand and use the pay-as-you-go billing method. You are not charged for idle resources, which eliminates waste. The serverless architecture also leads to lower O&M costs.
Scenarios
Based on their features and benefits, virtual nodes are ideal for the following scenarios.
Online services
For online services that often experience traffic spikes, such as online education and e-commerce, virtual nodes support scaling in seconds. This prevents system failures caused by delayed scaling during traffic surges and avoids resource waste.
Data processing
For highly concurrent online data processing workloads, such as Spark and Presto, the cost of underlying resources no longer caps the concurrency of your tasks. You can launch thousands of pods in a short time to meet the demands of online big data processing.
AI tasks
For AI tasks that do not need to run continuously and require a large amount of computing resources, such as model training and model inference, you do not need to reserve resources. You can use resources on demand and pay by the second to reduce the cost of these tasks. Virtual nodes also support elasticity in seconds to respond quickly to sudden task demands.
CI/CD test environments
For batch testing tasks in your CI/CD process, such as CI packaging, stress tests, and simulation tests, you can use virtual nodes to create and release container instances at any time. This supports on-demand usage and per-second billing, which provides low-cost, large-scale resource provisioning.
Jobs and CronJobs
Tasks such as Jobs and CronJobs do not need to run continuously. After a Job is complete, it automatically terminates, and the corresponding pod is deleted. Virtual nodes support automatic billing suspension and resource release after a task is complete to avoid resource waste.
Limits
You can use virtual nodes in ACK Edge clusters of version 1.28 and later. Before you start, understand the following limits.
DaemonSets are not supported. You can replace DaemonSets with sidecar containers.
You cannot specify HostPath or HostNetwork in pod manifests.
Privileged containers are not supported. You can use a security context to add capabilities to a pod.
Note: The privileged container feature is in internal preview. To use this feature, submit a ticket.
NodePort Services and the Session Affinity feature are not supported.
The China South Finance and Alibaba Gov Cloud regions are not supported.
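The manifest-level limits above can be checked mechanically before a workload is applied. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it relies only on standard Kubernetes manifest fields (`kind`, `spec.hostNetwork`, `spec.volumes[].hostPath`, `securityContext.privileged`, `spec.type`, `spec.sessionAffinity`). It does not cover the region or cluster version limits.

```python
from typing import Any


def _pod_spec(manifest: dict[str, Any]) -> dict[str, Any]:
    """Locate the pod spec inside a Pod, a CronJob, or a templated workload."""
    spec = manifest.get("spec") or {}
    if manifest.get("kind") == "Pod":
        return spec
    if manifest.get("kind") == "CronJob":
        spec = (spec.get("jobTemplate") or {}).get("spec") or {}
    # Deployment, StatefulSet, Job, DaemonSet: pod spec is under template.spec
    return (spec.get("template") or {}).get("spec") or {}


def find_virtual_node_violations(manifest: dict[str, Any]) -> list[str]:
    """Return the limits from the list above that this manifest would hit."""
    problems: list[str] = []
    kind = manifest.get("kind", "")
    spec = manifest.get("spec") or {}

    if kind == "Service":
        if spec.get("type") == "NodePort":
            problems.append("NodePort Services are not supported")
        # Kubernetes uses the string "None" as the default affinity mode
        if spec.get("sessionAffinity", "None") != "None":
            problems.append("Session Affinity is not supported")
        return problems

    if kind == "DaemonSet":
        problems.append("DaemonSets are not supported")

    pod_spec = _pod_spec(manifest)
    if pod_spec.get("hostNetwork") is True:
        problems.append("HostNetwork is not supported")
    if any("hostPath" in volume for volume in pod_spec.get("volumes") or []):
        problems.append("HostPath volumes are not supported")

    containers = (pod_spec.get("containers") or []) + (
        pod_spec.get("initContainers") or []
    )
    for container in containers:
        security_context = container.get("securityContext") or {}
        if security_context.get("privileged") is True:
            problems.append(
                f"Privileged container '{container.get('name')}' is not supported"
            )
    return problems
```

A check like this fits naturally into a CI step that loads each manifest (for example, with a YAML parser) and fails the build if the returned list is not empty.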
Billing
Virtual nodes are free of charge. You are charged for the ECI pods that run on virtual nodes based on the billing rules of ECI. For more information, see ECI billing overview.
ECI pods use the pay-as-you-go billing method. Billing starts when a pod enters the Pending state and stops when it enters the Succeeded or Failed state. For more information, see ECI pod lifecycle.
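To make the billing window concrete, the sketch below computes the billable duration of a pod from the time it enters the Pending state to the time it enters the Succeeded or Failed state. The function names and the `price_per_second` parameter are hypothetical placeholders. Actual unit prices and rounding rules come from the ECI billing rules and are not modeled here.

```python
from datetime import datetime, timezone


def billable_seconds(pending_at: datetime, finished_at: datetime) -> int:
    """Whole seconds between entering Pending and entering Succeeded/Failed."""
    if finished_at < pending_at:
        raise ValueError("finished_at must not be earlier than pending_at")
    return int((finished_at - pending_at).total_seconds())


def estimate_cost(seconds: int, price_per_second: float) -> float:
    """Rough estimate; price_per_second is a placeholder, not a real ECI price."""
    return seconds * price_per_second


if __name__ == "__main__":
    # Hypothetical timestamps for a short-lived Job pod
    pending = datetime(2024, 1, 1, 10, 0, 0, tzinfo=timezone.utc)
    finished = datetime(2024, 1, 1, 10, 12, 30, tzinfo=timezone.utc)
    print(billable_seconds(pending, finished))  # 750
```

The example illustrates why short-lived Jobs benefit from this model: a pod that runs for 12.5 minutes is billed for about 750 seconds rather than for a node that stays up between runs.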
Quick start
For more information about the basic usage of virtual nodes, see Schedule a pod to run on an ECI instance.
Related operations
When you upgrade a cluster, the system automatically checks for compatibility between the ECI Platform Version and Kubernetes. If an ECI pod uses a platform version that is incompatible with the target Kubernetes version, you must manually delete and recreate the pod before you can upgrade the cluster. Before you upgrade the cluster, make sure that the ECI Platform Version is compatible with the target Kubernetes version. For more information, see Upgrade the ECI Platform Version.
Supported operations | Description | References |
Flexibly configure pods | You can configure ECI pods in batches at the cluster level by creating an ECI Profile configuration file. This file is a ConfigMap named eci-profile. For example, you can specify a security group and a vSwitch. The vSwitch determines the zone where the ECI pod is located. After the configuration is updated, the changes take effect on new ECI pods immediately without a restart. For existing ECI pods, the changes take effect after a rolling deployment. | |
Flexibly configure pods | You can use pod annotations to configure some ECI features. For example, you can specify an ECI instance type, enable image caching to accelerate pod creation, assign an IPv6 address to an ECI pod, or increase the temporary storage space. | |
Schedule pods to virtual nodes | ACK provides multiple scheduling solutions. You can specify that application pods are scheduled only to virtual nodes. You can also specify that pods are preferentially scheduled to ECS nodes that use the subscription or pay-as-you-go billing methods. If ECS node resources are insufficient, the pods are then scheduled to virtual nodes. This also allows for scale-in in reverse order. For more information about how to select a scheduling policy, see Schedule pods to virtual nodes. | |
Schedule pods to a specified OS or architecture | By default, an ACK cluster schedules workload pods to virtual nodes with the x86 architecture. If x86 node resources are insufficient, the pods wait for available x86 node resources. You can also schedule workload pods to virtual nodes that use the Arm architecture. | |
Schedule pods to a specified OS or architecture | If your containers need to run in a Windows environment, you can add a Windows virtual node to the cluster and schedule pods to that node. | (Invitational Preview) Schedule pods to Windows virtual nodes |
Virtual node best practices | You can run Job tasks on virtual nodes to handle peak pressure on cluster compute resources with minimal O&M costs. You do not need to adjust the number of nodes. | |
Virtual node best practices | You can run Spark jobs on Elastic Container Instance (ECI) in an ACK cluster. Using elastic ECI resources and configuring appropriate scheduling policies, you can create ECI pods on demand and pay only for the resources that you use. This reduces costs from idle resources and lets you run Spark jobs more cost-effectively. | |
Virtual node best practices | You can use the ACK Virtual Node component to automatically inject sidecar containers only into pods that are scheduled to virtual nodes. This decouples the sidecar containers of virtual node pods from application containers. | |
Virtual node best practices | You can modify the Prometheus monitoring configuration to collect metrics from specified virtual nodes. | |
Virtual node best practices | Virtual nodes support service discovery. Intranet services, headless services, and ClusterIP services are supported. | Service discovery for virtual nodes based on Alibaba Cloud DNS PrivateZone |
Virtual node FAQ | Frequently asked questions about using virtual nodes. | |
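As an illustration of the annotation-based configuration and scheduling described in the table, the following sketch builds a pod manifest and prints it as JSON. The function name `build_eci_pod` is hypothetical. The label `alibabacloud.com/eci` and the annotations `k8s.aliyun.com/eci-use-specs` and `k8s.aliyun.com/eci-image-cache` do not appear in this topic. They are assumptions based on the ECI documentation and should be verified against the referenced topics before use. The sample value `2-4Gi` is likewise illustrative.

```python
import json
from typing import Any, Optional


def build_eci_pod(
    name: str,
    image: str,
    instance_specs: Optional[list[str]] = None,
    use_image_cache: bool = False,
) -> dict[str, Any]:
    """Build a pod manifest intended to run as an ECI pod on a virtual node."""
    annotations: dict[str, str] = {}
    if instance_specs:
        # Assumed annotation: comma-separated list of candidate instance specs
        annotations["k8s.aliyun.com/eci-use-specs"] = ",".join(instance_specs)
    if use_image_cache:
        # Assumed annotation: match an image cache to speed up pod creation
        annotations["k8s.aliyun.com/eci-image-cache"] = "true"

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Assumed label: marks the pod for scheduling to a virtual node
            "labels": {"alibabacloud.com/eci": "true"},
            "annotations": annotations,
        },
        "spec": {"containers": [{"name": name, "image": image}]},
    }


if __name__ == "__main__":
    pod = build_eci_pod(
        "demo", "nginx:1.25", instance_specs=["2-4Gi"], use_image_cache=True
    )
    print(json.dumps(pod, indent=2))
```

Because `kubectl` accepts JSON manifests, the printed output can be piped to `kubectl apply -f -` once the label and annotation keys have been confirmed for your cluster version.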