Alibaba Cloud Container Compute Service (ACS) is integrated with Container Service for Kubernetes. You can use an ACK Pro cluster to quickly access the container computing power provided by ACS. This topic describes how to use ACS computing power in your ACK cluster.
How it works
Container Compute Service (ACS) is a container service that uses Kubernetes as its user interface and provides computing resources that comply with container specifications. ACS uses a layered architecture that separates the Kubernetes control plane from the underlying container computing power. The ACS computing resource layer is responsible for scheduling and allocating resources for Pods, while Kubernetes manages application workloads such as Deployments, Services, StatefulSets, and CronJobs on top of this layer.
You can connect ACS container computing power to a Kubernetes cluster as a virtual node. This gives your cluster powerful elasticity, unconstrained by the computing capacity of its nodes. When ACS takes over the management of the underlying infrastructure for Pods, Kubernetes no longer needs to directly handle the placement and startup of individual Pods or monitor the resource status of underlying virtual machines. ACS ensures the required Pod resources are always available.
Container Service for Kubernetes (ACK) is one of the world's first certified Kubernetes platforms and provides a high-performance management service for containerized applications. It integrates with Alibaba Cloud's virtualization, storage, networking, and security capabilities to simplify cluster creation and scaling, allowing you to focus on developing and managing your containerized applications.
In an ACK Pro cluster, you must manually deploy a virtual node before creating ACS Pods. When your cluster needs to scale out, you can create ACS Pods on the virtual node on demand, without planning for node capacity. ACS Pods can communicate with Pods on regular cluster nodes. For workloads that are long-running and have fluctuating traffic, we recommend scheduling them to the virtual node. This approach maximizes resource utilization, shortens scale-out times, and reduces costs. When traffic decreases, you can quickly release these Pods to lower costs. Each Pod on a virtual node runs as an ACS instance within a secure and isolated container environment. For more information, see Overview of ACK.
Prerequisites
- If this is your first time using the service, activate the required services and grant the necessary permissions:
  - Activate Container Service for Kubernetes, grant permissions to the default roles, and activate the required cloud services. For more information, see Create an ACK Pro cluster.
  - Log on to the Container Compute Service console and follow the on-screen instructions to activate ACS.
- An ACK Pro cluster that runs Kubernetes 1.26 or later is required. For more information, see Create an ACK Pro cluster. For information about how to upgrade a cluster, see Upgrade an ACK cluster.
- For ACK Pro clusters, the virtual node component (ACK Virtual Node) must meet the version requirement that corresponds to the Kubernetes version.
Kubernetes version | ACK Virtual Node component version
1.26 or later | v2.13.0 or later
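A version requirement like the one above can be checked mechanically once you know the installed component version. The following shell sketch compares two version strings with `sort -V`. The function name `version_at_least` and the sample installed version `v2.14.1` are illustrative assumptions, not values reported by ACK.

```shell
#!/bin/sh
# version_at_least ACTUAL REQUIRED
# Succeeds (exit status 0) when ACTUAL is the same as or newer than REQUIRED.
# Relies on `sort -V` (version sort), which GNU coreutils and BusyBox provide.
version_at_least() {
    actual=$1
    required=$2
    # After a version sort, the older of the two versions comes first.
    oldest=$(printf '%s\n%s\n' "$actual" "$required" | sort -V | head -n 1)
    [ "$oldest" = "$required" ]
}

# Example with a hypothetical installed version of the ACK Virtual Node component.
if version_at_least "v2.14.1" "v2.13.0"; then
    echo "ACK Virtual Node meets the v2.13.0 requirement"
else
    echo "Upgrade ACK Virtual Node to v2.13.0 or later"
fi
```

The installed version itself is shown on the component page of the cluster. Pass it as the first argument.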
Install the ACK Virtual Node component
Perform the following steps:
- Log on to the ACK console. In the left navigation pane, click Clusters.
- On the Clusters page, click the name of your cluster. In the left navigation pane, click Components and Add-ons.
- On the Core Components tab, find the ACK Virtual Node component and click Install, or click Upgrade to upgrade it to the required version.
  You can also navigate to the Component Management page by choosing Operations > Component Management in the left-side navigation pane of the cluster details page.
- If you are prompted to activate and authorize ACS before you install ACK Virtual Node, follow the on-screen instructions. After you activate ACS and grant the required permissions, click OK to proceed with the installation.
- After the installation is complete, open the node list from the left-side navigation pane. By default, the name of the new virtual node starts with virtual-kubelet-.
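The presence of the virtual node can also be checked from the command line. The following is a minimal sketch that assumes `kubectl` is already configured for the cluster. The helper name `list_virtual_nodes` is hypothetical, and the filter relies only on the default `virtual-kubelet-` name prefix described above.

```shell
#!/bin/sh
# list_virtual_nodes
# Prints the names of nodes whose name starts with "virtual-kubelet-".
list_virtual_nodes() {
    # `-o name` prints one "node/<name>" entry per line.
    kubectl get nodes -o name |
        sed -n 's|^node/\(virtual-kubelet-.*\)$|\1|p'
}
```

Calling `list_virtual_nodes` after the installation should print at least one name. An empty result suggests that the component is not ready yet or that a non-default node name is in use.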
Example: Use ACS CPU computing power
After the ACK Virtual Node component is installed or upgraded to the version specified in the Prerequisites, it will support both ACS and ECI computing power.
When you schedule a Pod to a virtual node, Elastic Container Instance (ECI) computing power is used by default unless you specify ACS.
To use ACS CPU computing power in ACK, perform the following steps:
- Schedule Pods to the virtual node by using nodeSelector, affinity, or ResourcePolicy, or by adding the alibabacloud.com/acs: "true" label to the Pods. For more information, see Node affinity.
  Note: Scheduling by using the alibabacloud.com/acs: "true" label is not supported in ACK Serverless clusters. It is currently supported in ACK Pro clusters, ACK dedicated clusters, ACK One registered clusters, and ACK Edge clusters.
- Specify the instance type for the ACS Pod by using the label alibabacloud.com/compute-class: <compute-type>. For more information about ACS instance types, see ACS Pod instances.
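Of the scheduling methods listed above, node affinity is the one this topic does not illustrate elsewhere. The following sketch prints a Pod manifest that uses standard Kubernetes `nodeAffinity` to require the `type: virtual-kubelet` node label and tolerates the virtual node taint. The function name `acs_affinity_pod_manifest` and the Pod name `nginx-affinity-demo` are hypothetical. The ACS labels, image, and toleration are copied from the nodeSelector example in this topic, so treat the result as an untested assumption rather than a verified configuration.

```shell
#!/bin/sh
# acs_affinity_pod_manifest
# Prints a Pod manifest that selects virtual nodes through node affinity.
acs_affinity_pod_manifest() {
    cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx-affinity-demo   # Hypothetical name.
  labels:
    alibabacloud.com/compute-class: general-purpose
    alibabacloud.com/compute-qos: default
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: type
            operator: In
            values:
            - virtual-kubelet
  tolerations:
  - key: virtual-kubelet.io/provider
    operator: Exists
    effect: NoSchedule
  containers:
  - name: nginx
    image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
    resources:
      limits:
        cpu: 2
      requests:
        cpu: 2
EOF
}
```

To try it, pipe the output to kubectl: `acs_affinity_pod_manifest | kubectl apply -f -`.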
The following steps provide a detailed example:
- Deploy a Deployment.
  Important: If you schedule Pods by adding the alibabacloud.com/acs: "true" label, StorageClasses that use the WaitForFirstConsumer volume binding mode are not supported. Therefore, when you use ACS computing power in an ACK cluster and an ACS Pod needs to mount a cloud disk, schedule the Pod to a virtual node by using nodeSelector or ResourcePolicy. For more information about how to configure ResourcePolicy, see ACK Pro clusters support hybrid scheduling of ECS and ACS computing power.
NodeSelector
- Run the following command to view the labels of the virtual node. Replace virtual-kubelet-cn-hangzhou-k with your virtual node's name.
    kubectl get node virtual-kubelet-cn-hangzhou-k -o yaml
  The following output is a snippet of the labels section:
    apiVersion: v1
    kind: Node
    metadata:
      labels:
        kubernetes.io/arch: amd64
        kubernetes.io/hostname: virtual-kubelet-cn-hangzhou-k
        kubernetes.io/os: linux
        kubernetes.io/role: agent
        service.alibabacloud.com/exclude-node: "true"
        topology.diskplugin.csi.alibabacloud.com/zone: cn-hangzhou-k
        topology.kubernetes.io/region: cn-hangzhou
        topology.kubernetes.io/zone: cn-hangzhou-k
        type: virtual-kubelet # Use this label to schedule Pods to virtual nodes.
      name: virtual-kubelet-cn-hangzhou-k
    spec:
      taints:
      - effect: NoSchedule
        key: virtual-kubelet.io/provider
        value: alibabacloud
- Create a file named nginx.yaml with the following content to deploy two Pods.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          name: nginx
          labels:
            app: nginx
            alibabacloud.com/compute-class: general-purpose # Specify the compute class for the ACS Pod. Default: general-purpose.
            alibabacloud.com/compute-qos: default # Specify the QoS class for the ACS Pod. Default: default.
        spec:
          nodeSelector:
            type: virtual-kubelet # Schedule Pods to a virtual node.
          tolerations:
          - key: "virtual-kubelet.io/provider" # Tolerate the taint on the virtual node.
            operator: "Exists"
            effect: "NoSchedule"
          containers:
          - name: nginx
            image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
            resources:
              limits:
                cpu: 2
              requests:
                cpu: 2
- Create the NGINX application and check the deployment result.
  - Run the following command to create the NGINX application.
      kubectl apply -f nginx.yaml
  - Run the following command to check the deployment result.
      kubectl get pods -o wide
    Expected output:
      NAME                    READY   STATUS    RESTARTS   AGE   IP          NODE                            NOMINATED NODE   READINESS GATES
      nginx-9cdf7bbf9-s****   1/1     Running   0          36s   10.0.6.68   virtual-kubelet-cn-hangzhou-j   <none>           <none>
      nginx-9cdf7bbf9-v****   1/1     Running   0          36s   10.0.6.67   virtual-kubelet-cn-hangzhou-k   <none>           <none>
    The output shows that the nodeSelector scheduled the two Pods to nodes that have the type=virtual-kubelet label.
Pod label scheduling
- Create a file named nginx.yaml with the following content.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
            alibabacloud.com/acs: "true" # Configure the Pod to use ACS computing power.
            alibabacloud.com/compute-class: general-purpose # Specify the compute class for the ACS Pod. Default: general-purpose.
            alibabacloud.com/compute-qos: default # Specify the QoS class for the ACS Pod. Default: default.
        spec:
          containers:
          - name: nginx
            image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
            resources:
              limits:
                cpu: 2
              requests:
                cpu: 2
- Create the NGINX application and check the deployment result.
  - Run the following command to create the NGINX application.
      kubectl apply -f nginx.yaml
  - Run the following command to check the deployment result.
      kubectl get pods -o wide
    Expected output:
      NAME                    READY   STATUS    RESTARTS   AGE   IP          NODE                            NOMINATED NODE   READINESS GATES
      nginx-9cdf7bbf9-s****   1/1     Running   0          36s   10.0.6.68   virtual-kubelet-cn-hangzhou-j   <none>           <none>
      nginx-9cdf7bbf9-v****   1/1     Running   0          36s   10.0.6.67   virtual-kubelet-cn-hangzhou-k   <none>           <none>
    The output shows that the Pods are scheduled to virtual nodes, as specified by the alibabacloud.com/acs: "true" label.
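Instead of reading the NODE column by eye, the placement check can be scripted. The sketch below assumes `kubectl` is configured for the cluster. The helper name `pods_off_virtual_nodes` is hypothetical. It uses the `custom-columns` output format so that each line contains only the Pod name and its node name, and it treats any node whose name does not start with `virtual-kubelet-` as a regular node.

```shell
#!/bin/sh
# pods_off_virtual_nodes SELECTOR
# Prints the names of Pods matching the label SELECTOR that are not placed on
# a node whose name starts with "virtual-kubelet-". Pods that are not yet
# scheduled (node shown as <none>) are printed as well.
pods_off_virtual_nodes() {
    kubectl get pods -l "$1" --no-headers \
        -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName |
        awk '$2 !~ /^virtual-kubelet-/ { print $1 }'
}
```

For the Deployment in this example, `pods_off_virtual_nodes app=nginx` prints nothing when both replicas run on virtual nodes.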
- Check the details of an NGINX Pod to confirm that it is an ACS Pod instance.
  - Run the following command to view the details of an NGINX Pod.
      kubectl describe pod nginx-9cdf7bbf9-s****
    Expected output (key information):
      Annotations:  ProviderCreate: done
                    alibabacloud.com/client-token: edf29202-54ac-438e-9626-a1ca007xxxxx
                    alibabacloud.com/instance-id: acs-2ze008giupcyaqbxxxxx
                    alibabacloud.com/pod-ephemeral-storage: 30Gi
                    alibabacloud.com/pod-use-spec: 2-4Gi
                    alibabacloud.com/request-id: A0EF3BF3-37E7-5A07-AC2D-68A0CFCxxxxx
                    alibabacloud.com/schedule-result: finished
                    alibabacloud.com/user-id: 14889995898xxxxx
                    kubernetes.io/pod-stream-port: 10250
                    kubernetes.io/preferred-scheduling-node: virtual-kubelet-cn-hangzhou-j/1
                    kubernetes.io/resource-type: serverless
    The alibabacloud.com/instance-id: acs-2ze008giupcyaqbxxxxx annotation confirms that the Pod is an ACS Pod instance.
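The same confirmation can be scripted by reading the annotation directly. The following sketch assumes `kubectl` is configured for the cluster. The helper names `acs_instance_id` and `is_acs_pod` are hypothetical. Treating an `acs-` prefix as the marker of an ACS instance is an inference from the sample output above, not a documented guarantee.

```shell
#!/bin/sh
# acs_instance_id POD
# Prints the value of the alibabacloud.com/instance-id annotation of POD.
# Dots inside the annotation key are escaped for kubectl's JSONPath syntax.
acs_instance_id() {
    kubectl get pod "$1" \
        -o jsonpath='{.metadata.annotations.alibabacloud\.com/instance-id}'
}

# is_acs_pod POD
# Succeeds when the instance ID of POD starts with "acs-".
is_acs_pod() {
    case "$(acs_instance_id "$1")" in
        acs-*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, `is_acs_pod <pod-name> && echo "ACS Pod"` prints the message only when the annotation is present and carries the expected prefix.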
Example: Use ACS GPU computing power
The process for using ACS GPU computing power is similar to that for ACS CPU computing power, but it requires specific component versions and some additional configuration.
Component configuration
For ACK Pro clusters of different Kubernetes versions, the kube-scheduler component must meet the following version requirements.
Kubernetes version | kube-scheduler version
1.26 or later |
Usage
...
labels:
# Declare the ACS GPU resource requirement in the labels.
alibabacloud.com/compute-class: gpu # For GPU types, use the fixed value 'gpu'.
alibabacloud.com/compute-qos: default # The QoS class. This has the same meaning as for regular ACS computing power.
alibabacloud.com/gpu-model-series: example-model # The GPU model series. Replace with your actual model, such as T4.
...
- For more information about ACS compute classes and quality of service (QoS) classes, see Relationship between compute classes and QoS classes.
- For available GPU models for gpu-model-series, see Specify GPU models and driver versions for ACS GPU-accelerated Pods.
- Scheduling by using the alibabacloud.com/acs: "true" label is not supported in ACK Serverless clusters. It is currently supported in ACK Pro clusters, ACK dedicated clusters, ACK One registered clusters, and ACK Edge clusters.
The following examples show three different ways to configure GPU computing power.
NodeSelector
Use the following YAML to create a GPU workload.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-node-selector-demo
  labels:
    app: node-selector-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-selector-demo
  template:
    metadata:
      labels:
        app: node-selector-demo
        # ACS attributes
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model # The GPU model series. Replace with your actual model, such as T4.
    spec:
      # Specify the label for the virtual node.
      nodeSelector:
        type: virtual-kubelet
      # Tolerate the virtual node's taint.
      tolerations:
      - key: "virtual-kubelet.io/provider" # Tolerate the taint on the virtual node.
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: node-selector-demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "1000h"
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
ResourcePolicy
Use the following YAML to create a GPU workload.
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: dep-rp-demo
  namespace: default
spec:
  selector:
    app: dep-rp-demo
  units:
  - resource: acs
    podLabels:
      alibabacloud.com/compute-class: gpu
      alibabacloud.com/compute-qos: default
      alibabacloud.com/gpu-model-series: example-model # The GPU model series. Replace with your actual model, such as T4.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-rp-demo
  labels:
    app: dep-rp-demo
  annotations:
    resourcePolicy: "dep-rp-demo" # Reference the name of the ResourcePolicy.
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dep-rp-demo
  template:
    metadata:
      labels:
        app: dep-rp-demo
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model # The GPU model series. Replace with your actual model, such as T4.
    spec:
      containers:
      - name: demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "1000h"
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
For more information about using ResourcePolicy for resource scheduling, see Custom resource priority scheduling.
Pod label scheduling
Use the following YAML to create a GPU workload.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-node-selector-demo
  labels:
    app: node-selector-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-selector-demo
  template:
    metadata:
      labels:
        app: node-selector-demo
        # ACS attributes
        alibabacloud.com/acs: "true" # Configure the Pod to use ACS computing power.
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model # The GPU model series. Replace with your actual model, such as T4.
    spec:
      containers:
      - name: node-selector-demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "1000h"
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
Run the following command to check the running status of the GPU workload.
kubectl get pod dep-node-selector-demo-9cdf7bbf9-s**** -o yaml
Expected output (key information):
phase: Running

resources:
  limits:
    # other resources
    nvidia.com/gpu: "1"
  requests:
    # other resources
    nvidia.com/gpu: "1"
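The two fields highlighted in the expected output can also be read individually with JSONPath instead of scanning the full YAML. This is a minimal sketch that assumes `kubectl` is configured for the cluster and that the GPU is requested by the first container, as in the examples above. The helper names `pod_phase` and `pod_gpu_limit` are hypothetical.

```shell
#!/bin/sh
# pod_phase POD
# Prints the phase of POD, for example Running or Pending.
pod_phase() {
    kubectl get pod "$1" -o jsonpath='{.status.phase}'
}

# pod_gpu_limit POD
# Prints the nvidia.com/gpu limit of the first container in POD.
# The dot inside the resource name is escaped for kubectl's JSONPath syntax.
pod_gpu_limit() {
    kubectl get pod "$1" \
        -o jsonpath='{.spec.containers[0].resources.limits.nvidia\.com/gpu}'
}
```

A Pod created from the manifests in this section is expected to report `Running` and `1` once it has started.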
Example: Use ACS GPU HPN computing power
The process for using ACS GPU HPN computing power is similar to that for ACS CPU computing power, but with the following requirements:
- This feature is supported only in ACK Pro clusters, ACK One registered clusters, and ACK One distributed workflow Argo clusters.
- You must purchase GPU-HPN capacity reservations in advance and associate them with your cluster.
- The kube-scheduler version must meet the following requirements:
  Kubernetes version | kube-scheduler version
  1.28 | v1.28.12-aliyun-6.9.3.cd73f3fe or later
  1.30 | v1.30.3-aliyun.6.9.3.ce7e2faf or later
  1.31 | v1.31.0-aliyun.6.9.3.051bb0e8 or later
  1.32 | v1.32.0-aliyun.6.9.3.515ac311 or later
  1.33 | v1.33.0-aliyun.6.9.4.8b58e6b4 or later
- The ACK Virtual Node component must be v2.15.0 or later.
Usage
...
labels:
# Declare the ACS GPU resource requirement in the labels.
alibabacloud.com/compute-class: gpu-hpn # Must be set to gpu-hpn.
alibabacloud.com/compute-qos: default # The QoS class. This has the same meaning as for regular ACS computing power.
...
- For more information about ACS compute classes and QoS classes, see Relationship between compute classes and QoS classes.
- For information about other parameters for ACS Pods, see Configure an ACS Pod.
- Only Pods of the gpu-hpn compute class can be scheduled to an ACS GPU HPN node. You do not need to specify GPU resource requirements in the Pod resource declaration for these Pods. Pods of other compute classes, and Pods for which no compute class is declared, cannot be scheduled to the node.
You can use a Kubernetes nodeSelector to schedule Pods to GPU HPN nodes.
Important: When you configure an ACS GPU HPN Pod, note the following fields:
- Specify the compute class: alibabacloud.com/compute-class: gpu-hpn.
- Specify the reserved node label: alibabacloud.com/node-type: reserved.
- In the requests and limits fields of the resource specification, use the device resource name that matches your actual accelerator type, such as nvidia.com/gpu for NVIDIA devices.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dep-node-selector-demo
  labels:
    app: node-selector-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-selector-demo
  template:
    metadata:
      labels:
        app: node-selector-demo
        # ACS attributes
        alibabacloud.com/compute-class: gpu-hpn
        alibabacloud.com/compute-qos: default
    spec:
      # Specify the label for GPU HPN reserved nodes.
      nodeSelector:
        alibabacloud.com/node-type: reserved
      containers:
      - name: node-selector-demo
        image: registry-cn-hangzhou.ack.aliyuncs.com/acs/stress:v1.0.4
        command:
        - "sleep"
        - "1000h"
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1" # Use the resource name that matches your actual GPU model.
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1" # Use the resource name that matches your actual GPU model.
Check the running status of the GPU workload.
kubectl get pod dep-node-selector-demo-9cdf7bbf9-s**** -o yaml
Expected output (key information):
phase: Running

resources:
  limits:
    # other resources
    nvidia.com/gpu: "1"
  requests:
    # other resources
    nvidia.com/gpu: "1"
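If a gpu-hpn Pod stays in the Pending state, one thing to check is whether any node carries the reserved-node label that the nodeSelector targets. The sketch below assumes `kubectl` is configured for the cluster and that reserved GPU HPN nodes carry the `alibabacloud.com/node-type=reserved` label used in the manifest above. The helper name `count_reserved_nodes` is hypothetical.

```shell
#!/bin/sh
# count_reserved_nodes
# Prints how many nodes carry the alibabacloud.com/node-type=reserved label.
count_reserved_nodes() {
    # `-o name` prints one "node/<name>" entry per line; grep -c counts them.
    kubectl get nodes -l alibabacloud.com/node-type=reserved -o name |
        grep -c '^node/'
}
```

A result of `0` suggests that no capacity reservation is associated with the cluster yet, or that the reserved nodes have not registered.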