Shared GPU scheduling lets multiple pods share a single GPU, reducing GPU resource waste and improving utilization in registered clusters.
Prerequisites
Before you begin, ensure that you have:
-
A registered cluster with your self-managed Kubernetes cluster connected. See Create a registered cluster and connect a local data center cluster.
-
kubectl access to the registered cluster. See Obtain the cluster kubeconfig and connect to the cluster using kubectl.
-
Nodes that meet the following version requirements:
| Component | Version requirement |
|---|---|
| Kubernetes | 1.22 or later |
| Operating system | CentOS 7.6 (end of maintenance), CentOS 7.7 (end of maintenance), Ubuntu 16.04, Ubuntu 18.04, Alibaba Cloud Linux 2 (end of maintenance), Alibaba Cloud Linux 3 |
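The Kubernetes requirement can be checked from a shell before you continue. The following is a minimal sketch, not official tooling: `version_ok` is a hypothetical helper, and it assumes version strings of the form `v1.22.3`.

```shell
# Hypothetical helper: succeeds if a version string such as "v1.22.3"
# is Kubernetes 1.22 or later.
version_ok() {
  v="${1#v}"                  # strip the leading "v"
  major="${v%%.*}"            # text before the first dot
  rest="${v#*.}"
  minor="${rest%%.*}"         # text between the first and second dot
  minor="${minor%%[!0-9]*}"   # drop non-numeric suffixes such as "+"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 22 ]; }
}

# Usage (needs cluster access): check the kubelet version of every node.
#   for v in $(kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.kubeletVersion}'); do
#     version_ok "$v" || echo "node version too old: $v"
#   done
```

The operating system requirement is not covered by this sketch and still needs to be checked on each node.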
Billing
Enable Cloud-native AI Suite before using shared GPU scheduling. For pricing details, see Billing information for the Cloud-native AI Suite.
Limitations
-
Do not set the `CpuPolicy` of nodes that use shared GPU scheduling to `static`.
-
DaemonSet pods for shared GPU scheduling run at lower priority and may be preempted by higher-priority pods. To prevent eviction, add `priorityClassName: system-node-critical` to the DaemonSet spec, for example in `gpushare-device-plugin-ds`.
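One way to add the priority class without editing the DaemonSet by hand is a merge patch. The sketch below is illustrative only: `priority_patch` is a hypothetical helper, and the `kube-system` namespace in the usage comment is an assumption, so confirm where the DaemonSet runs in your cluster first.

```shell
# Hypothetical helper: prints a merge patch that sets priorityClassName
# on a DaemonSet's pod template.
priority_patch() {
  printf '{"spec":{"template":{"spec":{"priorityClassName":"%s"}}}}\n' "$1"
}

# Usage (needs cluster access; the kube-system namespace is an assumption):
#   kubectl -n kube-system patch daemonset gpushare-device-plugin-ds \
#     --type merge -p "$(priority_patch system-node-critical)"
```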
How it works
Two components work together to enable shared GPU scheduling:
-
ack-ai-installer: provides GPU memory isolation and GPU topology-aware scheduling on each node.
-
ack-co-scheduler: lets you define `ResourcePolicy` custom resources and enables multi-level elastic scheduling across the cluster.
Pods request GPU memory in GiB using the aliyun.com/gpu-mem resource limit. The scheduler places pods on GPU nodes, distributing GPU memory across workloads that share the same physical GPU.
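As a rough capacity estimate, the number of identical pods that can share one GPU is the GPU's total memory divided by the per-pod request, rounded down. The sketch below assumes whole-GiB values and that all GPU memory is available to pods; `max_pods_per_gpu` is a hypothetical helper, not part of either component.

```shell
# Hypothetical helper: how many pods with the same request fit on one GPU.
# Args: total GPU memory (GiB), per-pod aliyun.com/gpu-mem request (GiB).
max_pods_per_gpu() {
  echo $(( $1 / $2 ))   # shell integer division rounds down
}

max_pods_per_gpu 15 3   # prints 5
```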
Node labels control which sharing mode applies:
| Label | Mode | Isolation |
|---|---|---|
| ack.node.gpu.schedule=share | Shared GPU scheduling (on-premises nodes) | No memory isolation between pods |
| ack.node.gpu.schedule=cgpu | Shared GPU scheduling with memory isolation (cloud nodes) | GPU memory is isolated between pods |
Step 1: Install components
Install both components in the registered cluster.
Install ack-ai-installer
-
Log on to the Container Service Management Console. In the left navigation pane, click Clusters.
-
Click the cluster name, then click Applications > Helm in the left navigation pane.
-
On the Helm page, click Deploy, then search for and install ack-ai-installer.
Install ack-co-scheduler
-
On the Clusters page, click the cluster name, then click Add-ons in the left navigation pane.
-
On the Add-ons page, search for ack-co-scheduler and click Install in the lower-right corner of its card.
Step 2: Install the GPU resource query tool
Download kubectl-inspect-cgpu and place it in your PATH.
Linux:
wget http://aliacs-k8s-cn-beijing.oss-cn-beijing.aliyuncs.com/gpushare/kubectl-inspect-cgpu-linux -O /usr/local/bin/kubectl-inspect-cgpu
macOS:
wget http://aliacs-k8s-cn-beijing.oss-cn-beijing.aliyuncs.com/gpushare/kubectl-inspect-cgpu-darwin -O /usr/local/bin/kubectl-inspect-cgpu
Grant execute permissions:
chmod +x /usr/local/bin/kubectl-inspect-cgpu
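kubectl resolves `kubectl inspect cgpu` to an executable named `kubectl-inspect-cgpu` on your PATH, so a quick check that the file is discoverable can save debugging later. A minimal sketch (`plugin_installed` is a hypothetical helper name):

```shell
# Succeeds if an executable named kubectl-inspect-cgpu is on PATH,
# which is how kubectl discovers the "inspect cgpu" plugin.
plugin_installed() {
  command -v kubectl-inspect-cgpu >/dev/null 2>&1
}

if plugin_installed; then
  echo "kubectl-inspect-cgpu found"
else
  echo "kubectl-inspect-cgpu not found on PATH"
fi
```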
Step 3: Create GPU nodes
Create Elastic GPU Service instances and install GPU drivers and nvidia-container-runtime. See Create and manage node pools.
If you already have GPU nodes with the environment configured, skip this step. If you need a script to install GPU drivers, see Manually upgrade GPU node drivers.
Label GPU nodes to enable shared GPU scheduling:
| Node type | Label to add | Effect |
|---|---|---|
| On-premises nodes | ack.node.gpu.schedule=share | Enables shared GPU scheduling without memory isolation |
| Cloud nodes | ack.node.gpu.schedule=cgpu | Enables shared GPU scheduling with memory isolation |
For on-premises nodes, add the label manually using kubectl. For cloud nodes, use the node pool label feature. See Enable scheduling.
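For on-premises nodes, the manual labeling step might look like the following sketch. `enable_gpu_share` is a hypothetical wrapper around the standard `kubectl label` command, and `<node-name>` is a placeholder for a name from `kubectl get nodes`.

```shell
# Hypothetical wrapper: label one on-premises node for shared GPU
# scheduling without memory isolation.
enable_gpu_share() {
  kubectl label node "$1" ack.node.gpu.schedule=share --overwrite
}

# Usage (needs cluster access):
#   enable_gpu_share <node-name>
# List the nodes that carry the label:
#   kubectl get nodes -l ack.node.gpu.schedule=share
```

The `--overwrite` flag lets the command succeed when the node already has a different value for the label.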
Step 4: Deploy a workload with shared GPU scheduling
Check current GPU usage
Run the following command to view GPU usage in the cluster:
kubectl inspect cgpu
Expected output:
NAME IPADDRESS GPU0(Allocated/Total) GPU Memory(GiB)
cn-zhangjiakou.192.168.66.139 192.168.66.139 0/15 0/15
---------------------------------------------------------------------------
Allocated/Total GPU Memory In Cluster:
0/15 (0%)
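If you want to use the cluster-wide figures in a script, the summary can be extracted from the command output. This is a sketch that assumes the output layout shown above, with the figures on the line after the summary heading; `cluster_gpu_mem` is a hypothetical helper.

```shell
# Hypothetical helper: reads "kubectl inspect cgpu" output on stdin and
# prints "<allocated> <total>" from the cluster-wide summary.
cluster_gpu_mem() {
  awk '/Allocated\/Total GPU Memory In Cluster:/ {
         getline              # the figures are on the following line
         split($1, m, "/")    # "0/15" -> m[1]="0", m[2]="15"
         print m[1], m[2]
       }'
}

# Usage (needs cluster access):
#   kubectl inspect cgpu | cluster_gpu_mem
```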
Deploy a GPU workload
-
Create a file named `GPUtest.yaml` with the following content:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: gpu-share-sample
    spec:
      parallelism: 1
      template:
        metadata:
          labels:
            app: gpu-share-sample
        spec:
          schedulerName: ack-co-scheduler
          containers:
          - name: gpu-share-sample
            image: registry.cn-hangzhou.aliyuncs.com/ai-samples/gpushare-sample:tensorflow-1.5
            command:
            - python
            - tensorflow-sample-code/tfjob/docker/mnist/main.py
            - --max_steps=100000
            - --data_dir=tensorflow-sample-code/data
            resources:
              limits:
                # Unit: GiB. This pod requests 3 GiB of GPU memory.
                aliyun.com/gpu-mem: 3
            workingDir: /root
          restartPolicy: Never

Key fields in the manifest:

| Field | Required | Description |
|---|---|---|
| schedulerName: ack-co-scheduler | Yes | Directs the pod to the co-scheduler for GPU-aware placement |
| aliyun.com/gpu-mem | Yes | GPU memory request in GiB. Set this to the amount of GPU memory the pod needs |
-
Apply the manifest:
kubectl apply -f GPUtest.yaml
-
Verify GPU memory allocation:
kubectl inspect cgpu
Expected output:
NAME IPADDRESS GPU0(Allocated/Total) GPU Memory(GiB)
cn-zhangjiakou.192.168.66.139 192.168.66.139 3/15 3/15
---------------------------------------------------------------------------
Allocated/Total GPU Memory In Cluster:
3/15 (20%)
Check the following fields to confirm scheduling succeeded:
GPU0(Allocated/Total): Shows GPU memory in use on each GPU. `3/15` means 3 GiB of 15 GiB is allocated.
Allocated/Total GPU Memory In Cluster: Cluster-wide summary. `3/15 (20%)` confirms the pod is scheduled and consuming GPU memory.
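If the allocation does not change, it can help to look at the Job's pod directly. The sketch below uses only standard kubectl commands and the `app: gpu-share-sample` label from the manifest; `show_sample_pod` is a hypothetical helper name.

```shell
# Hypothetical helper: show the sample Job's pod, its status, and the
# node it was scheduled to.
show_sample_pod() {
  kubectl get pods -l app=gpu-share-sample -o wide
}

# Usage (needs cluster access):
#   show_sample_pod
# Read the training output of the Job's pod:
#   kubectl logs job/gpu-share-sample
```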
What's next
For an overview of shared GPU scheduling modes and architecture, see Shared GPU scheduling.