Use shared GPU scheduling with ack-co-scheduler

Prerequisites

Before you begin, ensure that you have:

- A registered cluster with your self-managed Kubernetes cluster connected. See Create a registered cluster and connect a local data center cluster.
- kubectl access to the registered cluster. See Obtain the cluster kubeconfig and connect to the cluster using kubectl.
- Nodes that meet the following version requirements:

| Component | Version requirement |
| --- | --- |
| Kubernetes | 1.22 or later |
| Operating system | CentOS 7.6 (end of maintenance), CentOS 7.7 (end of maintenance), Ubuntu 16.04, Ubuntu 18.04, Alibaba Cloud Linux 2 (end of maintenance), Alibaba Cloud Linux 3 |
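
As a quick way to check the Kubernetes requirement above, the following sketch compares a kubelet version string against the 1.22 minimum. The function name `k8s_version_ok` is hypothetical (it is not part of kubectl or ACK), and the parsing assumes version strings of the form `v1.24.6` or `v1.24.6-aliyun.1`.

```shell
# Succeeds (exit status 0) when a version such as "v1.24.6" is 1.22 or later.
# k8s_version_ok is a hypothetical helper, not part of kubectl or ACK.
k8s_version_ok() {
  v="${1#v}"                 # drop a leading "v" if present
  major="${v%%.*}"           # text before the first dot
  rest="${v#*.}"             # text after the first dot
  minor="${rest%%.*}"        # text before the next dot
  minor="${minor%%[!0-9]*}"  # strip any non-numeric suffix such as "+"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 22 ]; }
}
```

For example, `kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.kubeletVersion}'` prints the kubelet version of each node, and each value can be passed to the function.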

Billing

Enable Cloud-native AI Suite before using shared GPU scheduling. For pricing details, see Billing information for the Cloud-native AI Suite.

Limitations

- Do not set the CpuPolicy of nodes that use shared GPU scheduling to static.
- The DaemonSet pods for shared GPU scheduling run at a lower priority and may be preempted by higher-priority pods. To prevent this, add priorityClassName: system-node-critical to the pod template (spec.template.spec) of the DaemonSet, for example gpushare-device-plugin-ds.
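
One possible way to apply the priority class is a merge patch on the DaemonSet, sketched below. The wrapper name `set_node_critical` is hypothetical, and the `kube-system` namespace in the example call is an assumption; check where the DaemonSet actually runs in your cluster before using it.

```shell
# Sets priorityClassName on the pod template of a DaemonSet.
# $1: DaemonSet name, $2: namespace.
# set_node_critical is a hypothetical wrapper around `kubectl patch`.
set_node_critical() {
  kubectl patch daemonset "$1" -n "$2" --type merge \
    -p '{"spec":{"template":{"spec":{"priorityClassName":"system-node-critical"}}}}'
}

# Example call (the kube-system namespace is an assumption):
#   set_node_critical gpushare-device-plugin-ds kube-system
```

Because the patch changes the pod template, the DaemonSet controller replaces the existing pods with ones that carry the new priority class.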

How it works

Two components work together to enable shared GPU scheduling:

- ack-ai-installer: provides GPU memory isolation and GPU topology-aware scheduling on each node.
- ack-co-scheduler: lets you define ResourcePolicy custom resources and enables multi-level elastic scheduling across the cluster.

Pods request GPU memory in GiB using the aliyun.com/gpu-mem resource limit. The scheduler places pods on GPU nodes, distributing GPU memory across workloads that share the same physical GPU.
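
As a rough capacity estimate, the number of pods with identical requests that can share one GPU is the GPU memory divided by the per-pod request. The sketch below is only back-of-the-envelope arithmetic: the helper name `max_pods_per_gpu` is hypothetical, and it assumes no GPU memory is reserved by the driver or runtime.

```shell
# Upper bound on how many pods with the same request fit on one GPU.
# Assumes no memory is reserved by the driver or runtime.
# $1: GPU memory in GiB, $2: per-pod aliyun.com/gpu-mem request in GiB.
max_pods_per_gpu() {
  echo $(( $1 / $2 ))
}

max_pods_per_gpu 15 3   # prints 5
```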

Node labels control which sharing mode applies:

| Label value | Mode | Isolation |
| --- | --- | --- |
| `ack.node.gpu.schedule=share` | Shared GPU scheduling (on-premises nodes) | No memory isolation between pods |
| `ack.node.gpu.schedule=cgpu` | Shared GPU scheduling with memory isolation (cloud nodes) | GPU memory is isolated between pods |

Step 1: Install components

Install both components in the registered cluster.

Install ack-ai-installer

1. Log on to the Container Service Management Console. In the left navigation pane, click Clusters.
2. Click the cluster name, then click Applications > Helm in the left navigation pane.
3. On the Helm page, click Deploy, then search for and install ack-ai-installer.

Install ack-co-scheduler

1. On the Clusters page, click the cluster name, then click Add-ons in the left navigation pane.
2. On the Add-ons page, search for ack-co-scheduler and click Install in the lower-right corner of its card.

Step 2: Install the GPU resource query tool

Download kubectl-inspect-cgpu and place it in your PATH.

Linux:

```
wget http://aliacs-k8s-cn-beijing.oss-cn-beijing.aliyuncs.com/gpushare/kubectl-inspect-cgpu-linux -O /usr/local/bin/kubectl-inspect-cgpu
```

macOS:

```
wget http://aliacs-k8s-cn-beijing.oss-cn-beijing.aliyuncs.com/gpushare/kubectl-inspect-cgpu-darwin -O /usr/local/bin/kubectl-inspect-cgpu
```

Grant execute permissions:

```
chmod +x /usr/local/bin/kubectl-inspect-cgpu
```
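
If the same setup script has to run on both Linux and macOS, the choice between the two download URLs above could be made from the kernel name. This is a minimal sketch: `inspect_cgpu_url` is a hypothetical helper, and only the two URLs listed above are used.

```shell
# Prints the kubectl-inspect-cgpu download URL for an operating system.
# $1: kernel name as reported by `uname -s`; defaults to this machine's.
# inspect_cgpu_url is a hypothetical helper.
inspect_cgpu_url() {
  base="http://aliacs-k8s-cn-beijing.oss-cn-beijing.aliyuncs.com/gpushare"
  os="${1:-$(uname -s)}"
  case "$os" in
    Linux)  echo "$base/kubectl-inspect-cgpu-linux" ;;
    Darwin) echo "$base/kubectl-inspect-cgpu-darwin" ;;
    *)      echo "unsupported OS: $os" >&2; return 1 ;;
  esac
}
```

A script could then run `wget "$(inspect_cgpu_url)" -O /usr/local/bin/kubectl-inspect-cgpu` followed by the chmod command.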

Step 3: Create GPU nodes

Create Elastic GPU Service instances and install GPU drivers and nvidia-container-runtime. See Create and manage node pools.

If you already have GPU nodes with the environment configured, skip this step. If you need a script to install GPU drivers, see Manually upgrade GPU node drivers.

Label GPU nodes to enable shared GPU scheduling:

| Node type | Label to add | Effect |
| --- | --- | --- |
| On-premises nodes | `ack.node.gpu.schedule=share` | Enables shared GPU scheduling without memory isolation |
| Cloud nodes | `ack.node.gpu.schedule=cgpu` | Enables shared GPU scheduling with memory isolation |

For on-premises nodes, add the label manually using kubectl. For cloud nodes, use the node pool label feature. See Enable scheduling.
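
For on-premises nodes, the manual labeling step might look like the sketch below. The function names `label_share_node` and `list_share_nodes` are hypothetical wrappers around standard `kubectl label` and `kubectl get` calls; the node name comes from `kubectl get nodes`.

```shell
# Adds the shared GPU scheduling label to an on-premises node.
# $1: node name as shown by `kubectl get nodes`.
# label_share_node is a hypothetical wrapper.
label_share_node() {
  kubectl label nodes "$1" ack.node.gpu.schedule=share
}

# Lists the nodes that currently carry the label.
list_share_nodes() {
  kubectl get nodes -l ack.node.gpu.schedule=share
}
```

Running `list_share_nodes` after labeling is a simple way to confirm that the label was applied to the intended nodes.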

Step 4: Deploy a workload with shared GPU scheduling

Check current GPU usage

Run the following command to view GPU usage in the cluster:

```
kubectl inspect cgpu
```

Expected output:

```
NAME                           IPADDRESS       GPU0(Allocated/Total)  GPU Memory(GiB)
cn-zhangjiakou.192.168.66.139  192.168.66.139  0/15                   0/15
---------------------------------------------------------------------------
Allocated/Total GPU Memory In Cluster:
0/15 (0%)
```
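
To use this output in a script, the cluster summary can be parsed with awk. The sketch below prints the unallocated GPU memory in GiB; `cluster_free_gib` is a hypothetical helper, and it assumes the output keeps the layout of the sample, with the totals on the line after the summary header.

```shell
# Reads `kubectl inspect cgpu` output on stdin and prints the unallocated
# GPU memory in GiB, taken from the line after the cluster summary header.
# Assumes the output layout matches the sample shown in this document.
cluster_free_gib() {
  awk '
    found { split($1, a, "/"); print a[2] - a[1]; exit }
    /^Allocated\/Total GPU Memory In Cluster:/ { found = 1 }
  '
}
```

A typical call would be `kubectl inspect cgpu | cluster_free_gib`.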

Deploy a GPU workload

Create a file named GPUtest.yaml with the following content:

```
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-share-sample
spec:
  parallelism: 1
  template:
    metadata:
      labels:
        app: gpu-share-sample
    spec:
      schedulerName: ack-co-scheduler
      containers:
      - name: gpu-share-sample
        image: registry.cn-hangzhou.aliyuncs.com/ai-samples/gpushare-sample:tensorflow-1.5
        command:
        - python
        - tensorflow-sample-code/tfjob/docker/mnist/main.py
        - --max_steps=100000
        - --data_dir=tensorflow-sample-code/data
        resources:
          limits:
            # Unit: GiB. This pod requests 3 GiB of GPU memory.
            aliyun.com/gpu-mem: 3
        workingDir: /root
      restartPolicy: Never
```

Key fields in the manifest:

| Field | Required | Description |
| --- | --- | --- |
| `schedulerName: ack-co-scheduler` | Yes | Directs the pod to the co-scheduler for GPU-aware placement |
| `aliyun.com/gpu-mem` | Yes | GPU memory request in GiB. Set this to the amount of GPU memory the pod needs |

Apply the manifest:
```
kubectl apply -f GPUtest.yaml
```
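
Before checking GPU allocation, it can help to confirm that the Job's pod was created and scheduled. The sketch below uses the `app: gpu-share-sample` label and the Job name from the manifest; the function names `sample_pods` and `sample_logs` are hypothetical wrappers around standard kubectl commands.

```shell
# Shows the pods created by the sample Job, selected by the label
# set in the pod template of the manifest, including the node of each pod.
sample_pods() {
  kubectl get pods -l app=gpu-share-sample -o wide
}

# Prints the logs of a pod that belongs to the sample Job.
sample_logs() {
  kubectl logs job/gpu-share-sample
}
```

A pod that stays in Pending is worth inspecting with `kubectl describe pod` before moving on to the GPU memory check.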
Verify GPU memory allocation:
```
kubectl inspect cgpu
```
Expected output:
```
NAME                           IPADDRESS       GPU0(Allocated/Total)  GPU Memory(GiB)
cn-zhangjiakou.192.168.66.139  192.168.66.139  3/15                   3/15
---------------------------------------------------------------------------
Allocated/Total GPU Memory In Cluster:
3/15 (20%)
```
Check the following fields to confirm scheduling succeeded:
- GPU0(Allocated/Total): Shows GPU memory in use on each GPU. 3/15 means 3 GiB of 15 GiB is allocated.
- Allocated/Total GPU Memory In Cluster: Cluster-wide summary. 3/15 (20%) confirms the pod is scheduled and consuming GPU memory.
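
The percentage in the cluster summary can be cross-checked with integer arithmetic. In this sketch, `alloc_percent` is a hypothetical helper, and truncating to a whole percent is an assumption about how the tool formats the value.

```shell
# Prints allocated GPU memory as a whole-number percentage of the total.
# $1: allocated GiB, $2: total GiB. Integer division truncates the result.
alloc_percent() {
  echo $(( $1 * 100 / $2 ))
}

alloc_percent 3 15   # prints 20
```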

What's next

For an overview of shared GPU scheduling modes and architecture, see Shared GPU scheduling.