This topic describes how to configure and use remote direct memory access (RDMA) on Lingjun nodes in an ACK managed cluster Pro for high-performance container network communication. RDMA technology significantly reduces network latency and increases throughput, making it ideal for demanding scenarios such as high-performance computing (HPC), AI training, and distributed storage.
RDMA
Remote direct memory access (RDMA) is a high-performance network communication technology that addresses the data processing latency of traditional network transmission. RDMA transfers data over the network directly between the memory of two computers, bypassing their operating systems and consuming minimal processing power. This enables high-throughput, low-latency communication, which makes RDMA well suited to large-scale parallel computing clusters.
By reducing the overhead of memory copies and context switching, RDMA frees up memory bandwidth and CPU cycles, which improves application performance.
Prerequisites
In Kubernetes, a Pod can use one of two network modes:
Independent IP mode: Each Pod has a unique IP address (non-hostNetwork mode).
Shared network mode: The Pod uses the host node's network directly (hostNetwork mode).
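To see which of the two modes an existing Pod uses, you can read the hostNetwork field of its spec. The following shell sketch is illustrative only: the helper name network_mode and the Pod name my-pod are hypothetical and do not come from this topic, and the commented usage assumes that kubectl is already configured for the cluster.

```shell
# Map the value of a Pod's spec.hostNetwork field to the mode names used above.
# The field is normally omitted when it is not set to true, so an empty
# value is treated as independent IP mode.
network_mode() {
  if [ "$1" = "true" ]; then
    echo "shared network mode (hostNetwork)"
  else
    echo "independent IP mode (non-hostNetwork)"
  fi
}

# Hypothetical usage against a live cluster (my-pod is a placeholder name):
# network_mode "$(kubectl get pod my-pod -o jsonpath='{.spec.hostNetwork}')"
```

Keeping the kubectl call out of the helper means the mapping can be tried without cluster access.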
To use the RDMA feature for Pods in independent IP mode (non-hostNetwork), you must meet the following prerequisites:
The computing network of the Lingjun bare metal cluster hosting the Lingjun node must use IPv6.
You must select the IPv6 mode when you create the Lingjun bare metal cluster.
Procedure
Install the RDMA Device Plugin add-on.
On the Clusters page, click the name of your cluster. In the left navigation pane, click Components and Add-ons.
On the Add-ons page, click the Others tab. Find the ack-rdma-device-plugin add-on, then follow the prompts to configure and install it.
Parameter: Enable RDMA for non-hostNetwork
Description: Controls whether to enable the RDMA feature for pods that are not in hostNetwork mode. Valid values:
False (cleared): Only pods in hostNetwork mode can use the RDMA network.
True (selected): Allows pods that are not in hostNetwork mode to use the RDMA network. Before you enable this option, confirm that the Lingjun bare metal cluster associated with your ACK cluster uses IPv6. Otherwise, the RDMA configuration will not take effect.
Verify that the RDMA Device Plugin is running on each RDMA-capable Lingjun node.
kubectl get ds ack-rdma-dp-ds -n kube-system
Expected output:
NAME             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
ack-rdma-dp-ds   2         2         2       2            2           <none>          xxh
Verify that the node has the rdma/hca resource.
kubectl get node e01-cn-xxxx -oyaml
Expected output:
...
allocatable:
  cpu: 189280m
  ephemeral-storage: "3401372677838"
  hugepages-1Gi: "0"
  hugepages-2Mi: "0"
  memory: 2063229768Ki
  nvidia.com/gpu: "8"
  pods: "64"
  rdma/hca: 1k
capacity:
  cpu: "192"
  ephemeral-storage: 3690725568Ki
  hugepages-1Gi: "0"
  hugepages-2Mi: "0"
  memory: 2112881480Ki
  nvidia.com/gpu: "8"
  pods: "64"
  rdma/hca: 1k
...
Apply the following YAML manifest to request the rdma/hca resource for a pod.
A request of rdma/hca: 1 is sufficient.
If the Enable RDMA for non-hostNetwork option of the RDMA Device Plugin add-on is not selected, only Pods configured with hostNetwork: true can use the RDMA feature.
apiVersion: batch/v1
kind: Job
metadata:
  name: hps-benchmark
spec:
  parallelism: 1
  template:
    spec:
      containers:
      - name: hps-benchmark
        image: "****"
        command:
        - sh
        - -c
        - |
          python /workspace/wdl_8gpu_outbrain.py
        resources:
          limits:
            nvidia.com/gpu: 8
            rdma/hca: 1
        workingDir: /root
        volumeMounts:
        - name: shm
          mountPath: /dev/shm
        securityContext:
          capabilities:
            add:
            - SYS_RESOURCE
            - IPC_LOCK
      restartPolicy: Never
      volumes:
      - name: shm
        emptyDir:
          medium: Memory
          sizeLimit: 8Gi
      hostNetwork: true
      tolerations:
      - operator: Exists
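The two verification steps in this procedure can also be scripted before you submit the Job. The following is a minimal shell sketch, not an official tool. The function names ds_ready, has_rdma, and check_rdma_node are hypothetical, and the jsonpath expressions are an assumption about how to read the DaemonSet status fields and the rdma/hca entry under allocatable. It assumes that kubectl is configured for the cluster.

```shell
# Succeeds when the DaemonSet has at least one desired pod and all are ready.
# $1: desired pod count, $2: ready pod count.
ds_ready() {
  [ -n "$1" ] && [ "$1" -gt 0 ] && [ "$1" = "$2" ]
}

# Succeeds when the node reports a non-empty, non-zero rdma/hca quantity.
# $1: the allocatable rdma/hca value, for example "1k".
has_rdma() {
  [ -n "$1" ] && [ "$1" != "0" ]
}

# Hypothetical wrapper that queries the cluster and runs both checks.
# $1: node name, for example e01-cn-xxxx.
check_rdma_node() {
  desired=$(kubectl get ds ack-rdma-dp-ds -n kube-system \
    -o jsonpath='{.status.desiredNumberScheduled}')
  ready=$(kubectl get ds ack-rdma-dp-ds -n kube-system \
    -o jsonpath='{.status.numberReady}')
  hca=$(kubectl get node "$1" -o jsonpath='{.status.allocatable.rdma/hca}')

  ds_ready "$desired" "$ready" || {
    echo "device plugin not ready: $ready of $desired pods"
    return 1
  }
  has_rdma "$hca" || {
    echo "node $1 does not expose rdma/hca"
    return 1
  }
  echo "node $1 exposes rdma/hca: $hca"
}
```

After loading the functions into your shell, you could run check_rdma_node e01-cn-xxxx with the name of your own node. The comparison helpers are kept separate from the kubectl calls so that they can be exercised without a cluster.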