GPU compute instances provide high performance and massive parallelism, which makes them well suited to large-scale parallel computing workloads. This topic describes the features of the ECS GPU compute instance families and lists the available instance types.
-
View instance availability by region: Instance type availability varies by region. Check purchase availability in each region before you choose an instance type.
-
View instance type selection guide: First, determine which instance families are suitable for your business scenario. Then, use this topic to select a specific instance type.
-
View instance metric descriptions: Read this topic to understand the metrics for instance types.
-
Use the ECS Price Calculator: Use the price calculator to estimate instance fees.
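Availability can also be checked from a script. The following minimal sketch only builds a command line for the Alibaba Cloud CLI (`aliyun`) that calls the ECS `DescribeAvailableResource` operation; it does not run it. It assumes the CLI is installed and configured with credentials, and the region ID and instance type shown are placeholders. Verify the parameter names against the current API reference before relying on them.

```python
import shlex


def build_availability_command(region_id: str, instance_type: str) -> list[str]:
    """Build an `aliyun` CLI invocation that queries instance type availability.

    Assumption: DescribeAvailableResource is called with the RegionId,
    DestinationResource, and InstanceType parameters.
    """
    return [
        "aliyun", "ecs", "DescribeAvailableResource",
        "--RegionId", region_id,
        "--DestinationResource", "InstanceType",
        "--InstanceType", instance_type,
    ]


# Placeholder region and instance type, for illustration only.
command = build_availability_command("cn-hangzhou", "ecs.gn7i-c8g1.2xlarge")
print(shlex.join(command))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run` once credentials are in place.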
Type | Related links |
GPU-accelerated compute-optimized (gn series) | |
ECS Bare Metal Instance | |
Super Computing Cluster (SCC) | |
Not recommended (If this instance family is unavailable, we recommend one of the families listed above.) |
gn9gc, GPU-accelerated compute-optimized instance family
gn9gc is in invitational preview. To use gn9gc, submit a ticket.
-
Overview: gn9gc is Alibaba Cloud's 9th-generation cost-effective GPU cloud server instance family. It uses the latest-generation CIPU 2.0 to deliver cloud service capabilities, features high clock speed processors, and is configured with appropriate memory capacity. This instance family provides cost-effective instances for large language model (LLM) generation scenarios and video/image generation scenarios. The GPU can also directly provide graphics processing capabilities to support various rendering workloads.
-
Use cases:
-
LLM inference: The new-generation GPU delivers compute power beyond the 8th generation with significantly improved memory bandwidth. Newly supported FP4 compute comprehensively improves inference performance and cost-effectiveness. Multi-GPU parallel inference efficiency is greatly enhanced.
-
-
Compute:
-
Uses the latest CIPU 2.0 cloud processor.
-
The 2nd-generation CIPU provides higher cloud processing power with enhanced eRDMA, VPC, and EBS component capabilities. Supports containers (including but not limited to Docker, Clear Container, and Pouch).
-
-
Uses the new Blackwell architecture professional graphics card:
-
Supports OpenGL professional-grade graphics processing.
-
Supports RTX, TensorRT, and other common acceleration features, with newly upgraded FP4 support and PCIe Gen5 interconnect.
-
-
Key GPU specifications:
| GPU architecture | GPU memory | Computing performance | Video encoding/decoding | Inter-GPU interconnect | Acceleration APIs |
| NVIDIA Blackwell | Capacity: 72 GB; bandwidth: 1,344 GB/s | TF32: 126 TFLOPS; FP32: 52 TFLOPS; FP16/BF16: 266 TFLOPS; FP8/INT8: 530 TFLOPS; FP4: 970 TFLOPS; RT Core: 196 TFLOPS | 3 × video encoder; 3 × video decoder | PCIe interface: PCIe Gen5 x16; bandwidth: 128 GB/s, P2P supported | DX12, OpenGL 4.6, Vulkan 1.3, CUDA 12.8, OpenCL 3.0, DirectCompute |
-
-
-
Storage:
-
I/O optimized.
-
Supports the NVMe protocol. For more information, see NVMe protocol.
-
Supported cloud disk types: elastic ephemeral disks, ESSDs, ESSD AutoPL disks, and regional ESSDs. For more information, see Block storage overview.
-
-
Network:
-
Supports IPv4 and IPv6. For more information about IPv6, see IPv6.
-
Ultra-high network performance with up to 30 million PPS (8-GPU instances).
-
Supports ERI (Elastic RDMA Interface) for RDMA direct acceleration over VPC networks, with bandwidth up to 360 Gbit/s. Suitable for autonomous driving, embodied intelligence, computer vision, and traditional model training workloads.
-
Note
For more information about ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on a GPU-accelerated instance.
-
The following table describes the instance types in the gn9gc instance family.
| Instance type | vCPUs | Memory (GiB) | GPU memory | Baseline/burst bandwidth (Gbit/s) | Packet forwarding rate (pps) | IPv4 addresses per ENI | IPv6 addresses per ENI | NIC queues (primary/secondary) | ENIs | Max data disks | Max disk bandwidth (GB/s) |
| ecs.gn9gc.4xlarge | 16 | 128 | 72 GB × 1 | 16 | 3.6 million | 30 | 30 | 8/32 | 8 | 1 | 1 |
| ecs.gn9gc.8xlarge | 32 | 192 | 72 GB × 1 | 32 | 7.5 million | 30 | 30 | 16/64 | 8 | 1 | 1 |
| ecs.gn9gc-2x.16xlarge | 64 | 384 | 72 GB × 2 | 65 | 15 million | 30 | 30 | 32/64 | 15 | 2 | 2 |
| ecs.gn9gc-4x.32xlarge | 128 | 768 | 72 GB × 4 | 131 | 30 million | 50 | 50 | 64/64 | 15 | 4 | 4 |
| ecs.gn9gc-8x.64xlarge | 256 | 1,536 | 72 GB × 8 | 204 | 30 million | 50 | 50 | 128/64 | 15 | 6 | 6 |
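To illustrate how the gn9gc table can drive instance type selection, here is a minimal sketch that picks the smallest listed type whose total GPU memory covers a requirement. The GPU counts and vCPU counts are copied from the table; the function name and the "fewest GPUs, then fewest vCPUs" ordering are assumptions made for this example, not an official sizing rule.

```python
# GPU and vCPU counts copied from the gn9gc table; each GPU has 72 GB of memory.
GN9GC_TYPES = {
    "ecs.gn9gc.4xlarge": {"vcpus": 16, "gpus": 1},
    "ecs.gn9gc.8xlarge": {"vcpus": 32, "gpus": 1},
    "ecs.gn9gc-2x.16xlarge": {"vcpus": 64, "gpus": 2},
    "ecs.gn9gc-4x.32xlarge": {"vcpus": 128, "gpus": 4},
    "ecs.gn9gc-8x.64xlarge": {"vcpus": 256, "gpus": 8},
}
GPU_MEMORY_GB = 72


def smallest_type_for(required_gpu_memory_gb, min_vcpus=0):
    """Return the smallest listed type that meets both requirements, or None."""
    candidates = [
        (spec["gpus"], spec["vcpus"], name)
        for name, spec in GN9GC_TYPES.items()
        if spec["gpus"] * GPU_MEMORY_GB >= required_gpu_memory_gb
        and spec["vcpus"] >= min_vcpus
    ]
    # Tuples sort by GPU count first, then by vCPU count.
    return min(candidates)[2] if candidates else None


print(smallest_type_for(100))                # needs two GPUs
print(smallest_type_for(48, min_vcpus=32))   # one GPU, at least 32 vCPUs
```

A request that exceeds 8 × 72 GB returns `None`, which signals that no single gn9gc instance is large enough.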
Images used for gn9gc instances must be in the UEFI boot mode. If you want to use a custom image, make sure that the custom image supports UEFI boot mode and that the boot mode attribute of the image is set to UEFI. For more information, see Set the boot mode of a custom image to UEFI by calling API operations.
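Before you build a custom image from an existing Linux server, it can help to confirm that the server itself boots in UEFI mode. The sketch below relies on the Linux convention that the kernel exposes the `/sys/firmware/efi` directory only on UEFI-booted systems. It checks the running operating system only; it does not read or set the boot mode attribute of an image, which still has to be set as described above. The function name and the `sysfs_root` parameter are illustrative.

```python
import os


def is_uefi_boot(sysfs_root: str = "/sys") -> bool:
    """Return True if the running Linux system was booted in UEFI mode.

    The kernel creates <sysfs_root>/firmware/efi only when the system was
    started through UEFI firmware.
    """
    return os.path.isdir(os.path.join(sysfs_root, "firmware", "efi"))


if __name__ == "__main__":
    print("UEFI" if is_uefi_boot() else "Legacy BIOS (or not a Linux system)")
```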
gn8v and gn8v-tee, GPU-accelerated compute-optimized instance family
These instance families are available in select regions, including those outside the Chinese mainland. To use them, contact your Alibaba Cloud sales representative.
-
Introduction:
-
gn8v: An 8th-generation GPU-accelerated compute-optimized instance family from Alibaba Cloud for AI model training and inference on ultra-large language models (LLMs). This family provides instance types with one, two, four, or eight GPUs for various application requirements.
-
gn8v-tee: To enhance security for large model training and inference, Alibaba Cloud offers gn8v-tee, an 8th-generation instance family based on gn8v with a confidential computing feature. These instances encrypt data during GPU computation to protect your data.
-
-
Use cases:
-
Cost-effective for multi-GPU parallel inference on LLMs with more than 70 billion parameters.
-
Each GPU provides 39.5 TFLOPS of FP32 compute power and delivers outstanding performance for traditional AI model training and autonomous driving training workloads.
-
The eight GPUs support NVLink interconnectivity and are suitable for training small- to medium-sized models.
-
-
Features:
-
High-speed, large-capacity GPU memory: Each GPU is equipped with 96 GB of HBM3 GPU memory and provides up to 4 TB/s of memory bandwidth, significantly accelerating model training and inference.
-
High inter-GPU bandwidth: Multiple GPUs are interconnected with NVLink at 900 GB/s. This enables much higher efficiency for multi-GPU training and inference compared to previous-generation GPU instances.
-
LLM quantization: Supports FP8 compute power, which optimizes performance for large-scale parameter training and inference. This significantly improves training and inference speeds and reduces GPU memory usage.
-
(For gn8v-tee instances only) High security: Supports both CPU confidential computing with Intel® Trust Domain Extensions (TDX) and GPU confidential computing with NVIDIA Confidential Computing (CC). This provides end-to-end confidential computing for the entire model inference pipeline, protecting your inference data and enterprise models during model training and inference.
-
-
Compute:
-
Powered by the latest CIPU 1.0.
-
Decouples compute from storage, letting you flexibly select the storage resources you need.
-
Provides bare metal capabilities, which support peer-to-peer (P2P) communication between GPU instances, unlike traditional virtualized instances.
-
-
Powered by 4th-generation Intel® Xeon® Scalable processors with a base frequency of up to 2.8 GHz and an all-core turbo frequency of up to 3.1 GHz.
-
-
Storage:
-
I/O-optimized instance.
-
These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
-
Supported cloud disk types: elastic ephemeral disks, ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block storage overview.
-
-
Network:
-
Supports IPv4 and IPv6. For more information about IPv6 communication, see IPv6 communication.
-
These instances support jumbo frames. For more information, see Jumbo frames.
-
Delivers ultra-high network performance with a packet forwarding rate of up to 30 million pps (on 8-GPU instances).
-
Supports elastic RDMA interface (ERI).
-
Note
For information about how to use ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on a GPU-accelerated instance.
-
-
Security: Supports the trusted computing feature (vTPM). This feature is available on gn8v instances but not on gn8v-tee instances. For more information, see Overview of trusted computing capabilities.
The following table describes the instance types in the gn8v family.
| Instance type | vCPUs | Memory (GiB) | GPU memory | Network bandwidth (Gbit/s) | ENIs | Primary ENI queues | IPv4 addresses per ENI | IPv6 addresses per ENI | Max cloud disks | Baseline IOPS | Baseline bandwidth (GB/s) |
| ecs.gn8v.4xlarge | 16 | 96 | 96 GB × 1 | 12 | 8 | 16 | 30 | 30 | 17 | 100,000 | 0.75 |
| ecs.gn8v.6xlarge | 24 | 128 | 96 GB × 1 | 15 | 8 | 24 | 30 | 30 | 17 | 120,000 | 0.937 |
| ecs.gn8v-2x.8xlarge | 32 | 192 | 96 GB × 2 | 20 | 8 | 32 | 30 | 30 | 25 | 200,000 | 1.25 |
| ecs.gn8v-4x.8xlarge | 32 | 384 | 96 GB × 4 | 20 | 8 | 32 | 30 | 30 | 25 | 200,000 | 1.25 |
| ecs.gn8v-2x.12xlarge | 48 | 256 | 96 GB × 2 | 25 | 8 | 48 | 30 | 30 | 33 | 300,000 | 1.50 |
| ecs.gn8v-8x.16xlarge | 64 | 768 | 96 GB × 8 | 32 | 8 | 64 | 30 | 30 | 33 | 360,000 | 2.5 |
| ecs.gn8v-4x.24xlarge | 96 | 512 | 96 GB × 4 | 50 | 15 | 64 | 30 | 30 | 49 | 500,000 | 3 |
| ecs.gn8v-8x.48xlarge | 192 | 1024 | 96 GB × 8 | 100 | 15 | 64 | 50 | 50 | 65 | 1,000,000 | 6 |
The following table describes the instance types in the gn8v-tee family.
| Instance type | vCPUs | Memory (GiB) | GPU memory | Network bandwidth (Gbit/s) | ENIs | Primary ENI queues | IPv4 addresses per ENI | IPv6 addresses per ENI | Max cloud disks | Baseline IOPS | Baseline bandwidth (GB/s) |
| ecs.gn8v-tee.4xlarge | 16 | 96 | 96 GB × 1 | 12 | 8 | 16 | 30 | 30 | 17 | 100,000 | 0.75 |
| ecs.gn8v-tee.6xlarge | 24 | 128 | 96 GB × 1 | 15 | 8 | 24 | 30 | 30 | 17 | 120,000 | 0.937 |
| ecs.gn8v-tee-8x.16xlarge | 64 | 768 | 96 GB × 8 | 32 | 8 | 64 | 30 | 30 | 33 | 360,000 | 2.5 |
| ecs.gn8v-tee-8x.48xlarge | 192 | 1024 | 96 GB × 8 | 100 | 15 | 64 | 50 | 50 | 65 | 1,000,000 | 6 |
The gn8v-tee instance family supports only Alibaba Cloud Linux 3 images. If you use a custom image built on Alibaba Cloud Linux 3 to create an instance, ensure the kernel version is 5.10.134-18 or later.
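The gn8v use cases mention multi-GPU inference for LLMs with more than 70 billion parameters and FP8 support that reduces GPU memory usage. The arithmetic behind those statements can be sketched as follows. The 96 GB per GPU figure comes from this topic; the `overhead` multiplier for KV cache, activations, and runtime buffers is an assumed value for illustration, so treat the results as rough estimates rather than sizing guidance.

```python
import math


def gpus_needed(num_params, bytes_per_param, gpu_memory_gb, overhead=1.2):
    """Estimate how many GPUs are needed to hold a model's weights.

    overhead is an assumed multiplier for KV cache, activations, and runtime
    buffers; it is not a figure from this topic.
    """
    weights_gb = num_params * bytes_per_param / 1e9
    return math.ceil(weights_gb * overhead / gpu_memory_gb)


# A 70-billion-parameter model on 96 GB GPUs (gn8v).
fp16_gpus = gpus_needed(70e9, 2, 96)  # FP16/BF16: 2 bytes per parameter
fp8_gpus = gpus_needed(70e9, 1, 96)   # FP8: 1 byte per parameter
print(fp16_gpus, fp8_gpus)
```

Under these assumptions, the FP16 weights alone occupy about 140 GB, which exceeds a single 96 GB GPU, while the FP8 weights occupy about 70 GB.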
gn8is, GPU-accelerated compute-optimized instance family
This instance family is available in select regions, including those outside the Chinese mainland. To use this instance family, contact your Alibaba Cloud sales representative.
-
Introduction: gn8is is Alibaba Cloud's eighth-generation GPU-accelerated compute-optimized instance family, designed for the growing demands of AI-generated content (AIGC). Powered by the latest NVIDIA L20 GPUs, this family offers instance types with one, two, four, or eight GPUs, and various CPU-to-GPU ratios to meet diverse application needs.
-
Features:
-
Graphics processing: Powered by 4th-generation Intel® Xeon® Scalable high-frequency processors, these instances provide robust CPU compute power for 3D modeling scenarios, ensuring smoother graphics rendering and design workflows.
-
Inference tasks: Equipped with the new NVIDIA L20 GPU, each with 48 GB of GPU memory, these instances accelerate inference tasks. They support the FP8 floating-point format and can be paired with Container Service for Kubernetes (ACK) to flexibly run inference for various AIGC models. They are especially suitable for inference tasks on large language models (LLMs) with fewer than 70 billion parameters.
-
-
Use cases:
-
Use GRID drivers with images from Alibaba Cloud Marketplace to enable OpenGL and Direct3D capabilities. This provides workstation-grade graphics processing for workloads such as animation, film and television special effects, and rendering.
-
Use the container management capabilities of Container Service for Kubernetes (ACK) for more efficient and cost-effective AIGC image generation and LLM inference.
-
Other general-purpose AI applications, such as image recognition and speech recognition.
-
-
Compute:
-
Powered by the new NVIDIA L20 enterprise-grade GPUs.
-
Supports common acceleration features such as TensorRT and the FP8 floating-point format to improve model inference performance.
-
Up to 48 GB of GPU memory per GPU. With multiple GPUs, instances in this family support single-instance inference for models with 70 billion or more parameters.
-
Enhanced graphics processing capabilities. After you install a GRID driver using Cloud Assistant or an image from Alibaba Cloud Marketplace, the graphics processing performance is twice that of 7th-generation platforms.
-
-
Key parameters of the NVIDIA L20 GPU:
| GPU architecture | GPU memory | Compute performance | Video encoding/decoding | Inter-GPU connectivity |
| NVIDIA Ada Lovelace | Capacity: 48 GB; bandwidth: 864 GB/s | FP64: N/A; FP32: 59.3 TFLOPS; FP16/BF16: 119 TFLOPS; FP8/INT8: 237 TFLOPS | 3 × video encoders (+AV1); 3 × video decoders; 4 × JPEG decoders | PCIe interface: PCIe Gen4 x16; bandwidth: 64 GB/s |
-
-
Processor: Powered by the latest high-frequency Intel® Xeon® processors with an all-core turbo frequency of up to 3.9 GHz to handle complex 3D modeling demands.
-
-
Storage:
-
All instances in this family are I/O-optimized instances.
-
These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
-
Supported cloud disk types: elastic ephemeral disks, ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information about cloud disks, see Block storage overview.
-
-
Network:
-
Supports IPv4 and IPv6. For more information about IPv6 communication, see IPv6 communication.
-
Supports Elastic RDMA Interface (ERI).
Note
For details on using ERI, see Enable eRDMA for enterprise-level instances or Enable eRDMA for GPU-accelerated instances.
-
-
Security: These instances support the vTPM feature. For more information, see Overview of trusted computing.
The following table describes the instance types and specifications for the gn8is family.
| Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network bandwidth (Gbit/s) | ENIs | Primary ENI queues | Private IPv4 addresses | IPv6 addresses | Max cloud disks | Disk IOPS | Disk bandwidth (GB/s) |
| ecs.gn8is.2xlarge | 8 | 64 | L20 × 1 | 48 GB × 1 | 8 | 4 | 8 | 15 | 15 | 17 | 60,000 | 0.75 |
| ecs.gn8is.4xlarge | 16 | 128 | L20 × 1 | 48 GB × 1 | 16 | 8 | 16 | 30 | 30 | 17 | 120,000 | 1.25 |
| ecs.gn8is-2x.8xlarge | 32 | 256 | L20 × 2 | 48 GB × 2 | 32 | 8 | 32 | 30 | 30 | 33 | 250,000 | 2 |
| ecs.gn8is-4x.16xlarge | 64 | 512 | L20 × 4 | 48 GB × 4 | 64 | 8 | 64 | 30 | 30 | 33 | 450,000 | 4 |
| ecs.gn8is-8x.32xlarge | 128 | 1024 | L20 × 8 | 48 GB × 8 | 100 | 15 | 64 | 50 | 50 | 65 | 900,000 | 8 |
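The GPU memory bandwidth listed for the L20 (864 GB/s) is often the limiting factor for LLM token generation. As a rough illustration, the sketch below applies a common rule of thumb: if each generated token requires reading all model weights from GPU memory once, decode speed for a single sequence cannot exceed bandwidth divided by weight size. The model size is hypothetical, and the estimate ignores KV cache traffic, compute time, and batching, so real throughput will differ.

```python
def decode_tokens_per_second_upper_bound(bandwidth_gb_s, num_params, bytes_per_param):
    """Rough upper bound on single-sequence decode speed for a memory-bound model.

    Assumes every generated token reads all weights from GPU memory once and
    ignores KV cache traffic, compute time, and batching.
    """
    weight_gb = num_params * bytes_per_param / 1e9
    return bandwidth_gb_s / weight_gb


# Hypothetical 7-billion-parameter model stored in FP8 on one L20 (864 GB/s).
bound = decode_tokens_per_second_upper_bound(864, 7e9, 1)
print(f"Upper bound: about {bound:.0f} tokens/s per sequence")
```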
gn7e, GPU-accelerated compute-optimized instance family
Features of the gn7e instance family include:
-
Overview:
-
This instance family lets you select instance types with different numbers of GPUs and CPU resources to meet your AI business needs.
-
Built on the third-generation SHENLONG architecture, gn7e instances deliver twice the average network bandwidth for VPCs and cloud disks compared with the previous generation.
-
-
Use cases:
-
Small- and medium-scale AI training workloads.
-
High-performance computing (HPC) workloads accelerated using CUDA.
-
AI inference workloads that require high GPU compute performance or large GPU memory.
-
Deep learning, such as training AI algorithms for image classification, autonomous driving, and speech recognition.
-
GPU-intensive scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis.
Important
When you run AI training workloads with high communication loads, such as those involving Transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, large-scale data transfers over the PCIe link may cause unexpected failures and data corruption. If you are unsure about the communication link topology for your training workload, submit a ticket for support from Alibaba Cloud technical experts.
-
-
Storage:
-
All instances in this family are I/O optimized.
-
Supported cloud disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block storage overview.
-
-
Network:
-
Supports IPv4 and IPv6. For more information, see IPv6 communication.
-
Network performance scales with the instance type. Larger instance types offer better network performance.
-
The gn7e instance family includes the instance types and specifications described in the following table.
| Instance type | vCPUs | Memory (GiB) | GPU memory | Baseline bandwidth (Gbit/s) | Forwarding rate (pps) | Queues | ENIs | Private IPv4 addresses | IPv6 addresses |
| ecs.gn7e-c16g1.4xlarge | 16 | 125 | 80 GB × 1 | 8 | 3,000,000 | 8 | 8 | 10 | 1 |
| ecs.gn7e-c16g1.8xlarge | 32 | 250 | 80 GB × 2 | 16 | 6,000,000 | 16 | 8 | 10 | 1 |
| ecs.gn7e-c16g1.16xlarge | 64 | 500 | 80 GB × 4 | 32 | 12,000,000 | 32 | 8 | 10 | 1 |
| ecs.gn7e-c16g1.32xlarge | 128 | 1000 | 80 GB × 8 | 64 | 24,000,000 | 32 | 16 | 15 | 1 |
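Because the gn7e notes stress that communication-heavy training must run over NVLink rather than PCIe, it is worth confirming the link topology before you start a job. The `nvidia-smi topo -m` command prints a matrix in which NVLink connections appear as `NV` followed by a link count, while PCIe paths appear as labels such as `PIX` or `PHB`. The sketch below parses such a matrix; the sample text is illustrative rather than captured from a gn7e instance, and real output has more columns and varies with the driver version.

```python
import re

# Illustrative sample in the style of `nvidia-smi topo -m` output; real output
# has more columns and depends on the driver version.
SAMPLE_TOPO = """\
        GPU0    GPU1    CPU Affinity
GPU0     X      NV12    0-63
GPU1    NV12     X      0-63
"""


def nvlink_pairs(topo_text):
    """Return the (row GPU, column GPU) pairs that the matrix marks as NVLink."""
    lines = [line.split() for line in topo_text.splitlines() if line.strip()]
    # GPU columns come first in the header row; other columns are ignored.
    header = [name for name in lines[0] if re.fullmatch(r"GPU\d+", name)]
    pairs = []
    for row in lines[1:]:
        if not re.fullmatch(r"GPU\d+", row[0]):
            continue  # skip legend lines and non-GPU rows
        for column_gpu, cell in zip(header, row[1:]):
            if re.fullmatch(r"NV\d+", cell):
                pairs.append((row[0], column_gpu))
    return pairs


print(nvlink_pairs(SAMPLE_TOPO))
```

On a real instance, the text could be obtained with `subprocess.run(["nvidia-smi", "topo", "-m"], capture_output=True, text=True).stdout`. An empty result suggests that the GPUs communicate over PCIe only.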
gn7i, GPU-accelerated compute-optimized instance family
-
Overview: Powered by the third-generation SHENLONG architecture, gn7i instances deliver stable and predictable high performance. They use chip-level fast path acceleration to increase storage performance, network performance, and compute stability by an order of magnitude.
-
Use cases:
-
Equipped with high-performance CPUs, memory, and GPUs, these instances are ideal for concurrent AI inference tasks, such as image recognition, speech recognition, and behavior recognition.
-
These instances support RTX features and use high-frequency CPUs to deliver high-performance 3D graphics virtualization. They are suitable for graphics-intensive workloads, such as remote graphics design and cloud gaming.
-
-
Compute:
-
Equipped with NVIDIA A10 GPUs that feature:
-
The innovative NVIDIA Ampere architecture.
-
Support for common acceleration features such as RTX and TensorRT.
-
-
Processor: 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processor with an all-core turbo frequency of 3.5 GHz.
-
This instance family provides up to 752 GiB of memory, a significant increase compared to the gn6i instance family.
-
-
Storage:
-
All instances in this family are I/O optimized.
-
Supported cloud disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block storage overview.
-
-
Network:
-
These instances support IPv4 and IPv6. For more information, see IPv6 communication.
-
Network performance scales with the instance type. Larger instance types offer better network performance.
-
The gn7i instance family includes the following instance types and specifications.
| Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network bandwidth (Gbit/s) | Packet rate (pps) | NIC queues | ENIs | Private IPv4 addresses | IPv6 addresses |
| ecs.gn7i-c8g1.2xlarge | 8 | 30 | NVIDIA A10 × 1 | 24 GB × 1 | 16 | 1,600,000 | 8 | 4 | 15 | 15 |
| ecs.gn7i-c16g1.4xlarge | 16 | 60 | NVIDIA A10 × 1 | 24 GB × 1 | 16 | 3,000,000 | 8 | 8 | 30 | 30 |
| ecs.gn7i-c32g1.8xlarge | 32 | 188 | NVIDIA A10 × 1 | 24 GB × 1 | 16 | 6,000,000 | 12 | 8 | 30 | 30 |
| ecs.gn7i-c32g1.16xlarge | 64 | 376 | NVIDIA A10 × 2 | 24 GB × 2 | 32 | 12,000,000 | 16 | 15 | 30 | 30 |
| ecs.gn7i-c32g1.32xlarge | 128 | 752 | NVIDIA A10 × 4 | 24 GB × 4 | 64 | 24,000,000 | 32 | 15 | 30 | 30 |
| ecs.gn7i-c48g1.12xlarge | 48 | 310 | NVIDIA A10 × 1 | 24 GB × 1 | 16 | 9,000,000 | 16 | 8 | 30 | 30 |
| ecs.gn7i-c56g1.14xlarge | 56 | 346 | NVIDIA A10 × 1 | 24 GB × 1 | 16 | 10,000,000 | 16 | 8 | 30 | 30 |
| ecs.gn7i-2x.8xlarge | 32 | 128 | NVIDIA A10 × 2 | 24 GB × 2 | 16 | 6,000,000 | 16 | 8 | 30 | 30 |
| ecs.gn7i-4x.8xlarge | 32 | 128 | NVIDIA A10 × 4 | 24 GB × 4 | 32 | 6,000,000 | 16 | 8 | 30 | 30 |
| ecs.gn7i-4x.16xlarge | 64 | 256 | NVIDIA A10 × 4 | 24 GB × 4 | 64 | 12,000,000 | 32 | 8 | 30 | 30 |
| ecs.gn7i-8x.32xlarge | 128 | 512 | NVIDIA A10 × 8 | 24 GB × 8 | 64 | 24,000,000 | 32 | 16 | 30 | 30 |
| ecs.gn7i-8x.16xlarge | 64 | 256 | NVIDIA A10 × 8 | 24 GB × 8 | 32 | 12,000,000 | 32 | 8 | 30 | 30 |
You can change instances of the types ecs.gn7i-2x.8xlarge, ecs.gn7i-4x.8xlarge, ecs.gn7i-4x.16xlarge, ecs.gn7i-8x.32xlarge, and ecs.gn7i-8x.16xlarge to ecs.gn7i-c8g1.2xlarge or ecs.gn7i-c16g1.4xlarge. However, you cannot change them to other instance types such as ecs.gn7i-c32g1.8xlarge.
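The instance type change rule above can be encoded as a small lookup, which is convenient in automation that validates a change request before submitting it. This is a minimal sketch: the type names come from the note, while the function name and the use of `None` for source types that the note does not cover are choices made for this example.

```python
# Source types named in the note above.
RESTRICTED_SOURCES = {
    "ecs.gn7i-2x.8xlarge",
    "ecs.gn7i-4x.8xlarge",
    "ecs.gn7i-4x.16xlarge",
    "ecs.gn7i-8x.32xlarge",
    "ecs.gn7i-8x.16xlarge",
}
# The only targets the note allows for those source types.
ALLOWED_TARGETS = {"ecs.gn7i-c8g1.2xlarge", "ecs.gn7i-c16g1.4xlarge"}


def can_change(source, target):
    """Return True or False for the restricted source types, None otherwise.

    None means the note does not cover the source type, so the rule for it
    must be looked up elsewhere.
    """
    if source not in RESTRICTED_SOURCES:
        return None
    return target in ALLOWED_TARGETS


print(can_change("ecs.gn7i-4x.8xlarge", "ecs.gn7i-c8g1.2xlarge"))
print(can_change("ecs.gn7i-4x.8xlarge", "ecs.gn7i-c32g1.8xlarge"))
```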
gn7s, GPU-accelerated compute-optimized instance family
To use the gn7s instance family, submit a ticket.
-
Introduction:
-
This instance family is powered by the latest Intel Ice Lake processors and NVIDIA A30 GPUs based on the NVIDIA Ampere architecture. This family offers various instance types with different GPU and CPU configurations to meet your specific AI needs.
-
Built on Alibaba Cloud's third-generation SHENLONG architecture, gn7s instances deliver twice the average network bandwidth for VPCs and cloud disks as the previous generation.
-
-
Use cases: Featuring high-performance CPUs, memory, and GPUs, these instances are ideal for concurrent AI inference workloads, such as image recognition, speech recognition, and behavior identification.
-
Compute:
-
Features NVIDIA A30 GPUs, which include:
-
The innovative NVIDIA Ampere architecture.
-
Support for the Multi-Instance GPU (MIG) feature and acceleration based on second-generation Tensor Cores for a wide range of workloads.
-
-
Processor: 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processor with an all-core turbo frequency of 3.5 GHz.
-
Offers significantly more memory than the previous-generation instance family.
-
-
Storage:
-
All instances in this family are I/O optimized.
-
Supported cloud disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block storage overview.
-
-
Network:
-
Supports IPv4 and IPv6. For more information, see IPv6 communication.
-
Network performance scales with the instance type.
-
The gn7s instance family includes the following instance types and specifications:
| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network bandwidth (Gbit/s) | Packet rate (pps) | Private IPv4s per ENI | IPv6s per ENI | Multi-queue | ENIs |
| ecs.gn7s-c8g1.2xlarge | 8 | 60 | NVIDIA A30 × 1 | 24 GB × 1 | 16 | 1,600,000 | 5 | 1 | 8 | 4 |
| ecs.gn7s-c16g1.4xlarge | 16 | 120 | NVIDIA A30 × 1 | 24 GB × 1 | 16 | 3,000,000 | 5 | 1 | 8 | 8 |
| ecs.gn7s-c32g1.8xlarge | 32 | 250 | NVIDIA A30 × 1 | 24 GB × 1 | 16 | 6,000,000 | 5 | 1 | 12 | 8 |
| ecs.gn7s-c32g1.16xlarge | 64 | 500 | NVIDIA A30 × 2 | 24 GB × 2 | 32 | 12,000,000 | 5 | 1 | 16 | 15 |
| ecs.gn7s-c32g1.32xlarge | 128 | 1000 | NVIDIA A30 × 4 | 24 GB × 4 | 64 | 24,000,000 | 10 | 1 | 32 | 15 |
| ecs.gn7s-c48g1.12xlarge | 48 | 380 | NVIDIA A30 × 1 | 24 GB × 1 | 16 | 9,000,000 | 8 | 1 | 16 | 8 |
| ecs.gn7s-c56g1.14xlarge | 56 | 440 | NVIDIA A30 × 1 | 24 GB × 1 | 16 | 10,000,000 | 8 | 1 | 16 | 8 |
gn7, GPU-accelerated compute-optimized instance family
-
Scenarios:
-
Deep learning, such as training AI algorithms used in image classification, autonomous driving, and speech recognition.
-
GPU-intensive scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis.
-
-
Storage:
-
Instances are I/O optimized.
-
Supported cloud disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block storage overview.
-
-
Network:
-
Supports IPv4 and IPv6. For more information about IPv6, see IPv6 communication.
-
Network performance scales with the instance type.
-
The following table describes the instance types and specifications of the gn7 instance family.
| Instance type | vCPUs | Memory (GiB) | GPU memory | Network bandwidth (Gbit/s) | Packet rate (pps) | NIC queues | ENIs | Private IPv4 addresses | IPv6 addresses |
| ecs.gn7-c12g1.3xlarge | 12 | 94 | 40 GB × 1 | 4 | 2,500,000 | 4 | 8 | 10 | 1 |
| ecs.gn7-c13g1.13xlarge | 52 | 378 | 40 GB × 4 | 16 | 9,000,000 | 16 | 8 | 30 | 30 |
| ecs.gn7-c13g1.26xlarge | 104 | 756 | 40 GB × 8 | 30 | 18,000,000 | 16 | 15 | 10 | 1 |
gn7r, GPU-accelerated compute-optimized instance family
-
Overview:
-
gn7r is an enterprise-grade, multi-purpose instance family from Alibaba Cloud that combines an Arm processor with a GPU. It provides a cloud-native platform for developing and running Android-based applications, cloud phones, and cloud gaming services. Equipped with NVIDIA A16 GPUs, these instances provide cost-effective, multi-chip hardware transcoding comparable to ASIC-based platforms. The instances also support CUDA, which enables AI recognition and analysis directly on the GPU after decoding.
-
Based on the third-generation SHENLONG architecture, these instances use the CIPU cloud processor to manage cloud resources, delivering stable, predictable, and ultra-high compute, storage, and network performance.
-
These instances use NVIDIA A16 GPU accelerators for graphics acceleration, hardware transcoding, and AI services.
Note
Each NVIDIA A16 card contains four GA107 processing chips.
-
-
Use cases: Ideal for remote Android application services, such as cloud application standby, cloud gaming, cloud phones, Android data crawlers, video transcoding, video recognition, content review, and video editing.
-
Compute:
-
Processor: 3.0 GHz Ampere® Altra® Max processors. The native Arm compute platform provides efficient performance and excellent app compatibility for Android servers.
-
-
Storage:
-
I/O optimized instance.
-
Supported cloud disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Elastic Block Storage Overview.
-
-
Network:
-
Supports IPv4 and IPv6. For more information, see IPv6 communication.
-
The gn7r instance family includes the following instance types and specifications.
| Instance type | vCPUs | Memory (GiB) | GPU | Base network bandwidth (Gbit/s) | Packet forwarding rate (pps) | Private IPv4 addresses | IPv6 addresses | NIC queues | ENIs |
| ecs.gn7r-c16g1.4xlarge | 16 | 64 | NVIDIA GA107 × 1 | 8 | 3,000,000 | 15 | 1 | 8 | 8 |
gn6i, GPU-accelerated compute-optimized instance family
-
Use cases:
-
AI (deep learning and machine learning) inference for applications such as computer vision, speech recognition, speech synthesis, natural language processing (NLP), machine translation, and recommendation systems.
-
Real-time rendering for cloud gaming.
-
Real-time, cloud-based rendering for augmented reality (AR) and virtual reality (VR).
-
Graphics-heavy computing or graphics workstations.
-
GPU-accelerated databases.
-
High-performance computing (HPC).
-
-
Compute:
-
Equipped with NVIDIA T4 GPU accelerators, featuring:
-
The innovative NVIDIA Turing architecture.
-
16 GB of memory per GPU with a memory bandwidth of 320 GB/s.
-
2,560 CUDA cores per GPU.
-
Up to 320 Turing Tensor Cores per GPU.
-
Mixed-precision Tensor Cores that support 65 TFLOPS of FP16, 130 TOPS of INT8, and 260 TOPS of INT4.
-
-
vCPU-to-memory ratio of approximately 1:4.
-
Processor: 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake).
-
-
Storage:
-
I/O-optimized instances.
-
Supported cloud disk types: ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks. For more information, see Block storage overview.
-
-
Network:
-
Supports IPv4 and IPv6. For details, see IPv6 communication.
-
Network performance scales with the instance type.
-
The gn6i instance family includes the following instance types.
| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network bandwidth (Gbit/s) | Packet rate (pps) | Disk IOPS | Multi-queue | ENIs | Private IPv4 addresses | IPv6 addresses |
| ecs.gn6i-c4g1.xlarge | 4 | 15 | NVIDIA T4 × 1 | 16 GB × 1 | 4 | 2,500,000 | N/A | 2 | 2 | 10 | 1 |
| ecs.gn6i-c8g1.2xlarge | 8 | 31 | NVIDIA T4 × 1 | 16 GB × 1 | 5 | 2,500,000 | N/A | 2 | 2 | 10 | 1 |
| ecs.gn6i-c16g1.4xlarge | 16 | 62 | NVIDIA T4 × 1 | 16 GB × 1 | 6 | 2,500,000 | N/A | 4 | 3 | 10 | 1 |
| ecs.gn6i-c24g1.6xlarge | 24 | 93 | NVIDIA T4 × 1 | 16 GB × 1 | 7.5 | 2,500,000 | N/A | 6 | 4 | 10 | 1 |
| ecs.gn6i-c40g1.10xlarge | 40 | 155 | NVIDIA T4 × 1 | 16 GB × 1 | 10 | 2,500,000 | N/A | 16 | 10 | 10 | 1 |
| ecs.gn6i-c24g1.12xlarge | 48 | 186 | NVIDIA T4 × 2 | 16 GB × 2 | 15 | 4,500,000 | N/A | 12 | 6 | 10 | 1 |
| ecs.gn6i-c24g1.24xlarge | 96 | 372 | NVIDIA T4 × 4 | 16 GB × 4 | 30 | 4,500,000 | 250,000 | 24 | 8 | 10 | 1 |
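The "approximately 1:4" vCPU-to-memory ratio stated for gn6i can be checked directly against the table. The short sketch below computes memory per vCPU for each instance type, using vCPU and memory values copied from the table; the variable and function names are illustrative.

```python
# (vCPUs, memory in GiB) copied from the gn6i table.
GN6I_SHAPES = {
    "ecs.gn6i-c4g1.xlarge": (4, 15),
    "ecs.gn6i-c8g1.2xlarge": (8, 31),
    "ecs.gn6i-c16g1.4xlarge": (16, 62),
    "ecs.gn6i-c24g1.6xlarge": (24, 93),
    "ecs.gn6i-c40g1.10xlarge": (40, 155),
    "ecs.gn6i-c24g1.12xlarge": (48, 186),
    "ecs.gn6i-c24g1.24xlarge": (96, 372),
}


def memory_per_vcpu(shapes):
    """Return GiB of memory per vCPU for each instance type."""
    return {name: memory / vcpus for name, (vcpus, memory) in shapes.items()}


ratios = memory_per_vcpu(GN6I_SHAPES)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f} GiB per vCPU")
```

Every type lands slightly below 4 GiB per vCPU, which is consistent with the approximate ratio quoted in the Compute section.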
gn6e, GPU-accelerated compute-optimized instance family
-
Use cases:
-
Deep learning applications, such as training and inference for AI algorithms for image classification, autonomous driving, and speech recognition.
-
Scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis.
-
-
Compute:
-
Features NVIDIA V100 (32 GB NVLink) GPU cards.
-
GPU accelerator: V100 (SXM2 package).
-
Innovative NVIDIA Volta architecture.
-
32 GB of HBM2 memory per GPU with a GPU memory bandwidth of 900 GB/s.
-
5,120 CUDA Cores per GPU.
-
640 Tensor Cores per GPU.
-
Each GPU supports six bidirectional NVLink connections, each providing 25 GB/s of bandwidth in each direction, for a total of 300 GB/s.
-
-
Features a vCPU-to-memory ratio of approximately 1:8.
-
Processor: 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake).
-
-
Storage:
-
I/O optimized instance.
-
Supported cloud disk types: ESSDs, ESSD AutoPL disks, Regional ESSDs, standard SSDs, and ultra disks. For more information, see Elastic Block Storage.
-
-
Network:
-
Supports both IPv4 and IPv6. For more information about IPv6 communication, see IPv6 communication.
-
Network performance scales with the instance type.
-
gn6e includes the instance types and specifications listed in the table below.
| Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Baseline bandwidth (Gbit/s) | Packet rate (pps) | NIC queues | ENIs | Private IPv4 addresses | IPv6 addresses |
| ecs.gn6e-c12g1.3xlarge | 12 | 92 | 1 × NVIDIA V100 | 1 × 32 GB | 5 | 800,000 | 8 | 6 | 10 | 1 |
| ecs.gn6e-c12g1.6xlarge | 24 | 184 | 2 × NVIDIA V100 | 2 × 32 GB | 8 | 1,200,000 | 8 | 8 | 20 | 1 |
| ecs.gn6e-c12g1.12xlarge | 48 | 368 | 4 × NVIDIA V100 | 4 × 32 GB | 16 | 2,400,000 | 8 | 8 | 20 | 1 |
| ecs.gn6e-c12g1.24xlarge | 96 | 736 | 8 × NVIDIA V100 | 8 × 32 GB | 32 | 4,500,000 | 16 | 8 | 20 | 1 |
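The total NVLink figure quoted for the V100 in the Compute section is the per-link, per-direction bandwidth summed over all links and both directions. A minimal sketch of that arithmetic follows; the function name is illustrative, and the result is in whatever unit the per-link figure uses.

```python
def total_nvlink_bandwidth(links, per_direction_per_link):
    """Aggregate bidirectional bandwidth, in the unit of the per-link figure."""
    directions = 2  # each NVLink connection carries traffic both ways
    return links * per_direction_per_link * directions


# Six links at 25 units per direction, as stated for the V100.
total = total_nvlink_bandwidth(links=6, per_direction_per_link=25)
print(total)
```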
gn6v, GPU-accelerated compute-optimized instance family
-
Use cases:
-
Deep learning applications, such as training and inference for AI algorithms in image classification, autonomous driving, and speech recognition.
-
Scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis.
-
-
Compute:
-
Equipped with NVIDIA V100 GPUs.
-
GPU accelerator: V100 (SXM2 package).
-
Innovative NVIDIA Volta architecture.
-
16 GB of HBM2 GPU memory per GPU with 900 GB/s of memory bandwidth.
-
5,120 CUDA Cores per GPU.
-
640 Tensor Cores per GPU.
-
Up to six bidirectional NVLink connections per GPU. Each connection provides 25 GB/s of bandwidth in each direction, for a total bandwidth of 300 GB/s.
-
-
Features a vCPU-to-memory ratio of approximately 1:4.
-
Processor: 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake).
-
-
Storage:
-
All instances in this family are I/O optimized.
-
Supported cloud disk types: ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks. For more information, see Block storage overview.
-
-
Network:
-
Supports IPv4 and IPv6. For more information, see IPv6 communication.
-
Network performance scales with the instance type.
-
The gn6v instance family includes the instance types and specifications listed below.
| Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Network bandwidth (Gbit/s) | Packet rate (pps) | Disk baseline IOPS | Multi-queue | ENIs | Private IPv4 addresses | IPv6 addresses |
| ecs.gn6v-c8g1.2xlarge | 8 | 32 | 1 × NVIDIA V100 | 1 × 16 GB | 2.5 | 800,000 | N/A | 4 | 4 | 10 | 1 |
| ecs.gn6v-c8g1.4xlarge | 16 | 64 | 2 × NVIDIA V100 | 2 × 16 GB | 5 | 1,000,000 | N/A | 4 | 8 | 20 | 1 |
| ecs.gn6v-c8g1.8xlarge | 32 | 128 | 4 × NVIDIA V100 | 4 × 16 GB | 10 | 2,000,000 | N/A | 8 | 8 | 20 | 1 |
| ecs.gn6v-c8g1.16xlarge | 64 | 256 | 8 × NVIDIA V100 | 8 × 16 GB | 20 | 2,500,000 | N/A | 16 | 8 | 20 | 1 |
| ecs.gn6v-c10g1.20xlarge | 82 | 336 | 8 × NVIDIA V100 | 8 × 16 GB | 35 | 4,500,000 | 250,000 | 16 | 8 | 20 | 1 |
GPU elastic bare metal server instance family ebmgn9g
The ebmgn9g instance family is available by invitation only. To request access, submit a ticket.
Instance family introduction: The ebmgn9g instance family is Alibaba Cloud's ninth-generation line of cost-effective, full-featured GPU bare metal instances. Powered by the latest CIPU 2.0, these instances combine high-frequency CPUs, large memory capacity, and new Blackwell architecture professional GPUs. They deliver high-performance and cost-effective GPU cloud services for a wide range of accelerated workloads, such as training for autonomous driving and embodied intelligence, large model inference, film and animation rendering, and services for the metaverse and cloud gaming.
Use cases and features:
Autonomous driving/embodied intelligence:
With 256 vCPUs running at an all-core frequency of up to 4.2 GHz and 2,304 GiB of memory, these instances handle the demanding data processing requirements for autonomous driving and embodied intelligence training.
Search and recommendation:
The Blackwell GPU provides 126 TFLOPS of TF32 computing power. Each GPU is paired with 32 vCPUs and 153 GB/s of memory bandwidth, making it an optimal configuration for search and advertising services.
Large model inference:
The new-generation GPU offers a significant performance leap over its predecessor, with video memory bandwidth increased to 1,344 GB/s. Support for FP4 computing power significantly improves inference performance and cost-effectiveness. The eight GPUs are interconnected via PCIe Gen5 with a bandwidth of 128 GB/s. This connection dramatically improves multi-card parallel inference efficiency.
Cloud gaming/rendering/metaverse:
With a CPU frequency of up to 5 GHz, these instances are an excellent choice for 3D modeling. The GPU provides certified, workstation-grade graphics drivers with full OpenGL acceleration, making these instances the optimal choice for high-end film and animation development and CAD design.
Powered by the latest CIPU 2.0:
The second-generation CIPU offers increased processing power, enhancing the performance of eRDMA, VPC, and cloud disk components. Bare metal instances are ideal for workloads that require direct access to physical resources or have requirements such as hardware-based licensing. They also support containers, including but not limited to Docker, Clear Containers, and PouchContainer.
Compute:
Powered by new Blackwell architecture professional GPUs:
Supports professional-grade OpenGL graphics processing.
Supports common acceleration features such as RTX and TensorRT, with newly added support for FP4 and PCIe Gen5 interconnect.
Utilizes a PCIe switch for interconnection, which improves NCCL performance by 36% compared to a direct-to-CPU connection. This can boost performance by up to 9% for large model inference with multi-card sharding.
Key GPU specifications:
GPU architecture | Video memory | Compute performance | Encode/decode engines | Inter-GPU interconnect | Acceleration APIs |
Blackwell | Capacity: 48 GB; Bandwidth: 1,344 GB/s | TF32: 126 TFLOPS; FP32: 52 TFLOPS; FP16/BF16: 266 TFLOPS; FP8/INT8: 533 TFLOPS; FP4: 970 TFLOPS; RT core: 196 TFLOPS | 3 × Video Encoder; 3 × Video Decoder | PCIe Gen5 x16: 128 GB/s; supports P2P | DX12, OpenGL 4.6, Vulkan 1.3, CUDA 12.8, OpenCL 3.0, and DirectCompute |
Processor: AMD Turin-C processors with a frequency range of 3.3 GHz to 5 GHz and an all-core turbo of up to 4.2 GHz.
Storage:
I/O-optimized instance.
Supports the NVMe protocol. For more information, see NVMe protocol overview.
Supported cloud disk types: elastic temporary disk, ESSD, ESSD AutoPL, and ESSD with zone-redundancy. For more information about block storage, see Block storage overview.
Network:
Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Delivers network performance of up to 30 million PPS.
Supports eRDMA, which enables RDMA pass-through acceleration over VPC networks. eRDMA increases bandwidth to 360 Gbit/s and is ideal for training workloads in autonomous driving, embodied intelligence, computer vision (CV), and traditional models.
Note: For instructions on how to use eRDMA, see Enable eRDMA for an enterprise-level instance or Enable eRDMA for a GPU instance.
The following table lists the specifications for the ebmgn9g instance family.
Instance type | vCPU | Memory (GiB) | Video memory | Network bandwidth (Gbit/s) | Network PPS | Private IPv4 addresses | IPv6 addresses | Queues (primary/secondary) | ENIs | Max data disks | Cloud disk bandwidth (GB/s) |
ecs.ebmgn9g.64xlarge | 256 | 2304 | 48 GB × 8 | 360 (180 × 2) | 30 million | 30 | 30 | 64/16 | 38 | 33 | 8 |
ebmgn9g instances require images that use the UEFI boot mode. If you use a custom image, ensure it supports the UEFI boot mode and that its boot mode property is set to UEFI. For more information, see Instance boot mode.
GPU elastic bare metal server instance family ebmgn9ge
The ebmgn9ge instance family is available by invitation only. To request access, submit a ticket.
Introduction: The ebmgn9ge instance family is Alibaba Cloud's 9th generation of full-featured, cost-effective GPU bare metal instances. Powered by the latest CIPU 2.0, these instances combine high-frequency CPUs, large memory capacity, and new Blackwell architecture professional GPUs. They deliver high-performance and cost-effective GPU cloud services for a wide range of accelerated workloads, including autonomous driving and embodied intelligence training, large model inference, film and animation rendering, and metaverse and cloud gaming services.
Use cases and features:
Autonomous driving and embodied intelligence:
With 256 vCPUs running at an all-core frequency of up to 4.2 GHz and 2,304 GiB of memory, these instances meet the demanding data processing needs of autonomous driving and embodied intelligence training.
Search and recommendation:
The Blackwell GPU provides 126 TFLOPS of TF32 computing power. Each GPU is paired with 32 vCPUs and 153 GB/s of memory bandwidth, providing an optimal configuration for search and advertising services.
Large model inference:
The ebmgn9ge instance family is specifically designed for large language models, offering 72 GB of GPU memory per GPU. The GPU memory bandwidth reaches 1,344 GB/s, delivering high-performance inference for LLM scenarios. Combined with the new FP4 compute architecture and 128 GB/s of PCIe Gen5 bandwidth, an instance can support parallel inference for models larger than 671 billion parameters across 8 GPUs.
Cloud gaming, rendering, and metaverse:
With a CPU frequency of up to 5 GHz, these instances are well-suited for 3D modeling. The GPU natively supports graphics capabilities and provides certified workstation-grade graphics drivers with full OpenGL acceleration, making them ideal for high-end film and animation development and CAD design.
Powered by the latest CIPU 2.0:
The 2nd generation CIPU delivers greater cloud processing power, enhancing performance for eRDMA, VPC, and EBS components. Bare metal instances allow workloads to directly access physical resources or meet requirements such as hardware-based licensing. They also support containers, such as Docker, Clear Containers, and PouchContainer.
Compute:
Powered by new Blackwell architecture professional GPUs:
Supports professional-grade OpenGL graphics processing.
Supports common acceleration features like RTX and TensorRT, and adds support for FP4 compute and the PCIe Gen5 interconnect.
Utilizes a PCIe switch for interconnection, which improves NCCL performance by 36% compared to a direct-to-CPU connection. This can boost performance by up to 9% for large model inference with multi-GPU sharding.
GPU key specifications:
GPU architecture | GPU memory | Compute performance | Video encode/decode engines | Inter-GPU interconnect | Accelerated APIs |
Blackwell | Capacity: 72 GB; Bandwidth: 1,344 GB/s | TF32: 126 TFLOPS; FP32: 52 TFLOPS; FP16/BF16: 266 TFLOPS; FP8/INT8: 533 TFLOPS; FP4: 970 TFLOPS; RT core: 196 TFLOPS | 3 × Video Encoder; 3 × Video Decoder | PCIe Gen5 x16: 128 GB/s; supports P2P | DX12, OpenGL 4.6, Vulkan 1.3, CUDA 12.8, OpenCL 3.0, and DirectCompute |
Processor: AMD Turin-C processors with a frequency range of 3.3 GHz to 5 GHz and an all-core turbo of up to 4.2 GHz.
Storage:
I/O-optimized instance.
Supports the NVMe protocol. For more information, see NVMe Protocol Overview.
Supported cloud disk types: elastic temporary disk, ESSD, ESSD AutoPL disks, and ESSD Zone-Redundant Disk. For more information about block storage, see Block Storage Overview.
Network:
Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 Communication.
Delivers network performance of up to 30 million network packets per second (PPS).
Supports ERI (Elastic RDMA Interface), which enables RDMA pass-through acceleration over VPC networks. ERI increases bandwidth to 360 Gbit/s and is ideal for training workloads in autonomous driving, embodied intelligence, computer vision (CV), and traditional models.
Note: For instructions on how to use ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on GPU instances.
The following table lists the instance types and specifications for the ebmgn9ge instance family.
For a lower-cost version with less memory, you can select ebmgn9gc.
Instance type | vCPU | Memory (GiB) | GPU memory | Baseline network bandwidth (Gbit/s) | Network PPS | Private IPv4 addresses per ENI | IPv6 addresses per ENI | Queue pairs (primary/secondary ENI) | ENIs | Max attachable data disks | Max cloud disk bandwidth (GB/s) |
ecs.ebmgn9ge.64xlarge | 256 | 2,304 | 72 GB × 8 | 360 (180 × 2) | 30 million | 30 | 30 | 64/16 | 38 | 33 | 8 |
Instances of the ebmgn9ge instance family require an image that uses the UEFI boot mode. If you use a custom image, ensure that the image supports the UEFI boot mode and that its boot mode property is set to UEFI. For more information, see Instance Boot Mode.
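As a rough check of the 671-billion-parameter figure mentioned for ebmgn9ge, the following Python sketch compares a model's weight footprint at a given precision with the aggregate GPU memory of one ecs.ebmgn9ge.64xlarge instance (72 GB × 8). It is only an estimate under stated assumptions: 4-bit weights are taken as 0.5 bytes per parameter, and only weights are counted, so KV cache, activations, and framework overhead would reduce the real headroom. The function and constant names are illustrative.

```python
# Weights-only memory estimate; all names here are illustrative.
GPU_MEMORY_GB = 72   # per-GPU memory from the ebmgn9ge specification table
GPU_COUNT = 8        # GPUs per ecs.ebmgn9ge.64xlarge instance

# Assumed storage cost per parameter, in bytes, for each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}


def weight_footprint_gb(params_billion: float, precision: str) -> float:
    """Return the approximate weight size in GB (1 GB = 1e9 bytes)."""
    # Billions of parameters times bytes per parameter gives billions of bytes.
    return params_billion * BYTES_PER_PARAM[precision]


def fits_on_instance(params_billion: float, precision: str) -> bool:
    """Check whether the weights alone fit in the aggregate GPU memory."""
    return weight_footprint_gb(params_billion, precision) <= GPU_MEMORY_GB * GPU_COUNT


if __name__ == "__main__":
    total_gb = GPU_MEMORY_GB * GPU_COUNT
    for precision in ("fp16", "fp8", "fp4"):
        size_gb = weight_footprint_gb(671, precision)
        verdict = fits_on_instance(671, precision)
        print(f"{precision}: {size_gb:.1f} GB of {total_gb} GB, fits: {verdict}")
```

Under these assumptions, FP4 weights for a 671-billion-parameter model need about 335.5 GB of the 576 GB available, whereas FP8 weights (671 GB) would not fit on a single instance.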
GPU-accelerated elastic bare metal instance family ebmgn9gc
The ebmgn9gc instance family is available by invitation only. To request access, submit a ticket.
Introduction: The ebmgn9gc instance family is Alibaba Cloud's 9th generation of full-featured, cost-effective GPU bare metal instances. Powered by the latest CIPU 2.0, these instances combine high-frequency CPUs, large-capacity memory, and new Blackwell architecture professional GPUs to deliver cost-effective GPU cloud services for a wide range of accelerated workloads, including autonomous driving and embodied intelligence training, large model inference, film and animation rendering, and metaverse and cloud gaming services.
Use cases and features:
Autonomous driving and embodied intelligence:
Offers 256 vCPUs with a frequency range of 3.3 GHz to 5.0 GHz and an all-core turbo of up to 4.2 GHz. Paired with 1,536 GiB of memory, these instances meet the data processing demands of autonomous driving and embodied intelligence training.
Search and recommendation:
The equipped Blackwell GPUs provide 126 TFLOPS of high-performance TF32 compute. Each GPU is paired with 32 vCPUs and 153 GB/s of memory bandwidth, providing an optimal configuration for search and advertising services.
Large model inference:
The ebmgn9gc instances are designed for large language models. With 72 GB of GPU memory and 1,344 GB/s of memory bandwidth per GPU, they deliver high-performance inference for LLM scenarios. Combined with the new FP4 compute architecture and 128 GB/s of PCIe Gen5 bandwidth, an 8-card instance can support parallel inference for models with over 671 billion parameters.
Cloud gaming, rendering, and metaverse:
With a CPU frequency of up to 5.0 GHz, these instances are well-suited for 3D modeling. The GPUs natively support graphics capabilities and provide certified workstation-grade graphics drivers with full OpenGL acceleration, making them a good choice for high-end film and animation development and CAD design.
Powered by the latest CIPU 2.0:
The 2nd generation CIPU offers higher cloud processing power and enhanced computing capabilities for eRDMA, VPC, and block storage. Bare metal instances allow workloads to directly access physical resources or meet requirements such as hardware-based licensing. They also support containers, such as Docker, Clear Containers, and PouchContainer.
Compute:
Powered by new Blackwell architecture professional GPUs:
Supports professional-grade OpenGL graphics processing features.
Supports common acceleration features such as RTX and TensorRT, and adds support for FP4 and PCIe Gen5 interconnect.
Utilizes a PCIe switch for interconnection, which improves NCCL performance by 36% compared to a direct-to-CPU connection. This can boost performance by up to 9% for large model inference with multi-card sharding.
GPU key parameters:
GPU architecture | GPU memory | Compute performance | Video encoding/decoding | GPU-to-GPU interconnect | Acceleration APIs |
Blackwell | Capacity: 72 GB; Bandwidth: 1,344 GB/s | TF32: 126 TFLOPS; FP32: 52 TFLOPS; FP16/BF16: 266 TFLOPS; FP8/INT8: 533 TFLOPS; FP4: 970 TFLOPS; RT Core: 196 TFLOPS | 3 × Video Encoder; 3 × Video Decoder | PCIe Gen5 x16: 128 GB/s; supports P2P | DX12, OpenGL 4.6, Vulkan 1.3, CUDA 12.8, OpenCL 3.0, and DirectCompute |
Processor: AMD Turin-C processor with a frequency range of 3.3 GHz to 5.0 GHz and an all-core turbo of up to 4.2 GHz.
Storage:
I/O optimized instance.
Supports the NVMe protocol. For more information, see NVMe protocol overview.
Supported cloud disk types: Elastic Temporary Disk, ESSD, ESSD AutoPL, and ESSD Zone-Redundant Disk. For more information about block storage, see Block Storage Overview.
Network:
Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Delivers network performance of up to 30 million packets per second (PPS).
Supports elastic RDMA (eRDMA), which enables RDMA pass-through acceleration over VPC networks. eRDMA increases bandwidth to 360 Gbit/s and is ideal for training workloads in autonomous driving, embodied intelligence, computer vision (CV), and traditional models.
Note: For instructions on how to use eRDMA, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on GPU instances.
The following table lists the instance types and specifications for the ebmgn9gc instance family.
For a version with more memory, you can select ebmgn9ge.
Instance type | vCPU | Memory (GiB) | GPU memory | Network bandwidth (Gbit/s) | Network PPS | Private IPv4 addresses | IPv6 addresses | Multi-queue (primary/secondary) | Elastic network interfaces | Maximum data disks | Cloud disk bandwidth (GB/s) |
ecs.ebmgn9gc.64xlarge | 256 | 1536 | 72 GB × 8 | 360 (180 × 2) | 30,000,000 | 30 | 30 | 64/16 | 38 | 33 | 8 |
Instances in the ebmgn9gc instance family must be launched from an image that uses the UEFI boot mode. If you use a custom image, ensure that it supports the UEFI boot mode and its boot mode property is set to UEFI. For more information, see Instance boot mode.
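The choice between ebmgn9gc and ebmgn9ge comes down to host memory, since both offer 256 vCPUs and 72 GB × 8 of GPU memory. A minimal Python sketch of that decision, using the memory sizes from the specification tables; the helper name `pick_instance_type` is hypothetical, and the ordering assumes ebmgn9gc is the lower-cost option, as this topic describes.

```python
from typing import Optional

# Host memory (GiB) from the specification tables, ordered from the
# lower-cost instance type to the higher-memory one.
INSTANCE_TYPES = [
    ("ecs.ebmgn9gc.64xlarge", 1536),
    ("ecs.ebmgn9ge.64xlarge", 2304),
]


def pick_instance_type(required_memory_gib: int) -> Optional[str]:
    """Return the first listed instance type with enough host memory."""
    for name, memory_gib in INSTANCE_TYPES:
        if required_memory_gib <= memory_gib:
            return name
    return None  # neither instance type has enough host memory


if __name__ == "__main__":
    print(pick_instance_type(1200))
    print(pick_instance_type(2000))
```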
GPU compute-optimized elastic bare metal server instance family ebmgn8v
This instance family is currently available only in select regions. To request access, contact your Alibaba Cloud sales representative.
Instance family overview: ebmgn8v is Alibaba Cloud's 8th-generation accelerated computing elastic bare metal instance family, designed for AI model training and for models with ultra-large parameter counts. Each instance is a bare metal host equipped with eight GPUs.
Use cases:
Cost-effective for multi-GPU parallel inference on large language models (LLMs) with more than 70 billion parameters.
Each GPU delivers 39.5 TFLOPS of FP32 compute, making it ideal for traditional AI model training and autonomous driving workloads.
NVLink interconnection among the eight GPUs supports small- and medium-sized model training scenarios.
Key features and positioning:
High-speed, large-capacity GPU memory: Each GPU has 96 GB of HBM3 memory, delivering a memory bandwidth of 4 TB/s. This significantly accelerates model training and inference.
High inter-GPU bandwidth: The GPUs are interconnected via 900 GB/s NVLink, providing significantly higher efficiency for multi-GPU training and inference compared to previous GPU generations.
Large model quantization technology: Supports FP8 compute, which optimizes performance for training and inference with large-scale parameters. This significantly increases computational speed and reduces GPU memory usage.
Compute:
Powered by the latest CIPU (Cloud Infrastructure Processing Unit) 1.0:
Decouples compute and storage, enabling flexible selection of storage resources. Compared to 7th-generation GPU instances, the inter-host bandwidth is increased to 160 Gbit/s for faster data transfer and processing.
Unlike traditional virtualized instances, the CIPU provides bare metal capabilities that support P2P communication between GPU instances.
Powered by 4th Generation Intel® Xeon® Scalable processors, providing 192 vCPUs with an all-core turbo frequency of up to 3.1 GHz.
Storage:
-
These are I/O optimized instances.
-
These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
Supported cloud disk types: elastic temporary disks, ESSD cloud disks, ESSD AutoPL cloud disks, and zone-redundant ESSD cloud disks. For more information about cloud disks, see Block Storage Overview.
-
Network:
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Delivers ultra-high network performance of up to 30 million PPS.
Supports Elastic RDMA Interface (ERI), which enables RDMA pass-through acceleration over VPC networks. ERI increases bandwidth to 160 Gbit/s and is ideal for computer vision (CV) and traditional model training workloads.
Note: For instructions on how to use ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on GPU instances.
The following table describes the instance types and specifications of the ebmgn8v instance family.
Instance type | vCPU | Memory (GiB) | GPU memory | Baseline network bandwidth (Gbit/s) | Packet forwarding rate (PPS) | Private IPv4s per ENI | IPv6s per ENI | Multi-queue | ENIs | Max data disks | Max cloud disk bandwidth (GB/s) |
ecs.ebmgn8v.48xlarge | 192 | 1,024 | 96 GB × 8 | 170 (85 × 2) | 30 million | 30 | 30 | 64 | 32 | 31 | 6 |
The boot mode for images used by ebmgn8v instances must be UEFI. If you use a custom image, ensure that the image supports the UEFI boot mode and that its boot mode attribute is set to UEFI. For more information, see Instance boot modes.
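The specification rows in this topic, such as the ebmgn8v row, share a pipe-delimited layout, and the bandwidth cell uses the form `total (per-link × links)`. The Python sketch below shows one way to turn such a row into named fields and to sanity-check the bandwidth cell. The function names `parse_spec_row` and `parse_bandwidth` are illustrative and not part of any Alibaba Cloud tool; the example header is shortened to the first five columns.

```python
import re


def parse_spec_row(header: str, row: str) -> dict:
    """Map a pipe-delimited specification row onto its header names."""
    names = [cell.strip() for cell in header.strip().strip("|").split("|")]
    values = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


def parse_bandwidth(cell: str) -> tuple:
    """Split a cell such as '170 (85 × 2)' into (total, per_link, links)."""
    match = re.fullmatch(r"(\d+)\s*\((\d+)\s*[×x*]\s*(\d+)\)", cell.strip())
    if match is None:
        raise ValueError(f"unexpected bandwidth format: {cell!r}")
    total, per_link, links = (int(group) for group in match.groups())
    if per_link * links != total:
        raise ValueError("per-link bandwidth does not add up to the total")
    return total, per_link, links


if __name__ == "__main__":
    # First five columns of the ebmgn8v table, shortened for the example.
    header = "Instance type | vCPU | Memory (GiB) | GPU memory | Baseline network bandwidth (Gbit/s) |"
    row = "ecs.ebmgn8v.48xlarge | 192 | 1,024 | 96 GB × 8 | 170 (85 × 2) |"
    spec = parse_spec_row(header, row)
    print(spec["Instance type"], parse_bandwidth(spec["Baseline network bandwidth (Gbit/s)"]))
```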
GPU-accelerated ebmgn8ia instance family
This instance family is currently available only in select regions. To request access, contact your Alibaba Cloud sales representative.
Introduction: ebmgn8ia is Alibaba Cloud's eighth-generation accelerated computing instance family, delivered as elastic bare metal instances. It is designed for search and recommendation, simulation, and other GPU-intensive workloads that require a high vCPU-to-GPU ratio. Powered by the latest NVIDIA L20 GPUs, each instance is a bare metal server equipped with two high-frequency processors and four GPUs.
Features and use cases:
High frequency: These instances are equipped with two AMD EPYC™ Genoa 9T34 processors. Each processor has 64 physical cores, providing a total of 256 vCPUs per instance, with a frequency range of 3.4 GHz to 3.75 GHz. This significantly boosts single-core CPU performance, making it ideal for CAD modeling and accelerating preprocessing for CAE simulations.
Sparse resource ratio: Each GPU is paired with 64 vCPUs and 384 GiB of memory, delivering an average memory bandwidth of 230 GB/s per GPU. This configuration is suitable for GPU computing scenarios that require high I/O throughput, such as advertising, search, recommendation, traditional CAE simulations, and certain CPU-based rendering tasks in film production.
Powered by the latest CIPU 1.0:
Decouples compute and storage, allowing you to flexibly select storage resources. Compared to the previous generation, the inter-machine bandwidth of this instance type is increased to 160 Gbit/s for faster data transfer and processing.
The CIPU provides bare metal capabilities, enabling PCIe P2P communication between GPU instances, an improvement over traditional virtualized instances.
Compute:
Powered by new enterprise-class NVIDIA L20 GPUs:
Supports common acceleration features such as vGPU, RTX, and TensorRT.
Supports FP8 precision to improve computational efficiency.
NVIDIA L20 key specifications:
GPU architecture | GPU memory | Compute performance | Video encode/decode engines | GPU interconnect |
NVIDIA Ada Lovelace | Capacity: 48 GB; Bandwidth: 864 GB/s | FP64: N/A; FP32: 59.3 TFLOPS; FP16/BF16: 119 TFLOPS; FP8/INT8: 237 TFLOPS | 3 × Video Encoder (+AV1); 3 × Video Decoder; 4 × JPEG Decoder | PCIe interface: PCIe Gen4 x16; Bandwidth: 64 GB/s |
Processor: AMD EPYC™ Genoa 9T34 processors with a frequency range of 3.4 GHz to 3.75 GHz.
Storage:
-
These are I/O optimized instances.
-
These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
Supported cloud disks: elastic temporary disk, ESSD cloud disk, ESSD AutoPL cloud disk, and ESSD Zone-redundant cloud disk. For more information, see Block storage overview.
-
Network:
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Delivers ultra-high network performance of up to 30,000,000 PPS.
Supports Elastic RDMA Interface (ERI), which enables RDMA passthrough over VPC networks. ERI increases bandwidth to 160 Gbit/s and is ideal for training computer vision (CV) and traditional models.
Note: For instructions on using ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on a GPU-accelerated instance.
The following table lists the instance types and specifications for the ebmgn8ia instance family.
Instance type | vCPU | Memory (GiB) | GPU | GPU memory | Baseline network bandwidth (Gbit/s) | Network PPS | Private IPv4 addresses per ENI | IPv6 addresses per ENI | Queue number (Primary/Secondary ENI) | Elastic network interfaces | Max data disks | Max cloud disk bandwidth (GB/s) |
ecs.ebmgn8ia.64xlarge | 256 | 1536 | L20 × 4 | 48 GB × 4 | 160 (80 × 2) | 30,000,000 | 30 | 30 | 64/16 | 32 | 31 | 6 |
Instances of the ebmgn8ia instance family require an image that uses the UEFI boot mode. If you use a custom image, it must support UEFI boot, and its boot mode property must be set to UEFI. For more information, see Instance boot mode.
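The ebmgn8ia resource figures can be cross-checked with simple arithmetic. The Python sketch below assumes two hardware threads per physical core, which is what the stated two 64-core processors and 256 vCPUs imply, and divides the instance totals evenly across the four GPUs. The constant and function names are illustrative.

```python
SOCKETS = 2              # processors per instance
CORES_PER_SOCKET = 64    # physical cores per processor
THREADS_PER_CORE = 2     # assumption: inferred from the 256 vCPU total
MEMORY_GIB = 1536        # instance memory from the specification table
GPU_COUNT = 4            # NVIDIA L20 GPUs per instance


def total_vcpus() -> int:
    """vCPUs exposed by the instance: sockets x cores x threads."""
    return SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE


def per_gpu_share() -> tuple:
    """Return (vCPUs per GPU, GiB of memory per GPU)."""
    return total_vcpus() // GPU_COUNT, MEMORY_GIB // GPU_COUNT


if __name__ == "__main__":
    print(total_vcpus())     # 256
    print(per_gpu_share())   # (64, 384)
```

The result matches the 64 vCPUs and 384 GiB of memory per GPU listed under the sparse resource ratio.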
GPU compute-optimized elastic bare metal family ebmgn8is
This instance family is currently available only in select regions outside the Chinese mainland. For inquiries, contact your Alibaba Cloud sales representative.
Instance family overview: ebmgn8is is Alibaba Cloud's 8th-generation accelerated computing instance family. As an elastic bare metal instance family, it is built for the growing demand for AI-generated content (AIGC) workloads. Powered by the latest NVIDIA L20 GPUs, each instance is a bare metal host with eight GPU cards.
Features and positioning:
Graphics processing: These instances are powered by high-frequency 4th Generation Intel® Xeon® Scalable processors. They provide ample CPU computing power for 3D modeling, ensuring smoother graphics rendering and design.
Inference tasks: Each of the new NVIDIA L20 GPUs provides 48 GB of memory to accelerate inference tasks. These instances support the FP8 floating-point format. Combined with Alibaba Cloud Container Service for Kubernetes (ACK), they flexibly support inference on various AIGC models and are especially suitable for inference tasks on large language models (LLMs) with up to 70 billion parameters.
Training tasks: These instances offer cost-effective computing power, delivering double the FP32 compute performance compared to 7th-generation inference instances. They are particularly suitable for training computer vision (CV) models that use FP32, and for training other small to medium-sized models.
Use cases:
Use GRID images with GRID graphics drivers from the Cloud Marketplace to enable OpenGL and Direct3D graphics capabilities. This provides workstation-grade graphics processing for workloads such as animation, film and television special effects, and rendering.
When combined with the containerized management capabilities of ACK, these instances support AIGC graphics generation and large language model (LLM) inference (up to 130 billion parameters) more efficiently and cost-effectively.
Other general AI scenarios, such as image recognition and speech recognition.
Powered by the latest CIPU 1.0:
Features decoupled compute and storage, which allows you to flexibly select the required storage resources. Compared with the previous generation, the inter-instance bandwidth of this instance family is increased to 160 Gbit/s to enable faster data transfer and processing.
CIPU provides bare metal capabilities that enable PCIe P2P communication between GPUs, which is not supported by traditional virtualized instances.
Compute:
Powered by new enterprise-class NVIDIA L20 GPUs:
Supports common acceleration features such as vGPU, RTX, and TensorRT.
Uses a PCIe switch for interconnection. Compared with a direct-to-CPU connection, this design improves NCCL performance by 36% and can increase inference performance by up to 9% when you run sharded inference for large models across multiple cards.
Key NVIDIA L20 parameters:
GPU architecture | GPU memory | Compute performance | Video encoding/decoding | Inter-GPU interconnect |
NVIDIA Ada Lovelace | Capacity: 48 GB; Bandwidth: 864 GB/s | FP64: N/A; FP32: 59.3 TFLOPS; FP16/BF16: 119 TFLOPS; FP8/INT8: 237 TFLOPS | 3 × Video Encoder (+AV1); 3 × Video Decoder; 4 × JPEG Decoder | PCIe interface: PCIe Gen4 x16; Bandwidth: 64 GB/s |
Processor: Intel® Xeon® Scalable processors (SPR) with a 3.4 GHz base frequency and an all-core turbo frequency of up to 3.9 GHz.
Storage:
-
These are I/O optimized instances.
-
These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
Supported cloud disk types: elastic ephemeral disks, ESSDs, ESSD AutoPL disks, and zone-redundant ESSDs. For more information about cloud disks, see Block storage overview.
-
Network:
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Delivers ultra-high network performance with a packet forwarding rate of 30 million PPS.
Supports ERI (Elastic RDMA Interface), which enables RDMA passthrough acceleration in a VPC network. ERI increases bandwidth to 160 Gbit/s and is ideal for CV and traditional model training workloads.
Note: To use ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA for a GPU instance.
The following table describes the instance types and specifications of the ebmgn8is instance family.
Instance type | vCPUs | Memory (GiB) | GPU | GPU memory | Base bandwidth (Gbit/s) | Packet forwarding rate (PPS) | Private IPv4 addresses | IPv6 addresses | Queue pairs (primary/secondary ENI) | ENIs | Attachable data disks | Cloud disk bandwidth (GB/s) |
ecs.ebmgn8is.32xlarge | 128 | 1024 | L20 × 8 | 48 GB × 8 | 160 (80 × 2) | 30,000,000 | 30 | 30 | 64/16 | 32 | 31 | 6 |
The image boot mode for ebmgn8is instance types must be UEFI. If you use a custom image, you must ensure that it supports UEFI and that its boot mode attribute is set to UEFI. For more information, see Instance boot modes.
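To confirm from inside a running Linux instance that it booted in UEFI mode, one common check is whether the kernel exposes the `/sys/firmware/efi` directory. The Python sketch below relies on that Linux convention only. It does not read the boot mode property of the image in ECS, and the function name is illustrative.

```python
import os


def booted_in_uefi(efi_path: str = "/sys/firmware/efi") -> bool:
    """Return True if the Linux kernel exposes EFI firmware data."""
    return os.path.isdir(efi_path)


if __name__ == "__main__":
    print("UEFI" if booted_in_uefi() else "Legacy BIOS, or not a Linux system")
```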
GPU compute-optimized ECS bare metal instance family ebmgn7ex
Instance family description: The ebmgn7ex instance family provides high-bandwidth instances for large-scale AI training. Built on the fourth-generation SHENLONG architecture and Alibaba Cloud's new CIPU architecture, ebmgn7ex instances use an eRDMA network to interconnect multiple bare metal hosts, enabling RDMA communication with up to 160 Gbit/s of interconnect bandwidth. After enabling eRDMA, you can elastically scale instances in your cluster for large-scale AI training.
Use cases:
Various deep learning training and development workloads.
HPC-accelerated computing and simulation.
Important: When you run AI training workloads with high communication loads, such as those involving Transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, large-scale data transfers over the PCIe link may cause unexpected failures and data corruption. If you are unsure about the communication link topology for your training workload, submit a ticket for support from Alibaba Cloud technical experts.
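One way to see which GPUs are linked over NVLink is to inspect the matrix printed by `nvidia-smi topo -m`, where NVLink connections appear as entries such as `NV12` and PCIe-only paths as entries such as `PIX` or `SYS`. The Python sketch below parses that text. The exact layout varies between driver versions, so treat the parsing as an assumption to verify on your instance; the function names are illustrative.

```python
import re
import subprocess


def gpus_with_nvlink(topo_text: str) -> list:
    """Return indices of GPUs whose topology row contains an NV# entry."""
    linked = []
    for line in topo_text.splitlines():
        # Data rows start with a GPU label followed by one cell per peer.
        row = re.match(r"\s*GPU(\d+)\s+(.*)", line)
        if row and any(re.fullmatch(r"NV\d+", cell) for cell in row.group(2).split()):
            linked.append(int(row.group(1)))
    return sorted(linked)


def read_topology() -> str:
    """Run nvidia-smi on the local machine; requires the NVIDIA driver."""
    result = subprocess.run(
        ["nvidia-smi", "topo", "-m"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a GPU instance, `gpus_with_nvlink(read_topology())` lists the GPUs that report at least one NVLink peer. An empty list suggests that traffic between GPUs would travel over PCIe.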
Compute:
Processor: 3rd Generation Intel® Xeon® Scalable processor (Ice Lake) with a base frequency of 2.9 GHz, an all-core turbo frequency of 3.5 GHz, and PCIe 4.0 support.
Storage:
-
These are I/O optimized instances.
-
Supported disk types: elastic ephemeral disk, ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block Storage Overview.
-
Network:
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Supports physical NICs.
Delivers ultra-high network performance with a packet forwarding rate of up to 24 million PPS.
Supports Elastic RDMA Interface (ERI), which enables RDMA pass-through for accelerated communication within a VPC network. You can attach two ERIs to an instance. If each ERI is connected to a different network card index, the instance achieves a network bandwidth of 160 Gbit/s. If all ERIs are connected to the same network card index, the instance reaches a network bandwidth of up to 100 Gbit/s. For more information, see AttachNetworkInterface.
Note: For instructions on how to use ERIs, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on GPU instances.
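The bandwidth rule for ERIs on ebmgn7ex can be written as a small decision function. This Python sketch only encodes the rule as stated in this topic: 160 Gbit/s when the two ERIs use different network card indexes, and up to 100 Gbit/s when they share one. It assumes at most two ERIs per instance and treats a single ERI like the shared-index case. The function name is hypothetical, and the code does not call any ECS API.

```python
def expected_eri_bandwidth_gbit(network_card_indexes: list) -> int:
    """Expected upper bound on ERI bandwidth (Gbit/s) for ebmgn7ex.

    network_card_indexes holds the network card index chosen for each
    attached ERI.
    """
    if not network_card_indexes:
        return 0  # no ERI attached, so no eRDMA bandwidth
    if len(network_card_indexes) > 2:
        raise ValueError("this sketch assumes at most two ERIs per instance")
    if len(set(network_card_indexes)) == 2:
        return 160  # each ERI sits on its own network card index
    return 100      # all ERIs share one network card index


if __name__ == "__main__":
    print(expected_eri_bandwidth_gbit([0, 1]))  # 160
    print(expected_eri_bandwidth_gbit([0, 0]))  # 100
```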
The following table lists the instance types and specifications for the ebmgn7ex instance family.
Instance type | vCPUs | Memory (GiB) | GPU memory | Network bandwidth (Gbit/s) | Packet rate (PPS) | Private IPv4s per NIC | IPv6s per NIC | Physical NICs | Multi-queue (primary/secondary) | ENIs |
ecs.ebmgn7ex.32xlarge | 128 | 1024 | 80 GB × 8 | 160 (80 × 2) | 24 million | 30 | 30 | 2 | 32/32 | 16 |
ebmgn7ex instances require an image that uses UEFI boot mode. If you use a custom image, ensure it supports UEFI boot mode and that its boot mode property is set to UEFI. For detailed steps, see Instance boot mode.
ebmgn7e instance family
Instance family description: ebmgn7e is an instance family built on the X-Dragon architecture. It combines powerful hardware performance with software-defined flexibility and elasticity.
Use cases:
Deep learning training and development.
HPC-accelerated computing and simulations.
Important: When you run AI training workloads with high communication loads, such as those involving Transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, large-scale data transfers over the PCIe link may cause unexpected failures and data corruption. If you are unsure about the communication link topology for your training workload, submit a ticket for support from Alibaba Cloud technical experts.
Compute:
Processor: Features an Intel® Xeon® Scalable processor with a base frequency of 2.9 GHz, an all-core turbo frequency of 3.5 GHz, and support for PCIe 4.0.
Storage:
-
These are I/O optimized instances.
Supported cloud disk types: ESSD cloud disk, ESSD AutoPL cloud disk, and ESSD Zone-redundant cloud disk. For more information about cloud disks, see Block storage overview.
-
Network:
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Achieves a packet forwarding rate of up to 24 million PPS.
The following table lists the instance types and specifications for the ebmgn7e instance family.
Instance type | vCPU | Memory (GiB) | GPU memory | Network bandwidth (Gbit/s) | Network PPS | Multi-queue (primary/secondary) | ENIs | Private IPv4 addresses | IPv6 addresses |
ecs.ebmgn7e.32xlarge | 128 | 1024 | 80 GB × 8 | 64 | 24 million | 32/12 | 32 | 10 | 1 |
After you start an ebmgn7e instance, check whether the Multi-Instance GPU (MIG) feature is enabled, because its default state is not guaranteed. For more information about MIG, see NVIDIA Multi-Instance GPU User Guide.
The following table describes whether ebmgn7e instances support the MIG feature.
Instance type | MIG feature support | Description |
ecs.ebmgn7e.32xlarge | Yes | You can enable the MIG feature on ebmgn7e bare metal instances. |
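Because the default MIG state is not guaranteed, it can help to script the check. The sketch below assumes that `nvidia-smi` is available on the instance and uses its `mig.mode.current` query field. The function names are hypothetical, and the parser expects the headerless CSV format requested in the command.

```python
import subprocess


def query_mig_modes() -> str:
    """Ask nvidia-smi for each GPU's current MIG mode as headerless CSV."""
    return subprocess.run(
        ["nvidia-smi", "--query-gpu=index,mig.mode.current",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout


def parse_mig_modes(csv_text: str) -> dict[int, str]:
    """Map GPU index to the reported MIG mode, e.g. 'Enabled' or 'Disabled'."""
    modes = {}
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        index, mode = (field.strip() for field in line.split(",", 1))
        modes[int(index)] = mode
    return modes
```

Calling `parse_mig_modes(query_mig_modes())` on the instance returns one entry per GPU. To change the mode, the NVIDIA guide describes `nvidia-smi -i <GPU index> -mig 1` (or `-mig 0`), which typically requires root privileges and may need a GPU reset before it takes effect.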
GPU-accelerated elastic bare metal instance family ebmgn7ix
Introduction:
ebmgn7ix is a new Elastic Bare Metal Server instance type family from Alibaba Cloud, launched to support the rapid growth of AI content generation workloads. Each instance is a bare metal host equipped with eight NVIDIA A10 GPUs.
Powered by the latest CIPU 1.0 cloud processor, this instance type family decouples compute and storage, allowing you to flexibly select storage resources. Compared to the previous generation, the inter-instance bandwidth is increased to 160 Gbit/s, enabling faster data transfer and processing for small-scale, multi-machine training workloads.
This instance type family provides bare metal capabilities. Unlike traditional virtualized instances, it supports peer-to-peer (P2P) communication between GPU instances, significantly improving multi-GPU computing efficiency.
Use cases:
Use GRID images from Alibaba Cloud Marketplace to activate the graphics capabilities of A10 GPUs. This provides efficient graphics processing for animation, film and television VFX, and rendering.
You can combine this instance type family with ACK for container management to efficiently and cost-effectively support AIGC graphics generation and LLM inference (up to 130 billion parameters).
Other general AI workloads, such as image and speech recognition.
Compute:
Powered by NVIDIA A10 GPUs:
Innovative Ampere architecture.
Supports common acceleration features such as vGPU, RTX, and TensorRT.
Processor: Intel® Xeon® Scalable processors (Ice Lake) with a base frequency of 2.9 GHz and an all-core turbo frequency of 3.5 GHz.
Storage:
-
These are I/O optimized instances.
Supported cloud disk types: ESSD, ESSD AutoPL, and ESSD Zone-redundant. For more information about cloud disks, see Block storage overview.
-
Network:
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Delivers ultra-high network performance with a packet forwarding rate of up to 24 million PPS.
Supports Elastic RDMA Interface (ERI). This interface enables RDMA pass-through for accelerated interconnectivity within a VPC network, increasing bandwidth to 160 Gbit/s.
Note: To use ERI, see Enable on enterprise-level instance or Enable eRDMA on GPU instances.
The following table describes the instance types and specifications of the ebmgn7ix instance type family.
Instance type | vCPU | Memory (GiB) | GPU | Network bandwidth (Gbit/s) | Packet rate (PPS) | Private IPv4 addresses | IPv6 addresses | Multi-queue | ENIs |
ecs.ebmgn7ix.32xlarge | 128 | 512 | NVIDIA A10 × 8 | 160 | 24,000,000 | 30 | 30 | 32/32 | 16 |
Launch ebmgn7ix instances from images that use the UEFI boot mode. If you use a custom image, ensure that it supports UEFI boot mode and that the boot mode property of the image is set to UEFI. For more information, see Instance boot mode.
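Before you build a custom image from an existing Linux system, you may want to confirm that the system itself booted in UEFI mode. The following is a minimal sketch that relies on the Linux convention that `/sys/firmware/efi` exists only on UEFI-booted systems. The function name is hypothetical, and the check covers the running operating system only, not the boot mode property of the image in ECS, which you still need to set as described in Instance boot mode.

```python
import os


def booted_in_uefi_mode(efi_dir: str = "/sys/firmware/efi") -> bool:
    """Return True if the running Linux system exposes the EFI sysfs directory."""
    return os.path.isdir(efi_dir)
```

A result of `False` from `booted_in_uefi_mode()` suggests that the system booted in legacy BIOS mode, so an image made from it is unlikely to start on ebmgn7ix without conversion.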
GPU-accelerated ebmgn7i instance family
Introduction: The ebmgn7i is an instance family built on the X-Dragon architecture. These instances provide software-defined hardware computing, delivering both elasticity and powerful performance.
Use cases:
Equipped with high-performance CPUs, memory, and GPUs, these instances can handle high-volume concurrent AI inference workloads, making them ideal for image, speech, and behavior recognition.
RTX support, combined with high-frequency CPUs, delivers high-performance 3D graphics virtualization ideal for intensive graphics processing workloads like remote graphic design and cloud gaming.
RTX support, combined with high network and cloud disk bandwidth, makes these instances ideal for building high-performance rendering farms.
With multiple GPUs and high network bandwidth, these instances support small-scale deep learning training workloads.
Compute:
Powered by NVIDIA A10 GPUs:
Innovative Ampere architecture.
Supports common acceleration features such as vGPU, RTX, and TensorRT.
Processor: Intel® Xeon® Scalable processors (Ice Lake) with a 2.9 GHz base frequency and a 3.5 GHz all-core turbo frequency.
Storage:
-
These are I/O optimized instances.
Supported cloud disk types: ESSD, ESSD AutoPL, and ESSD Zone-redundant. For more information on block storage, see Block storage overview.
-
Network:
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Delivers ultra-high network performance with up to 24 million network PPS.
The following table lists the instance types and specifications for the ebmgn7i instance family.
Instance type | vCPU | Memory (GiB) | GPU | GPU memory | Network bandwidth (Gbit/s) | Network PPS | Multi-queue | Elastic network interfaces | Private IPv4 addresses | IPv6 addresses |
ecs.ebmgn7i.32xlarge | 128 | 768 | NVIDIA A10 × 4 | 24 GB × 4 | 64 | 24 million | 32 | 32 | 10 | 1 |
GPU-accelerated ebmgn7 instance family
Introduction: Based on the X-Dragon Architecture, the ebmgn7 instance type family features software-defined hardware, offering both flexibility and powerful computing performance.
Use cases:
Deep learning training workloads, such as image classification, autonomous driving, and speech recognition.
GPU-intensive scientific computing, such as computational fluid dynamics (CFD), computational finance, molecular dynamics, and environmental analysis.
Compute:
Processor: 2.5 GHz Intel® Xeon® Platinum 8269CY (Cascade Lake) processors.
Storage:
-
These are I/O optimized instances.
Supported cloud disk types: ESSD, ESSD AutoPL, and ESSD Zone-redundant. For more information about cloud disks, see the Block Storage Overview.
-
Network:
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Network performance scales with the instance type.
The following table lists the instance types and specifications for the ebmgn7 instance type family.
Instance type | vCPU | Memory (GiB) | GPU memory | Baseline bandwidth (Gbit/s) | PPS | Multi-queue | ENIs | IPv4s per ENI | IPv6s per ENI |
ecs.ebmgn7.26xlarge | 104 | 768 | 40 GB × 8 | 30 | 18,000,000 | 16 | 15 | 10 | 1 |
The Multi-Instance GPU (MIG) feature may be either enabled or disabled by default when an ebmgn7 instance starts. You must check its status and enable or disable it as needed. For more information about MIG, see the NVIDIA Multi-Instance GPU User Guide.
The following table describes MIG support for ebmgn7 instances.
Instance type | MIG supported | Description |
ecs.ebmgn7.26xlarge | Yes | These instances support the MIG feature. |
ebmgn6e GPU-accelerated instance family
Introduction:
The ebmgn6e instance family, built on the X-Dragon architecture, delivers software-defined hardware computing that combines flexibility, elasticity, and powerful performance.
These instances are powered by NVIDIA V100 (32 GB NVLink) GPU compute cards.
The GPU accelerator is the V100 (SXM2 package), which has the following features:
Innovative Volta architecture.
32 GB of HBM2 GPU memory per GPU (with a GPU memory bandwidth of 900 GB/s).
5,120 CUDA Cores per GPU.
640 Tensor Cores per GPU.
Each GPU supports six bidirectional NVLink links. Each unidirectional link provides 25 GB/s of bandwidth, totaling 300 GB/s (6 × 25 × 2).
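The 300 GB/s figure is the product of the link count, the per-direction bandwidth, and the two directions of each link. As a small illustration (the function name is hypothetical), the arithmetic can be written as:

```python
def nvlink_total_bandwidth_gbs(links: int, per_direction_gbs: float) -> float:
    """Aggregate bidirectional NVLink bandwidth per GPU, in GB/s."""
    directions = 2  # each link carries traffic both ways
    return links * per_direction_gbs * directions


print(nvlink_total_bandwidth_gbs(links=6, per_direction_gbs=25))  # 300
```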
Use cases:
Deep learning, such as training and inference for AI algorithms like image classification, autonomous driving, and speech recognition.
Scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis.
Compute:
vCPU-to-memory ratio of 1:8.
Processors: Intel® Xeon® Platinum 8163 (Skylake) with a base frequency of 2.5 GHz.
Storage:
-
These are I/O optimized instances.
Supported cloud disk types: ESSDs, ESSD AutoPLs, Zone-redundant ESSDs, SSD cloud disks, and ultra disks. For more information, see Block storage overview.
-
Network:
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Network performance scales with the instance type.
The following table lists the instance types and specifications for the ebmgn6e instance family.
Instance type | vCPU | Memory (GiB) | GPU | GPU memory | Baseline bandwidth (Gbit/s) | PPS | Multi-queue | ENIs | Private IPv4 addresses | IPv6 addresses |
ecs.ebmgn6e.24xlarge | 96 | 768 | NVIDIA V100 × 8 | 32 GB × 8 | 32 | 4,800,000 | 16 | 15 | 10 | 1 |
ebmgn6v instance family
Instance family description:
The ebmgn6v instance family is built on the X-Dragon architecture. These instances offer high-performance, software-defined hardware computing with cloud elasticity.
Powered by NVIDIA V100 GPUs.
The V100 (SXM2 package) GPU accelerator features the following:
Innovative Volta architecture
16 GB of HBM2 GPU memory per GPU, with a memory bandwidth of 900 GB/s
5,120 CUDA Cores per GPU
640 Tensor Cores per GPU
Each GPU supports six NVLink bidirectional links, with a unidirectional bandwidth of 25 GB/s per link, for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300 GB/s)
Use cases:
Deep learning training and inference workloads, such as image classification, autonomous driving, and speech recognition
Scientific computing workloads, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis
Compute:
vCPU-to-memory ratio of 1:4.
Processor: Intel® Xeon® Platinum 8163 (Skylake) with a 2.5 GHz base frequency.
Storage:
-
These are I/O optimized instances.
Supported cloud disk types: ESSD, ESSD AutoPL, Zone-redundant ESSD, standard SSD, and Ultra Disk. For more information, see Block Storage overview.
-
Network:
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Network performance scales with the instance type.
The following table lists the instance types and specifications for the ebmgn6v instance family.
Instance type | vCPU | Memory (GiB) | GPU | GPU memory | Network bandwidth (Gbit/s) | Network PPS | Multi-queue | ENIs | IPv4s per ENI | IPv6s per ENI |
ecs.ebmgn6v.24xlarge | 96 | 384 | NVIDIA V100 × 8 | 16 GB × 8 | 30 | 4,500,000 | 8 | 32 | 10 | 1 |
GPU compute instance family ebmgn6i
Introduction
ebmgn6i is an instance family built on the X-Dragon architecture. It delivers software-defined hardware computing that combines flexibility and elasticity with powerful performance.
These instances are powered by NVIDIA T4 GPU accelerators with the following features:
Innovative Turing architecture
16 GB of GPU memory per GPU with a memory bandwidth of 320 GB/s
2,560 CUDA Cores per GPU
Up to 320 Turing Tensor Cores per GPU
Variable-precision Tensor Cores deliver 65 TFLOPS of FP16 performance, 130 INT8 TOPS, and 260 INT4 TOPS.
Use cases
AI (DL/ML) inference, including applications such as computer vision, speech recognition, speech synthesis, NLP, machine translation, and recommendation systems.
Real-time cloud rendering for cloud gaming.
Real-time cloud rendering for AR/VR.
Compute-intensive graphics workloads or graphics workstations.
GPU-accelerated databases.
High-performance computing.
Compute
vCPU-to-memory ratio of 1:4.
Processor: Intel® Xeon® Platinum 8163 (Skylake) with a base frequency of 2.5 GHz.
Storage
-
These are I/O optimized instances.
Supported cloud disk types: ESSD cloud disk, ESSD AutoPL cloud disk, ESSD Zone-redundant cloud disk, SSD cloud disk, and Ultra Disk. For more information, see Block Storage Overview.
-
Network
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
Network performance scales with the instance type.
The following table lists the instance types and specifications for the ebmgn6i instance family.
Instance type | vCPU | Memory (GiB) | GPU | GPU memory | Network bandwidth (Gbit/s) | Forwarding rate (PPS) | Multiqueue | ENIs | Private IPv4 addresses | IPv6 addresses |
ecs.ebmgn6i.24xlarge | 96 | 384 | NVIDIA T4 × 4 | 16 GB × 4 | 30 | 4.5 million | 8 | 32 | 10 | 1 |
sccgn7ex, GPU-accelerated compute-optimized SCC instance family
-
Introduction: The sccgn7ex family provides high-bandwidth SCC instances. Alibaba Cloud developed this family to meet the growing demand for large-scale AI training. Multiple bare metal servers are interconnected through a third-generation RDMA SCC network that supports an interconnection bandwidth of 800 Gbit/s. You can scale the number of clusters based on your training needs to keep pace with the training of AI models that have large parameter counts.
-
Scenarios: Ultra-large-scale AI training.
-
Compute:
-
These instances support NVSwitch and provide up to 312 TFLOPS of computing power (TF32).
-
The processor-to-memory ratio is 1:8.
-
Processor: Third-generation Intel® Xeon® 8369 scalable processors (Ice Lake). These processors have a base frequency of 2.9 GHz, an all-core turbo frequency of 3.5 GHz, and support the PCIe 4.0 interface.
-
-
Storage:
-
I/O optimized instance
-
Supported disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Elastic Block Storage Overview.
-
-
Network:
-
These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
-
Supports only VPC.
-
High network performance with a packet forwarding rate of 24 million PPS.
-
sccgn7ex instances support an interconnection bandwidth of 800 Gbit/s (4 × dual-port 100 Gbit/s RDMA). The instances support GPUDirect. Each GPU is directly connected to a 100 Gbit/s network port.
-
The following table lists the instance types and specifications for the sccgn7ex family.
Instance type | vCPU | Memory (GiB) | GPU memory | Base network bandwidth (Gbit/s) | Packet forwarding rate (PPS) | RoCE network (Gbit/s) | Elastic Network Interfaces (ENIs) |
ecs.sccgn7ex.32xlarge | 128 | 1024 | 80 GB × 8 | 64 | 24 million | 800 | 15 |
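The 800 Gbit/s RoCE figure follows from the network description for this family: four dual-port RDMA adapters at 100 Gbit/s per port, which also gives one port per GPU on an eight-GPU instance. The short check below only restates that arithmetic. The constant names are illustrative, and the GPU count is taken from the GPU memory column.

```python
RDMA_NICS = 4        # "4 × dual-port" in the network description
PORTS_PER_NIC = 2    # dual-port adapters
PORT_GBITS = 100     # Gbit/s per port
GPUS = 8             # from the "80 GB × 8" GPU memory entry

total_ports = RDMA_NICS * PORTS_PER_NIC
total_gbits = total_ports * PORT_GBITS   # aggregate RoCE bandwidth
ports_per_gpu = total_ports / GPUS       # GPUDirect: one port per GPU

print(total_gbits, ports_per_gpu)
```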