GPU compute-optimized: gn, ebm, and scc - Elastic GPU Service (EGS) - Alibaba Cloud Help Center

gn9gc, GPU-accelerated compute-optimized instance family

Note

gn9gc is in invitational preview. To use gn9gc, submit a ticket.

Overview: gn9gc is Alibaba Cloud's 9th-generation cost-effective GPU cloud server instance family. It uses the latest-generation CIPU 2.0 to deliver cloud service capabilities, features high clock speed processors, and is configured with appropriate memory capacity. This instance family provides cost-effective instances for large language model (LLM) generation scenarios and video/image generation scenarios. The GPU can also directly provide graphics processing capabilities to support various rendering workloads.
Use cases:
- LLM inference: The new-generation GPU delivers compute power beyond the 8th generation with significantly improved memory bandwidth. Newly supported FP4 compute comprehensively improves inference performance and cost-effectiveness. Multi-GPU parallel inference efficiency is greatly enhanced.

Compute:

Uses the latest CIPU 2.0 cloud processor.
- The 2nd-generation CIPU provides higher cloud processing power with enhanced eRDMA, VPC, and EBS component capabilities. Supports containers (including but not limited to Docker, Clear Containers, and PouchContainer).
Uses the new Blackwell architecture professional graphics card:
- Supports OpenGL professional-grade graphics processing.
- Supports RTX, TensorRT, and other common acceleration features, with newly upgraded FP4 support and PCIe Gen5 interconnect.

Key GPU specifications:

GPU architecture	GPU memory	Computing performance	Video encoding/decoding	Inter-GPU interconnect	Acceleration APIs
NVIDIA Blackwell	Capacity: 72 GB Bandwidth: 1,344 GB/s	TF32: 126 TFLOPS FP32: 52 TFLOPS FP16/BF16: 266 TFLOPS FP8/INT8: 530 TFLOPS FP4: 970 TFLOPS RT Core: 196 TFLOPS	3 x Video Encoder 3 x Video Decoder	PCIe interface: PCIe Gen5 x16 Bandwidth: 128 GB/s, P2P supported	DX12, OpenGL 4.6, Vulkan 1.3, CUDA 12.8, OpenCL 3.0, DirectCompute

Storage:
- I/O optimized.
- Supports the NVMe protocol. For more information, see NVMe protocol.
- Supported cloud disk types: elastic ephemeral disks, ESSDs, ESSD AutoPL disks, and regional ESSDs. For more information, see Block storage overview.
Network:
- Supports IPv4 and IPv6. For more information about IPv6, see IPv6.
- Ultra-high network performance with up to 30 million PPS (8-GPU instances).
- Supports ERI (Elastic RDMA Interface) for RDMA direct acceleration over VPC networks, with bandwidth up to 360 Gbit/s. Suitable for autonomous driving, embodied intelligence, computer vision, and traditional model training workloads.
- Note
  For more information about ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on a GPU-accelerated instance.

The following table describes the instance types in the gn9gc instance family.

Instance type	vCPUs	Memory (GiB)	GPU memory	Baseline/burst bandwidth (Gbit/s)	Packet forwarding rate (pps)	IPv4 addresses per ENI	IPv6 addresses per ENI	NIC queues (primary/secondary)	ENIs	Max data disks	Max disk bandwidth (GB/s)
ecs.gn9gc.4xlarge	16	128	72 GB × 1	16	3.6 million	30	30	8/32	8	1	1
ecs.gn9gc.8xlarge	32	192	72 GB × 1	32	7.5 million	30	30	16/64	8	1	1
ecs.gn9gc-2x.16xlarge	64	384	72 GB × 2	65	15 million	30	30	32/64	15	2	2
ecs.gn9gc-4x.32xlarge	128	768	72 GB × 4	131	30 million	50	50	64/64	15	4	4
ecs.gn9gc-8x.64xlarge	256	1,536	72 GB × 8	204	30 million	50	50	128/64	15	6	6

Note

Images used for gn9gc instances must be in the UEFI boot mode. If you want to use a custom image, make sure that the custom image supports UEFI boot mode and that the boot mode attribute of the image is set to UEFI. For more information, see Set the boot mode of a custom image to UEFI by calling API operations.
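
For a Linux custom image, one way to confirm that the source system actually booted in UEFI mode before you create the image is to look for the `/sys/firmware/efi` directory, which the Linux kernel exposes only on UEFI boots. The sketch below is a minimal check under that assumption. The function name `booted_in_uefi` and its `efi_dir` parameter are illustrative and not part of any Alibaba Cloud tool. The check does not set the boot mode attribute of the image, which still has to be done as described in the linked topic.

```python
import os


def booted_in_uefi(efi_dir: str = "/sys/firmware/efi") -> bool:
    """Return True if the running Linux system appears to have booted via UEFI.

    Assumption: the kernel creates /sys/firmware/efi only on UEFI boots.
    The efi_dir parameter exists so the check can be pointed at another
    path when testing.
    """
    return os.path.isdir(efi_dir)


if __name__ == "__main__":
    mode = "UEFI" if booted_in_uefi() else "legacy BIOS (or not Linux)"
    print(f"Detected boot mode: {mode}")
```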

gn8v and gn8v-tee, GPU-accelerated compute-optimized instance family

These instance families are available in select regions, including those outside the Chinese mainland. To use them, contact your Alibaba Cloud sales representative.

Introduction:
- gn8v: An 8th-generation GPU-accelerated compute-optimized instance family from Alibaba Cloud for AI model training and inference on ultra-large language models (LLMs). This family provides instance types with one, two, four, or eight GPUs for various application requirements.
- gn8v-tee: To enhance security for large model training and inference, Alibaba Cloud offers gn8v-tee, an 8th-generation instance family based on gn8v with a confidential computing feature. These instances encrypt data during GPU computation to protect your data.
Use cases:
- Cost-effective for multi-GPU parallel inference on LLMs with more than 70 billion parameters.
- Each GPU provides 39.5 TFLOPS of FP32 compute power and delivers outstanding performance for traditional AI model training and autonomous driving training workloads.
- The eight GPUs support NVLink interconnectivity and are suitable for training small- to medium-sized models.
Features:
- High-speed, large-capacity GPU memory: Each GPU is equipped with 96 GB of HBM3 GPU memory and provides up to 4 TB/s of memory bandwidth, significantly accelerating model training and inference.
- High inter-GPU bandwidth: Multiple GPUs are interconnected with NVLink at 900 GB/s. This enables much higher efficiency for multi-GPU training and inference compared to previous-generation GPU instances.
- LLM quantization: Supports FP8 compute power, which optimizes performance for large-scale parameter training and inference. This significantly improves training and inference speeds and reduces GPU memory usage.
- (For gn8v-tee instances only) High security: Supports both CPU confidential computing with Intel® Trust Domain Extensions (TDX) and GPU confidential computing with NVIDIA Confidential Computing (CC). This provides end-to-end confidential computing for the entire model inference pipeline, protecting your inference data and enterprise models during model training and inference.
Compute:
- Powered by the latest CIPU 1.0.
  - Decouples compute from storage, letting you flexibly select the storage resources you need.
  - Provides bare metal capabilities, which support peer-to-peer (P2P) communication between GPU instances, unlike traditional virtualized instances.
- Powered by 4th-generation Intel® Xeon® Scalable processors with a base frequency of up to 2.8 GHz and an all-core turbo frequency of up to 3.1 GHz.
Storage:
- I/O-optimized instance.
- These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
- Supported cloud disk types: elastic ephemeral disks, ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information about cloud disks, see Block storage overview.
Network:
- Supports IPv4 and IPv6. For more information about IPv6 communication, see IPv6 communication.
- These instances support jumbo frames. For more information, see Jumbo frames.
- Delivers ultra-high network performance with a packet forwarding rate of up to 30 million pps (on 8-GPU instances).
- Supports elastic RDMA interface (ERI).
- Note
  For information about how to use ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on a GPU-accelerated instance.
Security: Supports the trusted computing feature (vTPM). This feature is available on gn8v instances but not on gn8v-tee instances. For more information, see Overview of trusted computing capabilities.

The following table describes the instance types in the gn8v family.

Instance type	vCPUs	Memory (GiB)	GPU memory	Network bandwidth (Gbit/s)	ENIs	Primary ENI queues	IPv4 addresses per ENI	IPv6 addresses per ENI	Max cloud disks	Baseline IOPS	Baseline bandwidth (GB/s)
ecs.gn8v.4xlarge	16	96	96 GB × 1	12	8	16	30	30	17	100,000	0.75
ecs.gn8v.6xlarge	24	128	96 GB × 1	15	8	24	30	30	17	120,000	0.937
ecs.gn8v-2x.8xlarge	32	192	96 GB × 2	20	8	32	30	30	25	200,000	1.25
ecs.gn8v-4x.8xlarge	32	384	96 GB × 4	20	8	32	30	30	25	200,000	1.25
ecs.gn8v-2x.12xlarge	48	256	96 GB × 2	25	8	48	30	30	33	300,000	1.50
ecs.gn8v-8x.16xlarge	64	768	96 GB × 8	32	8	64	30	30	33	360,000	2.5
ecs.gn8v-4x.24xlarge	96	512	96 GB × 4	50	15	64	30	30	49	500,000	3
ecs.gn8v-8x.48xlarge	192	1024	96 GB × 8	100	15	64	50	50	65	1,000,000	6

The following table describes the instance types in the gn8v-tee family.

Instance type	vCPUs	Memory (GiB)	GPU memory	Network bandwidth (Gbit/s)	ENIs	Primary ENI queues	IPv4 addresses per ENI	IPv6 addresses per ENI	Max cloud disks	Baseline IOPS	Baseline bandwidth (GB/s)
ecs.gn8v-tee.4xlarge	16	96	96 GB × 1	12	8	16	30	30	17	100,000	0.75
ecs.gn8v-tee.6xlarge	24	128	96 GB × 1	15	8	24	30	30	17	120,000	0.937
ecs.gn8v-tee-8x.16xlarge	64	768	96 GB × 8	32	8	64	30	30	33	360,000	2.5
ecs.gn8v-tee-8x.48xlarge	192	1024	96 GB × 8	100	15	64	50	50	65	1,000,000	6

Note

The gn8v-tee instance family supports only Alibaba Cloud Linux 3 images. If you use a custom image built on Alibaba Cloud Linux 3 to create an instance, ensure the kernel version is 5.10.134-18 or later.
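
The kernel requirement above can be checked before you build a custom image. The following sketch compares a kernel release string against 5.10.134-18 numerically, field by field. It assumes the release string starts with `major.minor.patch-build` (for example `5.10.134-18.al8.x86_64`, an illustrative value). The function names are hypothetical.

```python
import re

MIN_KERNEL = (5, 10, 134, 18)  # from the note above: 5.10.134-18 or later


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor, patch, build) from a kernel release string.

    Raises ValueError if the string does not start with 'a.b.c-d'.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def kernel_meets_minimum(release: str, minimum: tuple = MIN_KERNEL) -> bool:
    """Compare numerically, so that 5.10.134-9 sorts before 5.10.134-18."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    import platform

    release = platform.release()  # the same string that `uname -r` prints
    try:
        verdict = "OK" if kernel_meets_minimum(release) else "too old"
        print(release, verdict)
    except ValueError as err:
        print(err)
```

A plain string comparison would rank `5.10.134-9` above `5.10.134-18`, which is why the fields are converted to integers first.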

gn8is, GPU-accelerated compute-optimized instance family

This instance family is available in select regions, including those outside the Chinese mainland. To use this instance family, contact your Alibaba Cloud sales representative.

Introduction: gn8is is Alibaba Cloud's eighth-generation GPU-accelerated compute-optimized instance family, designed for the growing demands of AI-generated content (AIGC). Powered by the latest NVIDIA L20 GPUs, this family offers instance types with one, two, four, or eight GPUs, and various CPU-to-GPU ratios to meet diverse application needs.
Features:
- Graphics processing: Powered by 4th-generation Intel® Xeon® Scalable high-frequency processors, these instances provide robust CPU compute power for 3D modeling scenarios, ensuring smoother graphics rendering and design workflows.
- Inference tasks: Equipped with the new NVIDIA L20 GPU, each with 48 GB of GPU memory, these instances accelerate inference tasks. They support the FP8 floating-point format and can be paired with Container Service for Kubernetes (ACK) to flexibly run inference for various AIGC models. They are especially suitable for inference tasks on large language models (LLMs) with fewer than 70 billion parameters.
Use cases:
- Use GRID drivers with images from Alibaba Cloud Marketplace to enable OpenGL and Direct3D capabilities. This provides workstation-grade graphics processing for workloads such as animation, film and television special effects, and rendering.
- Use the container management capabilities of Container Service for Kubernetes (ACK) for more efficient and cost-effective AIGC image generation and LLM inference.
- Other general-purpose AI applications, such as image recognition and speech recognition.

Compute:

Powered by the new NVIDIA L20 enterprise-grade GPUs.
- Supports common acceleration features such as TensorRT and the FP8 floating-point format to improve model inference performance.
- Up to 48 GB of GPU memory per GPU. With multiple GPUs, instances in this family support single-instance inference for models with 70 billion or more parameters.
- Enhanced graphics processing capabilities. After you install a GRID driver using Cloud Assistant or an image from Alibaba Cloud Marketplace, the graphics processing performance is twice that of 7th-generation platforms.
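
As a rough illustration of why multiple 48 GB GPUs are needed for a 70-billion-parameter model, the following sketch estimates the memory taken by the model weights alone and the smallest GPU count (from the 1, 2, 4, or 8 GPU options in this family) whose combined memory holds them. The bytes-per-parameter values and the `headroom` fraction reserved for KV cache and activations are assumptions made for illustration, not Alibaba Cloud sizing guidance.

```python
GPU_MEMORY_GB = 48            # per-GPU memory stated above
GPU_COUNTS = (1, 2, 4, 8)     # GPU counts offered in this family
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0}  # assumption: weights only


def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def min_gpu_count(params_billion: float, precision: str, headroom: float = 0.2):
    """Smallest offered GPU count whose usable memory fits the weights.

    headroom is a hypothetical fraction of each GPU kept free for KV cache
    and activations. Returns None if even 8 GPUs are not enough.
    """
    usable_per_gpu = GPU_MEMORY_GB * (1.0 - headroom)
    needed = weight_memory_gb(params_billion, precision)
    for count in GPU_COUNTS:
        if count * usable_per_gpu >= needed:
            return count
    return None
```

Under these assumptions, `min_gpu_count(70, "fp16")` returns 4 (140 GB of weights) and `min_gpu_count(70, "fp8")` returns 2 (70 GB of weights). Real deployments also depend on context length, batch size, and the inference framework.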

Key parameters of the NVIDIA L20 GPU:

GPU architecture	GPU memory	Compute performance	Video encoding/decoding	Inter-GPU connectivity
NVIDIA Ada Lovelace	Capacity: 48 GB Bandwidth: 864 GB/s	FP64: N/A FP32: 59.3 TFLOPS FP16/BF16: 119 TFLOPS FP8/INT8: 237 TFLOPS	3 × Video Encoders (+AV1) 3 × Video Decoders 4 × JPEG Decoders	PCIe interface: PCIe Gen4 x16 Bandwidth: 64 GB/s

Processor: Powered by the latest high-frequency Intel® Xeon® processors with an all-core turbo frequency of up to 3.9 GHz to handle complex 3D modeling demands.

Storage:
- All instances in this family are I/O-optimized instances.
- These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
- Supported cloud disk types: elastic ephemeral disks, ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information about cloud disks, see Block storage overview.
Network:
- Supports IPv4 and IPv6. For more information about IPv6 communication, see IPv6 communication.
- Supports Elastic RDMA Interface (ERI).
  
  Note
  For details on using ERI, see Enable eRDMA for enterprise-level instances or Enable eRDMA for GPU-accelerated instances.
Security: These instances support the vTPM feature. For more information, see Overview of trusted computing.

The following table describes the instance types and specifications for the gn8is family.

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network bandwidth (Gbit/s)	ENIs	Primary ENI queues	Private IPv4 addresses	IPv6 addresses	Max cloud disks	Disk IOPS	Disk bandwidth (GB/s)
ecs.gn8is.2xlarge	8	64	L20 × 1	48 GB × 1	8	4	8	15	15	17	60,000	0.75
ecs.gn8is.4xlarge	16	128	L20 × 1	48 GB × 1	16	8	16	30	30	17	120,000	1.25
ecs.gn8is-2x.8xlarge	32	256	L20 × 2	48 GB × 2	32	8	32	30	30	33	250,000	2
ecs.gn8is-4x.16xlarge	64	512	L20 × 4	48 GB × 4	64	8	64	30	30	33	450,000	4
ecs.gn8is-8x.32xlarge	128	1024	L20 × 8	48 GB × 8	100	15	64	50	50	65	900,000	8

gn7e, GPU-accelerated compute-optimized instance family

Features of the gn7e instance family include:

Overview:
- This instance family offers instance types with different numbers of GPUs and CPU resources to meet various AI business needs.
- Built on the third-generation X-Dragon architecture, gn7e instances deliver double the average network bandwidth for VPCs and cloud disks compared to the previous generation.
Use cases:
- Small- and medium-scale AI training workloads.
- High-performance computing (HPC) workloads accelerated using CUDA.
- AI inference workloads that require high GPU compute performance or large GPU memory.
- Deep learning, such as training AI algorithms for image classification, autonomous driving, and speech recognition.
- GPU-intensive scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis.
Important
When you run AI training workloads with high communication loads, such as those involving Transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, large-scale data transfers over the PCIe link may cause unexpected failures and data corruption. If you are unsure about the communication link topology for your training workload, submit a ticket for support from Alibaba Cloud technical experts.
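
One way to see which link a pair of GPUs uses is the topology matrix printed by `nvidia-smi topo -m`, where NVLink connections appear as `NV` followed by the number of links (for example `NV12`) and PCIe paths appear as labels such as `PIX`, `PHB`, or `SYS`. The sketch below only scans that text for NVLink entries. The helper names are hypothetical, and the sketch assumes that the NVIDIA driver and `nvidia-smi` are installed on the instance. It does not verify that your training framework actually routes traffic over NVLink.

```python
import re
import shutil
import subprocess


def has_nvlink_entries(topo_text: str) -> bool:
    """Return True if the topology matrix text contains any NV<n> entry."""
    return re.search(r"\bNV\d+\b", topo_text) is not None


def read_topology() -> str:
    """Run `nvidia-smi topo -m` and return its output, or '' if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return ""
    result = subprocess.run(
        ["nvidia-smi", "topo", "-m"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    text = read_topology()
    if not text:
        print("nvidia-smi not found or produced no output")
    elif has_nvlink_entries(text):
        print("NVLink connections reported between GPUs")
    else:
        print("No NVLink entries found; GPUs may be communicating over PCIe")
```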
Storage:
- All instances in this family are I/O optimized.
- Supported cloud disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block storage overview.
Network:
- Supports IPv4 and IPv6. For more information, see IPv6 communication.
- Network performance scales with the instance type. Larger instance types offer better network performance.

The gn7e instance family includes the instance types and specifications described in the following table.

Instance type	vCPUs	Memory (GiB)	GPU memory	Baseline bandwidth (Gbit/s)	Forwarding rate (pps)	Queues	ENIs	Private IPv4 addresses	IPv6 addresses
ecs.gn7e-c16g1.4xlarge	16	125	80 GB × 1	8	3,000,000	8	8	10	1
ecs.gn7e-c16g1.8xlarge	32	250	80 GB × 2	16	6,000,000	16	8	10	1
ecs.gn7e-c16g1.16xlarge	64	500	80 GB × 4	32	12,000,000	32	8	10	1
ecs.gn7e-c16g1.32xlarge	128	1000	80 GB × 8	64	24,000,000	32	16	15	1

gn7i, GPU-accelerated compute-optimized instance family

Overview: Powered by the third-generation SHENLONG architecture, gn7i instances deliver stable and predictable high performance. They use chip-level fast path acceleration to increase storage performance, network performance, and compute stability by an order of magnitude.
Use cases:
- Equipped with high-performance CPUs, memory, and GPUs, these instances are ideal for concurrent AI inference tasks, such as image recognition, speech recognition, and behavior recognition.
- These instances support RTX features and use high-frequency CPUs to deliver high-performance 3D graphics virtualization. They are suitable for graphics-intensive workloads, such as remote graphics design and cloud gaming.
Compute:
- Equipped with NVIDIA A10 GPUs that feature:
  - The innovative NVIDIA Ampere architecture.
  - Support for common acceleration features such as RTX and TensorRT.
- Processor: 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processor with an all-core turbo frequency of 3.5 GHz.
- This instance family provides up to 752 GiB of memory, a significant increase compared to the gn6i instance family.
Storage:
- All instances in this family are I/O optimized.
- Supported cloud disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block storage overview.
Network:
- These instances support IPv4 and IPv6. For more information, see IPv6 communication.
- Network performance scales with the instance type. Larger instance types offer better network performance.

The gn7i instance family includes the following instance types and specifications.

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network bandwidth (Gbit/s)	Packet rate (PPS)	NIC queues	ENIs	Private IPv4 addresses	IPv6 addresses
ecs.gn7i-c8g1.2xlarge	8	30	NVIDIA A10 * 1	24 GB * 1	16	1,600,000	8	4	15	15
ecs.gn7i-c16g1.4xlarge	16	60	NVIDIA A10 * 1	24 GB * 1	16	3,000,000	8	8	30	30
ecs.gn7i-c32g1.8xlarge	32	188	NVIDIA A10 * 1	24 GB * 1	16	6,000,000	12	8	30	30
ecs.gn7i-c32g1.16xlarge	64	376	NVIDIA A10 * 2	24 GB * 2	32	12,000,000	16	15	30	30
ecs.gn7i-c32g1.32xlarge	128	752	NVIDIA A10 * 4	24 GB * 4	64	24,000,000	32	15	30	30
ecs.gn7i-c48g1.12xlarge	48	310	NVIDIA A10 * 1	24 GB * 1	16	9,000,000	16	8	30	30
ecs.gn7i-c56g1.14xlarge	56	346	NVIDIA A10 * 1	24 GB * 1	16	10,000,000	16	8	30	30
ecs.gn7i-2x.8xlarge	32	128	NVIDIA A10 * 2	24 GB * 2	16	6,000,000	16	8	30	30
ecs.gn7i-4x.8xlarge	32	128	NVIDIA A10 * 4	24 GB * 4	32	6,000,000	16	8	30	30
ecs.gn7i-4x.16xlarge	64	256	NVIDIA A10 * 4	24 GB * 4	64	12,000,000	32	8	30	30
ecs.gn7i-8x.32xlarge	128	512	NVIDIA A10 * 8	24 GB * 8	64	24,000,000	32	16	30	30
ecs.gn7i-8x.16xlarge	64	256	NVIDIA A10 * 8	24 GB * 8	32	12,000,000	32	8	30	30

Important

You can change instances of the types ecs.gn7i-2x.8xlarge, ecs.gn7i-4x.8xlarge, ecs.gn7i-4x.16xlarge, ecs.gn7i-8x.32xlarge, and ecs.gn7i-8x.16xlarge to ecs.gn7i-c8g1.2xlarge or ecs.gn7i-c16g1.4xlarge. However, you cannot change them to other instance types such as ecs.gn7i-c32g1.8xlarge.
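
The restriction above can be encoded as a small lookup, for example to validate a requested change before you submit it. This sketch reads the note literally: only the two named targets are accepted for the five listed source types. The note does not say whether changes among the five listed types are allowed, so the sketch treats them as not allowed. Confirm against the console or API before relying on it. The names `RESTRICTED_SOURCES`, `ALLOWED_TARGETS`, and `change_allowed` are hypothetical.

```python
# Source instance types named in the note above.
RESTRICTED_SOURCES = frozenset({
    "ecs.gn7i-2x.8xlarge",
    "ecs.gn7i-4x.8xlarge",
    "ecs.gn7i-4x.16xlarge",
    "ecs.gn7i-8x.32xlarge",
    "ecs.gn7i-8x.16xlarge",
})

# Target instance types the note explicitly permits.
ALLOWED_TARGETS = frozenset({
    "ecs.gn7i-c8g1.2xlarge",
    "ecs.gn7i-c16g1.4xlarge",
})


def change_allowed(source: str, target: str):
    """Return True or False for the listed source types.

    Returns None when the note says nothing about the source type,
    so callers can tell 'not allowed' apart from 'not covered'.
    """
    if source not in RESTRICTED_SOURCES:
        return None
    return target in ALLOWED_TARGETS
```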

gn7s, GPU-accelerated compute-optimized instance family

To use the gn7s instance family, submit a ticket.

Introduction:
- This instance family is powered by the latest Intel Ice Lake processors and NVIDIA A30 GPUs based on the NVIDIA Ampere architecture. This family offers various instance types with different GPU and CPU configurations to meet your specific AI needs.
- Built on Alibaba Cloud's third-generation SHENLONG architecture, gn7s instances deliver twice the average network bandwidth for VPCs and cloud disks as the previous generation.
Use cases: Featuring high-performance CPUs, memory, and GPUs, these instances are ideal for concurrent AI inference workloads, such as image recognition, speech recognition, and behavior identification.
Compute:
- Features NVIDIA A30 GPUs, which include:
  - The innovative NVIDIA Ampere architecture.
  - Support for the Multi-Instance GPU (MIG) feature and acceleration based on second-generation Tensor Cores for a wide range of workloads.
- Processor: 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processor with an all-core turbo frequency of 3.5 GHz.
- Offers significantly more memory than the previous-generation instance family.
Storage:
- All instances in this family are I/O optimized.
- Supported cloud disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block storage overview.
Network:
- Supports IPv4 and IPv6. For more information, see IPv6 communication.
- Network performance scales with the instance type.

The gn7s instance family includes the following instance types and specifications:

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network bandwidth (Gbit/s)	Packet rate (pps)	Private IPv4s per ENI	IPv6s per ENI	Multi-queue	ENIs
ecs.gn7s-c8g1.2xlarge	8	60	NVIDIA A30 * 1	24 GB * 1	16	1,600,000	5	1	8	4
ecs.gn7s-c16g1.4xlarge	16	120	NVIDIA A30 * 1	24 GB * 1	16	3,000,000	5	1	8	8
ecs.gn7s-c32g1.8xlarge	32	250	NVIDIA A30 * 1	24 GB * 1	16	6,000,000	5	1	12	8
ecs.gn7s-c32g1.16xlarge	64	500	NVIDIA A30 * 2	24 GB * 2	32	12,000,000	5	1	16	15
ecs.gn7s-c32g1.32xlarge	128	1000	NVIDIA A30 * 4	24 GB * 4	64	24,000,000	10	1	32	15
ecs.gn7s-c48g1.12xlarge	48	380	NVIDIA A30 * 1	24 GB * 1	16	9,000,000	8	1	16	8
ecs.gn7s-c56g1.14xlarge	56	440	NVIDIA A30 * 1	24 GB * 1	16	10,000,000	8	1	16	8

gn7, GPU-accelerated compute-optimized instance family

Scenarios:
- Deep learning, such as training AI algorithms used in image classification, autonomous driving, and speech recognition.
- GPU-intensive scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis.

Storage:
- Instances are I/O optimized.
- Supported cloud disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block storage overview.
Network:
- Supports IPv4 and IPv6. For more information about IPv6, see IPv6 communication.
- Network performance scales with the instance type.

The following table describes the instance types and specifications of the gn7 instance family.

Instance type	vCPUs	Memory (GiB)	GPU memory	Network bandwidth (Gbit/s)	Packet rate (pps)	NIC queues	ENIs	Private IPv4 addresses	IPv6 addresses
ecs.gn7-c12g1.3xlarge	12	94	40 GB × 1	4	2,500,000	4	8	10	1
ecs.gn7-c13g1.13xlarge	52	378	40 GB × 4	16	9,000,000	16	8	30	30
ecs.gn7-c13g1.26xlarge	104	756	40 GB × 8	30	18,000,000	16	15	10	1

gn7r, GPU-accelerated compute-optimized instance family

Overview:
- The gn7r is an enterprise-grade, multi-purpose instance family from Alibaba Cloud that combines an Arm processor with a GPU. It provides a cloud-native platform for developing and running Android-based applications, cloud phones, and cloud gaming services. Equipped with NVIDIA A16 GPUs, these instances provide cost-effective, multi-chip hardware transcoding comparable to ASIC-based platforms. The instances also support the CUDA compute architecture, enabling direct AI recognition and analysis on the GPU after decoding.
- Based on the third-generation SHENLONG architecture, these instances use the CIPU cloud processor to manage cloud resources, delivering stable, predictable, and ultra-high compute, storage, and network performance.
- These instances use NVIDIA A16 GPU accelerators for graphics acceleration, hardware transcoding, and AI services.
  
  Note
  Each NVIDIA A16 card contains four GA107 processing chips.
Use cases: Ideal for remote Android application services, such as cloud application standby, cloud gaming, cloud phones, Android data crawlers, video transcoding, video recognition, content review, and video editing.
Compute:
- Processor: 3.0 GHz Ampere® Altra® Max processors. The native Arm compute platform provides efficient performance and excellent app compatibility for Android servers.
Storage:
- I/O optimized instance.
- Supported cloud disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Elastic Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For more information, see IPv6 communication.

The gn7r instance family includes the following instance types and specifications.

Instance type	vCPUs	Memory (GiB)	GPU	Base network bandwidth (Gbit/s)	Packet forwarding PPS	Private IPv4 addresses	IPv6 addresses	NIC queues	ENIs
ecs.gn7r-c16g1.4xlarge	16	64	NVIDIA GA107 × 1	8	3,000,000	15	1	8	8

gn6i, GPU-accelerated compute-optimized instance family

Use cases:
- AI (deep learning and machine learning) inference for applications such as computer vision, speech recognition, speech synthesis, natural language processing (NLP), machine translation, and recommendation systems.
- Real-time rendering for cloud gaming.
- Real-time, cloud-based rendering for augmented reality (AR) and virtual reality (VR).
- Graphics-heavy computing or graphics workstations.
- GPU-accelerated databases.
- High-performance computing (HPC).
Compute:
- Equipped with NVIDIA T4 GPU accelerators, featuring:
  - The innovative NVIDIA Turing architecture.
  - 16 GB of memory per GPU with a memory bandwidth of 320 GB/s.
  - 2,560 CUDA cores per GPU.
  - Up to 320 Turing Tensor Cores per GPU.
  - Mixed-precision Tensor Cores that support 65 TFLOPS of FP16, 130 TOPS of INT8, and 260 TOPS of INT4.
- vCPU-to-memory ratio of approximately 1:4.
- Processor: 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake).
Storage:
- I/O-optimized instances.
- Supported disk types: ESSD, ESSD AutoPL disks, SSD disk, and ultra disk. For more information, see Block storage overview.
Network:
- Supports IPv4 and IPv6. For details, see IPv6 communication.
- Network performance scales with the instance type.

The gn6i instance family includes the following instance types.

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network bandwidth (Gbit/s)	Packet rate (pps)	Disk IOPS	Multi-queue	ENIs	Private IPv4 addresses	IPv6 addresses
ecs.gn6i-c4g1.xlarge	4	15	NVIDIA T4 × 1	16 GB × 1	4	2,500,000	N/A	2	2	10	1
ecs.gn6i-c8g1.2xlarge	8	31	NVIDIA T4 × 1	16 GB × 1	5	2,500,000	N/A	2	2	10	1
ecs.gn6i-c16g1.4xlarge	16	62	NVIDIA T4 × 1	16 GB × 1	6	2,500,000	N/A	4	3	10	1
ecs.gn6i-c24g1.6xlarge	24	93	NVIDIA T4 × 1	16 GB × 1	7.5	2,500,000	N/A	6	4	10	1
ecs.gn6i-c40g1.10xlarge	40	155	NVIDIA T4 × 1	16 GB × 1	10	2,500,000	N/A	16	10	10	1
ecs.gn6i-c24g1.12xlarge	48	186	NVIDIA T4 × 2	16 GB × 2	15	4,500,000	N/A	12	6	10	1
ecs.gn6i-c24g1.24xlarge	96	372	NVIDIA T4 × 4	16 GB × 4	30	4,500,000	250,000	24	8	10	1

gn6e, GPU-accelerated compute-optimized instance family

Use cases:
- Deep learning applications, such as training and inference for AI algorithms for image classification, autonomous driving, and speech recognition.
- Scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis.
Compute:
- Features NVIDIA V100 (32 GB NVLink) GPU cards.
- GPU accelerator: V100 (SXM2 package).
  - Innovative NVIDIA Volta architecture.
  - 32 GB of HBM2 memory per GPU with a GPU memory bandwidth of 900 GB/s.
  - 5,120 CUDA Cores per GPU.
  - 640 Tensor Cores per GPU.
  - Each GPU supports six bidirectional NVLink connections, each providing 25 GB/s of bandwidth in each direction, for a total of 300 GB/s (6 links × 25 GB/s × 2 directions).
- Features a vCPU-to-memory ratio of approximately 1:8.
- Processor: 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake).
Storage:
- I/O optimized instance.
- Supported cloud disk types: ESSDs, ESSD AutoPL disks, Regional ESSDs, standard SSDs, and ultra disks. For more information, see Elastic Block Storage.
Network:
- Supports both IPv4 and IPv6. For more information about IPv6 communication, see IPv6 communication.
- Network performance scales with the instance type.

gn6e includes the instance types and specifications listed in the table below.

Instance type	vCPU	Memory (GiB)	GPU	GPU memory	Baseline bandwidth (Gbit/s)	Packet rate (PPS)	NIC queues	ENI	Private IPv4 addresses	IPv6 addresses
ecs.gn6e-c12g1.3xlarge	12	92	1 × NVIDIA V100	1 × 32 GB	5	800,000	8	6	10	1
ecs.gn6e-c12g1.6xlarge	24	184	2 × NVIDIA V100	2 × 32 GB	8	1,200,000	8	8	20	1
ecs.gn6e-c12g1.12xlarge	48	368	4 × NVIDIA V100	4 × 32 GB	16	2,400,000	8	8	20	1
ecs.gn6e-c12g1.24xlarge	96	736	8 × NVIDIA V100	8 × 32 GB	32	4,500,000	16	8	20	1

gn6v, GPU-accelerated compute-optimized instance family

Use cases:
- Deep learning applications, such as training and inference for AI algorithms in image classification, autonomous driving, and speech recognition.
- Scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis.
Compute:
- Equipped with NVIDIA V100 GPUs.
- GPU accelerator: V100 (SXM2 package).
  - Innovative NVIDIA Volta architecture.
  - 16 GB of HBM2 GPU memory per GPU with 900 GB/s of memory bandwidth.
  - 5,120 CUDA Cores per GPU.
  - 640 Tensor Cores per GPU.
  - Up to six bidirectional NVLink connections per GPU. Each connection provides 25 GB/s of bandwidth in each direction, for a total bandwidth of 300 GB/s (6 links × 25 GB/s × 2 directions).
- Features a vCPU-to-memory ratio of approximately 1:4.
- Processor: 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake).
Storage:
- All instances in this family are I/O optimized.
- Supported disk types: ESSD, ESSD AutoPL, SSD Cloud Disk, and Ultra Disk. For more information, see Block storage overview.
Network:
- Supports IPv4 and IPv6. For more information, see IPv6 communication.
- Network performance scales with the instance type.

The gn6v instance family includes the instance types and specifications listed below.

Instance type	vCPU	Memory (GiB)	GPU	GPU memory	Network bandwidth (Gbit/s)	Packet rate (pps)	Disk baseline IOPS	Multi-queue	ENI	Private IPv4 addresses	IPv6 addresses
ecs.gn6v-c8g1.2xlarge	8	32	1 × NVIDIA V100	1 × 16 GB	2.5	800,000	N/A	4	4	10	1
ecs.gn6v-c8g1.4xlarge	16	64	2 × NVIDIA V100	2 × 16 GB	5	1,000,000	N/A	4	8	20	1
ecs.gn6v-c8g1.8xlarge	32	128	4 × NVIDIA V100	4 × 16 GB	10	2,000,000	N/A	8	8	20	1
ecs.gn6v-c8g1.16xlarge	64	256	8 × NVIDIA V100	8 × 16 GB	20	2,500,000	N/A	16	8	20	1
ecs.gn6v-c10g1.20xlarge	82	336	8 × NVIDIA V100	8 × 16 GB	35	4,500,000	250,000	16	8	20	1

ebmgn9g, GPU-accelerated compute-optimized ECS Bare Metal Instance family

Important

The ebmgn9g instance family is available by invitation only. To request access, submit a ticket.

Instance family introduction: The ebmgn9g instance family is Alibaba Cloud's ninth-generation line of cost-effective, full-featured GPU bare metal instances. Powered by the latest CIPU 2.0, these instances combine high-frequency CPUs, large memory capacity, and new Blackwell architecture professional GPUs. They deliver high-performance and cost-effective GPU cloud services for a wide range of accelerated workloads, such as training for autonomous driving and embodied intelligence, large model inference, film and animation rendering, and services for the metaverse and cloud gaming.
Use cases and features:
- Autonomous driving/embodied intelligence:
  With 256 vCPUs running at an all-core frequency of up to 4.2 GHz and 2,304 GiB of memory, these instances handle the demanding data processing requirements for autonomous driving and embodied intelligence training.
- Search and recommendation:
  The Blackwell GPU provides 126 TFLOPS of TF32 computing power. Each GPU is paired with 32 vCPUs and 153 GB/s of memory bandwidth, making it an optimal configuration for search and advertising services.
- Large model inference:
  The new-generation GPU offers a significant performance leap over its predecessor, with video memory bandwidth increased to 1,344 GB/s. Support for FP4 computing power significantly improves inference performance and cost-effectiveness. The eight GPUs are interconnected via PCIe Gen5 with a bandwidth of 128 GB/s. This connection dramatically improves multi-card parallel inference efficiency.
- Cloud gaming/rendering/metaverse:
  With a CPU frequency of up to 5 GHz, these instances are an excellent choice for 3D modeling. The GPU provides certified, workstation-grade graphics drivers with full OpenGL acceleration, making these instances the optimal choice for high-end film and animation development and CAD design.
Powered by the latest CIPU 2.0:
The second-generation CIPU offers increased processing power, enhancing the performance of eRDMA, VPC, and cloud disk components. Bare metal instances are ideal for workloads that require direct access to physical resources or have requirements such as hardware-based licensing. They also support containers, including but not limited to Docker, Clear Containers, and PouchContainer.

Compute:

Powered by new Blackwell architecture professional GPUs:
- Supports professional-grade OpenGL graphics processing.
- Supports common acceleration features such as RTX and TensorRT, with newly added support for FP4 and PCIe Gen5 interconnect.
- Utilizes a PCIe switch for interconnection, which improves NCCL performance by 36% compared to a direct-to-CPU connection. This can boost performance by up to 9% for large model inference with multi-card sharding.

Key GPU specifications:

GPU architecture	Video memory	Compute performance	Encode/decode engines	Inter-GPU interconnect	Acceleration APIs
Blackwell	Capacity: 48 GB Bandwidth: 1,344 GB/s	TF32: 126 TFLOPS FP32: 52 TFLOPS FP16/BF16: 266 TFLOPS FP8/INT8: 533 TFLOPS FP4: 970 TFLOPS RT core: 196 TFLOPS	3 × Video Encoder 3 × Video Decoder	PCIe Gen5 x16: 128 GB/s, supports P2P	DX12, OpenGL 4.6, Vulkan 1.3, CUDA 12.8, OpenCL 3.0, and DirectCompute

Processor: AMD Turin-C processors with a frequency range of 3.3 GHz to 5 GHz and an all-core turbo of up to 4.2 GHz.
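
The 128 GB/s figure in the Inter-GPU interconnect entry above is consistent with the raw signaling rate of a PCIe Gen5 x16 link counted in both directions. The sketch below reproduces that arithmetic. The 32 GT/s per-lane rate and the 128b/130b encoding are taken from the PCIe 5.0 specification, not from this page, and the function name is illustrative.

```python
def pcie_bandwidth_gb_s(gt_per_s: float, lanes: int, encoding: float = 1.0) -> float:
    """Return the one-direction bandwidth of a PCIe link in GB/s."""
    # Each transfer moves one bit per lane; divide by 8 to convert bits to bytes.
    return gt_per_s * lanes * encoding / 8


# Raw rate of a Gen5 x16 link (32 GT/s per lane, 16 lanes), one direction.
raw_one_way = pcie_bandwidth_gb_s(32, 16)
# Counting both directions gives the headline figure.
raw_both_ways = 2 * raw_one_way
# Assumption: 128b/130b line encoding slightly reduces the usable rate.
usable_one_way = pcie_bandwidth_gb_s(32, 16, 128 / 130)

print(raw_one_way, raw_both_ways, round(usable_one_way, 1))
```

Under these assumptions each direction carries about 64 GB/s, so real GPU-to-GPU copies should be expected to stay below the bidirectional headline number.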

Storage:
- I/O-optimized instance.
- Supports the NVMe protocol. For more information, see NVMe protocol overview.
- Supported cloud disk types: elastic temporary disk, ESSD, ESSD AutoPL, and ESSD with zone-redundancy. For more information about block storage, see Block storage overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Delivers network performance of up to 30 million packets per second (PPS).
- Supports eRDMA, which enables RDMA pass-through acceleration over VPC networks. eRDMA increases bandwidth to 360 Gbit/s and is ideal for training workloads in autonomous driving, embodied intelligence, computer vision (CV), and traditional models.
  Note
  For instructions on how to use eRDMA, see Enable eRDMA for an enterprise-level instance or Enable eRDMA for a GPU instance.

The following table lists the specifications for the ebmgn9g instance family.

Instance type	vCPU	Memory (GiB)	Video memory	Network bandwidth (Gbit/s)	Network PPS	Private IPv4 addresses	IPv6 addresses	Queues (primary/secondary)	ENIs	Max data disks	Cloud disk bandwidth (GB/s)
ecs.ebmgn9g.64xlarge	256	2304	48 GB × 8	360 (180 × 2)	30 million	30	30	64/16	38	33	8

Note

ebmgn9g instances require images that use the UEFI boot mode. If you use a custom image, ensure it supports the UEFI boot mode and that its boot mode property is set to UEFI. For more information, see Instance boot mode.
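
On a running Linux instance, one common way to confirm that the system actually booted in UEFI mode is to check for the `/sys/firmware/efi` directory, which the kernel exposes only after a UEFI boot. The following is a minimal sketch of that check. It inspects the guest operating system, not the boot mode property of the image, and the function name and `sysfs_root` parameter are illustrative.

```python
import os


def booted_with_uefi(sysfs_root: str = "/sys") -> bool:
    """Return True if the Linux kernel reports that it was booted through UEFI."""
    # The efi directory exists under /sys/firmware only on UEFI-booted systems.
    return os.path.isdir(os.path.join(sysfs_root, "firmware", "efi"))


if __name__ == "__main__":
    print("UEFI" if booted_with_uefi() else "Legacy BIOS (or not a Linux guest)")
```

Running this in the instance from which a custom image will be created gives a quick sanity check before the image is used with this instance family.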

GPU elastic bare metal server instance family ebmgn9ge

Important

The ebmgn9ge instance family is available by invitation only. To request access, submit a ticket.

Introduction: The ebmgn9ge instance family is Alibaba Cloud's 9th generation of full-featured, cost-effective GPU bare metal instances. Powered by the latest CIPU 2.0, these instances combine high-frequency CPUs, large memory capacity, and new Blackwell architecture professional GPUs. They deliver high-performance and cost-effective GPU cloud services for a wide range of accelerated workloads, including autonomous driving and embodied intelligence training, large model inference, film and animation rendering, and metaverse and cloud gaming services.
Use cases and features:
- Autonomous driving and embodied intelligence:
  With 256 vCPUs running at an all-core frequency of up to 4.2 GHz and 2,304 GiB of memory, these instances meet the demanding data processing needs of autonomous driving and embodied intelligence training.
- Search and recommendation:
  The Blackwell GPU provides 126 TFLOPS of TF32 computing power. Each GPU is paired with 32 vCPUs and 153 GB/s of memory bandwidth, providing an optimal configuration for search and advertising services.
- Large model inference:
  The ebmgn9ge instance family is specifically designed for large language models, offering 72 GB of GPU memory per GPU. The GPU memory bandwidth reaches 1,344 GB/s, delivering high-performance inference for LLM scenarios. Combined with the new FP4 compute architecture and 128 GB/s of PCIe Gen5 bandwidth, an instance can support parallel inference for models larger than 671 billion parameters across 8 GPUs.
- Cloud gaming, rendering, and metaverse:
  With a CPU frequency of up to 5 GHz, these instances are well-suited for 3D modeling. The GPU natively supports graphics capabilities and provides certified workstation-grade graphics drivers with full OpenGL acceleration, making them ideal for high-end film and animation development and CAD design.
Powered by the latest CIPU 2.0:
The 2nd generation CIPU delivers greater cloud processing power, enhancing performance for eRDMA, VPC, and EBS components. Bare metal instances allow workloads to directly access physical resources or meet requirements such as hardware-based licensing. They also support containers, such as Docker, Clear Containers, and Pouch.

Compute:

Powered by new Blackwell architecture professional GPUs:
- Supports professional-grade OpenGL graphics processing.
- Supports common acceleration features like RTX and TensorRT, and adds support for FP4 compute and the PCIe Gen5 interconnect.
- Utilizes a PCIe switch for interconnection, which improves NCCL performance by 36% compared to a direct-to-CPU connection. This can boost performance by up to 9% for large model inference with multi-GPU sharding.

GPU key specifications:

GPU architecture	GPU memory	Compute performance	Video encode/decode engines	Inter-GPU interconnect	Accelerated APIs
Blackwell	Capacity: 72 GB Bandwidth: 1,344 GB/s	TF32: 126 TFLOPS FP32: 52 TFLOPS FP16/BF16: 266 TFLOPS FP8/INT8: 533 TFLOPS FP4: 970 TFLOPS RT core: 196 TFLOPS	3 × Video Encoder 3 × Video Decoder	PCIe Gen5 x16: 128 GB/s, supports P2P	DX12, OpenGL 4.6, Vulkan 1.3, CUDA 12.8, OpenCL 3.0, and DirectCompute

Processor: AMD Turin-C processors with a frequency range of 3.3 GHz to 5 GHz and an all-core turbo of up to 4.2 GHz.

Storage:
- I/O-optimized instance.
- Supports the NVMe protocol. For more information, see NVMe Protocol Overview.
- Supported cloud disk types: elastic temporary disk, ESSD, ESSD AutoPL disks, and ESSD Zone-Redundant Disk. For more information about block storage, see Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 Communication.
- Delivers network performance of up to 30 million network packets per second (PPS).
- Supports ERI (Elastic RDMA Interface), which enables RDMA pass-through acceleration over VPC networks. ERI increases bandwidth to 360 Gbit/s and is ideal for training workloads in autonomous driving, embodied intelligence, computer vision (CV), and traditional models.
  Note
  For instructions on how to use ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on GPU instances.

The following table lists the instance types and specifications for the ebmgn9ge instance family.

Note

For a lower-cost version with less memory, you can select ebmgn9gc.

Instance type	vCPU	Memory (GiB)	GPU memory	Baseline network bandwidth (Gbit/s)	Network PPS	Private IPv4 addresses per ENI	IPv6 addresses per ENI	Queue pairs (primary/secondary ENI)	ENIs	Max attachable data disks	Max cloud disk bandwidth (GB/s)
ecs.ebmgn9ge.64xlarge	256	2,304	72 GB × 8	360 (180 × 2)	30 million	30	30	64/16	38	33	8

Note

Instances of the ebmgn9ge instance family require an image that uses the UEFI boot mode. If you use a custom image, ensure that the image supports the UEFI boot mode and that its boot mode property is set to UEFI. For more information, see Instance Boot Mode.
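
As a rough check of the large model inference use case above, the weights of a 671-billion-parameter model can be compared against the 8 × 72 GB of GPU memory on one instance. This is a back-of-the-envelope sketch: it counts weights only (no KV cache, activations, or framework overhead), treats 1 GB as 10^9 bytes, and uses an illustrative helper name.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"FP16/BF16": 2.0, "FP8": 1.0, "FP4": 0.5}


def weight_memory_gb(params: float, precision: str) -> float:
    """Return the memory needed for model weights alone, in GB (10^9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


total_gpu_memory_gb = 8 * 72  # ecs.ebmgn9ge.64xlarge: 72 GB x 8

for precision in BYTES_PER_PARAM:
    need = weight_memory_gb(671e9, precision)
    fits = need <= total_gpu_memory_gb
    print(f"{precision}: {need:.1f} GB of weights, fits in {total_gpu_memory_gb} GB: {fits}")
```

Under these assumptions only the FP4 weights fit within the instance's total GPU memory, which illustrates why FP4 support matters for models of this size.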

GPU-accelerated elastic bare metal instance family ebmgn9gc

Important

The ebmgn9gc instance family is available by invitation only. To request access, submit a ticket.

Introduction: The ebmgn9gc instance family is Alibaba Cloud's 9th generation of full-featured, cost-effective GPU bare metal instances. Powered by the latest CIPU 2.0, these instances combine high-frequency CPUs, large-capacity memory, and new Blackwell architecture professional GPUs to deliver cost-effective GPU cloud services for a wide range of accelerated workloads, including autonomous driving and embodied intelligence training, large model inference, film and animation rendering, and metaverse and cloud gaming services.
Use cases and features:
- Autonomous driving and embodied intelligence:
  Offers 256 vCPUs with a frequency range of 3.3 GHz to 5.0 GHz and an all-core turbo of up to 4.2 GHz. Paired with 1,536 GiB of memory, these instances meet the data processing demands of autonomous driving and embodied intelligence training.
- Search and recommendation:
  The equipped Blackwell GPUs provide 126 TFLOPS of high-performance TF32 compute. Each GPU is paired with 32 vCPUs and 153 GB/s of memory bandwidth, providing an optimal configuration for search and advertising services.
- Large model inference:
  The ebmgn9gc instances are designed for large language models. With 72 GB of GPU memory and 1,344 GB/s of memory bandwidth per GPU, they deliver high-performance inference for LLM scenarios. Combined with the new FP4 compute architecture and 128 GB/s of PCIe Gen5 bandwidth, an 8-card instance can support parallel inference for models with over 671 billion parameters.
- Cloud gaming, rendering, and metaverse:
  With a CPU frequency of up to 5.0 GHz, these instances are well-suited for 3D modeling. The GPUs natively support graphics capabilities and provide certified workstation-grade graphics drivers with full OpenGL acceleration, making them a good choice for high-end film and animation development and CAD design.
Powered by the latest CIPU 2.0:
The 2nd generation CIPU offers higher cloud processing power and enhanced computing capabilities for eRDMA, VPC, and block storage. Bare metal instances allow workloads to directly access physical resources or meet requirements such as hardware-based licensing. They also support containers, such as Docker, Clear Container, and Pouch.

Compute:

Powered by new Blackwell architecture professional GPUs:
- Supports professional-grade OpenGL graphics processing features.
- Supports common acceleration features such as RTX and TensorRT, and adds support for FP4 and PCIe Gen5 interconnect.
- Utilizes a PCIe switch for interconnection, which improves NCCL performance by 36% compared to a direct-to-CPU connection. This can boost performance by up to 9% for large model inference with multi-card sharding.

GPU key parameters:

GPU architecture	GPU memory	Compute performance	Video encoding/decoding	GPU-to-GPU interconnect	Acceleration APIs
Blackwell	Capacity: 72 GB Bandwidth: 1,344 GB/s	TF32: 126 TFLOPS FP32: 52 TFLOPS FP16/BF16: 266 TFLOPS FP8/INT8: 533 TFLOPS FP4: 970 TFLOPS RT Core: 196 TFLOPS	3 × Video Encoder 3 × Video Decoder	PCIe Gen5 x16: 128 GB/s, supports P2P	DX12, OpenGL 4.6, Vulkan 1.3, CUDA 12.8, OpenCL 3.0, and DirectCompute

Processor: AMD Turin-C processor with a frequency range of 3.3 GHz to 5.0 GHz and an all-core turbo of up to 4.2 GHz.

Storage:
- I/O optimized instance.
- Supports the NVMe protocol. For more information, see NVMe protocol overview.
- Supported cloud disk types: Elastic Temporary Disk, ESSD, ESSD AutoPL, and ESSD Zone-Redundant Disk. For more information about block storage, see Block Storage Overview.
Network:
- Supports IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Delivers network performance of up to 30 million packets per second (PPS).
- Supports elastic RDMA (eRDMA), which enables RDMA pass-through acceleration over VPC networks. eRDMA increases bandwidth to 360 Gbit/s and is ideal for training workloads in autonomous driving, embodied intelligence, computer vision (CV), and traditional models.
  Note
  For instructions on how to use eRDMA, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on GPU instances.

The following table lists the instance types and specifications for the ebmgn9gc instance family.

Note

For a version with more memory, you can select ebmgn9ge.

Instance type	vCPU	Memory (GiB)	GPU memory	Network bandwidth (Gbit/s)	Network PPS	Private IPv4 addresses	IPv6 addresses	Multi-queue (primary/secondary)	Elastic network interfaces	Maximum data disks	Cloud disk bandwidth (GB/s)
ecs.ebmgn9gc.64xlarge	256	1536	72 GB × 8	360 (180 × 2)	30,000,000	30	30	64/16	38	33	8

Note

Instances in the ebmgn9gc instance family must be launched from an image that uses the UEFI boot mode. If you use a custom image, ensure that it supports the UEFI boot mode and its boot mode property is set to UEFI. For more information, see Instance boot mode.

GPU compute-optimized elastic bare metal server instance family ebmgn8v

This instance family is currently available only in select regions. To request access, contact your Alibaba Cloud sales representative.

Instance family overview: ebmgn8v is Alibaba Cloud's 8th-generation accelerated computing instance family (Elastic Bare Metal Server instance family), designed for AI model training and ultra-large parameter models. Each instance is a bare metal host equipped with eight GPU cards.
Use cases:
- Cost-effective for multi-GPU parallel inference on large language models (LLMs) with more than 70 billion parameters.
- Each GPU delivers 39.5 TFLOPS of FP32 compute, making it ideal for traditional AI model training and autonomous driving workloads.
- NVLink interconnection among the eight GPUs supports small- and medium-sized model training scenarios.
Key features and positioning:
- High-speed, large-capacity GPU memory: Each GPU has 96 GB of HBM3 memory, delivering a memory bandwidth of 4 TB/s. This significantly accelerates model training and inference.
- High inter-GPU bandwidth: The GPUs are interconnected via 900 GB/s NVLink, providing significantly higher efficiency for multi-GPU training and inference compared to previous GPU generations.
- Large model quantization technology: Supports FP8 compute, which optimizes performance for training and inference with large-scale parameters. This significantly increases computational speed and reduces GPU memory usage.
Compute:
- Powered by the latest CIPU (Cloud Infrastructure Processing Unit) 1.0:
  - Decouples compute and storage, enabling flexible selection of storage resources. Compared to 7th-generation GPU instances, the inter-host bandwidth is increased to 160 Gbit/s for faster data transfer and processing.
  - Unlike traditional virtualized instances, the CIPU provides bare metal capabilities that support P2P communication between GPUs.
- Powered by 4th Generation Intel® Xeon® Scalable processors, providing 192 vCPUs with an all-core turbo frequency of up to 3.1 GHz.
Storage:
- These are I/O optimized instances.
- These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
- Supported cloud disk types: elastic temporary disks, ESSD cloud disks, ESSD AutoPL cloud disks, and zone-redundant ESSD cloud disks. For more information about cloud disks, see Block Storage Overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Delivers ultra-high network performance of up to 30 million PPS.
- Supports Elastic RDMA Interface (ERI), which enables RDMA pass-through acceleration over VPC networks. ERI increases bandwidth to 160 Gbit/s and is ideal for computer vision (CV) and traditional model training workloads.
  Note
  For instructions on how to use ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on GPU instances.

The following table describes the instance types and specifications of the ebmgn8v instance family.

Instance type	vCPU	Memory (GiB)	GPU memory	Baseline network bandwidth (Gbit/s)	Packet forwarding rate (PPS)	Private IPv4s per ENI	IPv6s per ENI	Multi-queue	ENIs	Max data disks	Max cloud disk bandwidth (GB/s)
ecs.ebmgn8v.48xlarge	192	1,024	96 GB × 8	170 (85 × 2)	30 million	30	30	64	32	31	6

Note

The boot mode for images used by ebmgn8v instances must be UEFI. If you use a custom image, ensure that the image supports the UEFI boot mode and that its boot mode attribute is set to UEFI. For more information, see Instance boot modes.

GPU-accelerated ebmgn8ia instance family

This instance family is currently available only in select regions. To request access, contact your Alibaba Cloud sales representative.

Introduction: The ebmgn8ia, an elastic bare metal instance family, is Alibaba Cloud's eighth-generation accelerated computing instance family. It is designed for search and recommendation, simulation, and other GPU-intensive workloads that require a high vCPU-to-GPU ratio. Powered by the latest NVIDIA L20 GPUs, each instance is a bare metal server equipped with two high-frequency processors and four GPUs.
Features and use cases:
- High frequency: These instances are equipped with two AMD EPYC™ Genoa 9T34 processors. Each processor has 64 physical cores, providing a total of 256 vCPUs per instance, with a frequency range of 3.4 GHz to 3.75 GHz. This significantly boosts single-core CPU performance, making it ideal for CAD modeling and accelerating preprocessing for CAE simulations.
- High vCPU-to-GPU ratio: Each GPU is paired with 64 vCPUs and 384 GiB of memory, delivering an average memory bandwidth of 230 GB/s per GPU. This configuration is suitable for GPU computing scenarios that require high I/O throughput, such as advertising, search, recommendation, traditional CAE simulations, and certain CPU-based rendering tasks in film production.
Powered by the latest CIPU 1.0:
- Decouples compute and storage, allowing you to flexibly select storage resources. Compared to the previous generation, the inter-machine bandwidth of this instance type is increased to 160 Gbit/s for faster data transfer and processing.
- The CIPU provides bare metal capabilities that enable PCIe P2P communication between GPUs, which traditional virtualized instances do not support.

Compute:

Powered by new enterprise-class NVIDIA L20 GPUs:
- Supports common acceleration features such as vGPU, RTX, and TensorRT.
- Supports FP8 precision to improve computational efficiency.

NVIDIA L20 key specifications:

GPU architecture	GPU memory	Compute performance	Video encode/decode engines	GPU interconnect
NVIDIA Ada Lovelace	Capacity: 48 GB Bandwidth: 864 GB/s	FP64: N/A FP32: 59.3 TFLOPS FP16/BF16: 119 TFLOPS FP8/INT8: 237 TFLOPS	3 × Video Encoder (+AV1) 3 × Video Decoder 4 × JPEG Decoder	PCIe interface: PCIe Gen4 x16 Bandwidth: 64 GB/s

Processor: AMD EPYC™ Genoa 9T34 processors with a frequency range of 3.4 GHz to 3.75 GHz.

Storage:
- These are I/O optimized instances.
- These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
- Supported cloud disks: elastic temporary disk, ESSD cloud disk, ESSD AutoPL cloud disk, and ESSD Zone-redundant cloud disk. For more information, see Block storage overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Delivers ultra-high network performance of up to 30,000,000 PPS.
- Supports Elastic RDMA Interface (ERI), which enables RDMA passthrough over VPC networks. ERI increases bandwidth to 160 Gbit/s and is ideal for training computer vision (CV) and traditional models.
  Note
  For instructions on using ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on a GPU-accelerated instance.

The following table lists the instance types and specifications for the ebmgn8ia instance family.

Instance type	vCPU	Memory (GiB)	GPU	GPU memory	Baseline network bandwidth (Gbit/s)	Network PPS	Private IPv4 / IPv6 addresses per ENI	Queues (primary/secondary ENI)	Elastic network interfaces	Max data disks	Max cloud disk bandwidth (GB/s)
ecs.ebmgn8ia.64xlarge	256	1536	L20 × 4	48 GB × 4	160 (80 × 2)	30,000,000	30	64/16	32	31	6

Note

Instances of the ebmgn8ia instance family require an image that uses the UEFI boot mode. If you use a custom image, it must support UEFI boot, and its boot mode property must be set to UEFI. For more information, see Instance boot mode.
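
The per-GPU figures quoted for this family (64 vCPUs and 384 GiB of memory per GPU) follow directly from the instance totals in the specification table. A small sketch of that arithmetic is shown below, with the ecs.ebmgn8is.32xlarge totals from this page included for contrast. The dictionary layout and function name are illustrative.

```python
# Instance totals copied from the specification tables on this page.
INSTANCE_SPECS = {
    "ecs.ebmgn8ia.64xlarge": {"vcpus": 256, "memory_gib": 1536, "gpus": 4},
    "ecs.ebmgn8is.32xlarge": {"vcpus": 128, "memory_gib": 1024, "gpus": 8},
}


def per_gpu(instance_type: str) -> tuple[float, float]:
    """Return (vCPUs per GPU, GiB of memory per GPU) for an instance type."""
    spec = INSTANCE_SPECS[instance_type]
    return spec["vcpus"] / spec["gpus"], spec["memory_gib"] / spec["gpus"]


for name in INSTANCE_SPECS:
    vcpus, mem = per_gpu(name)
    print(f"{name}: {vcpus:g} vCPUs and {mem:g} GiB of memory per GPU")
```

The comparison shows why ebmgn8ia suits CPU-heavy pipelines: each GPU there is backed by four times as many vCPUs as on ebmgn8is.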

GPU compute-optimized elastic bare metal family ebmgn8is

This instance family is currently available only in select regions outside the Chinese mainland. For inquiries, contact your Alibaba Cloud sales representative.

Instance family overview: ebmgn8is is the 8th-generation accelerated computing instance family from Alibaba Cloud. As an elastic bare metal instance family, it is built for the growing demand for AI-generated content (AIGC) workloads. Powered by the latest NVIDIA L20 GPUs, each instance is a bare metal host with eight GPU cards.
Features and positioning:
- Graphics processing: These instances are powered by high-frequency 4th Generation Intel® Xeon® Scalable processors. They provide ample CPU computing power for 3D modeling, ensuring smoother graphics rendering and design.
- Inference tasks: Each of the new NVIDIA L20 GPUs provides 48 GB of memory to accelerate inference tasks, and the instances support the FP8 floating-point format. Combined with Alibaba Cloud Container Service for Kubernetes (ACK), they flexibly support inference on various AIGC models and are especially suitable for inference tasks on large language models (LLMs) with up to 70 billion parameters.
- Training tasks: These instances offer cost-effective computing power, delivering double the FP32 compute performance compared to 7th-generation inference instances. They are particularly suitable for training computer vision (CV) models that use FP32, and for training other small to medium-sized models.
Use cases:
- Use GRID images with GRID graphics drivers from the Cloud Marketplace to enable OpenGL and Direct3D graphics capabilities. This provides workstation-grade graphics processing for workloads such as animation, film and television special effects, and rendering.
- Combined with the containerized management capabilities of ACK, you can more efficiently and cost-effectively support AIGC graphics generation and large language model (LLM) inference (up to 130 billion parameters).
- Other general AI scenarios, such as image recognition and speech recognition.
Powered by the latest CIPU 1.0:
- Features decoupled compute and storage, which allows you to flexibly select the required storage resources. Compared with the previous generation, the inter-instance bandwidth of this instance family is increased to 160 Gbit/s to enable faster data transfer and processing.
- CIPU provides bare metal capabilities that enable PCIe P2P communication between GPUs, which is not supported by traditional virtualized instances.

Compute:

Powered by new enterprise-class NVIDIA L20 GPUs:
- Supports common acceleration features such as vGPU, RTX, and TensorRT.
- Uses a PCIe switch for interconnection. Compared with a direct-to-CPU connection, this design improves NCCL performance by 36% and can increase inference performance by up to 9% when you run sharded inference for large models across multiple cards.

Key NVIDIA L20 parameters:

GPU architecture	GPU memory	Compute performance	Video encoding/decoding	Inter-GPU interconnect
NVIDIA Ada Lovelace	Capacity: 48 GB Bandwidth: 864 GB/s	FP64: N/A FP32: 59.3 TFLOPS FP16/BF16: 119 TFLOPS FP8/INT8: 237 TFLOPS	3 × Video Encoder (+AV1) 3 × Video Decoder 4 × JPEG Decoder	PCIe interface: PCIe Gen4 x16 Bandwidth: 64 GB/s

Processor: Intel® Xeon® Scalable processors (SPR) with a 3.4 GHz base frequency and an all-core turbo frequency of up to 3.9 GHz.

Storage:
- These are I/O optimized instances.
- These instances support the NVMe protocol. For more information, see Overview of the NVMe protocol.
- Supported cloud disk types: elastic ephemeral disks, ESSDs, ESSD AutoPLs, and zone-redundant ESSDs. For more information about cloud disks, see Block storage overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Delivers ultra-high network performance with a packet forwarding rate of 30 million PPS.
- Supports ERI (Elastic RDMA Interface), which enables RDMA passthrough acceleration in a VPC network. ERI increases bandwidth to 160 Gbit/s and is ideal for CV and traditional model training workloads.
  Note
  To use ERI, see Enable eRDMA on an enterprise-level instance or Enable eRDMA for a GPU instance.

The following table describes the instance types and specifications of the ebmgn8is instance family.

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Base bandwidth (Gbit/s)	Packet forwarding rate (PPS)	Private IPv4 / IPv6 addresses	Queue pairs (primary/secondary ENI)	ENIs	Attachable data disks	Cloud disk bandwidth (GB/s)
ecs.ebmgn8is.32xlarge	128	1024	L20 × 8	48 GB × 8	160 (80 × 2)	30,000,000	30	64/16	32	31	6

Note

The image boot mode for ebmgn8is instance types must be UEFI. If you use a custom image, you must ensure that it supports UEFI and that its boot mode attribute is set to UEFI. For more information, see Instance boot modes.

GPU compute-optimized ECS bare metal instance family ebmgn7ex

Instance family description: The ebmgn7ex instance family provides high-bandwidth instances for large-scale AI training. Built on the fourth-generation SHENLONG architecture and Alibaba Cloud's new CIPU architecture, ebmgn7ex instances use an eRDMA network to interconnect multiple bare metal hosts, enabling RDMA communication with up to 160 Gbit/s of interconnect bandwidth. After enabling eRDMA, you can elastically scale instances in your cluster for large-scale AI training.
Use cases:
- Various deep learning training and development workloads.
- HPC-accelerated computing and simulation.
Important
When you run AI training workloads with high communication loads, such as those involving Transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, large-scale data transfers over the PCIe link may cause unexpected failures and data corruption. If you are unsure about the communication link topology for your training workload, submit a ticket for support from Alibaba Cloud technical experts.
Compute:
- Processor: 3rd Generation Intel® Xeon® Scalable processor (Ice Lake) with a base frequency of 2.9 GHz, an all-core turbo frequency of 3.5 GHz, and PCIe 4.0 support.
Storage:
- These are I/O optimized instances.
- Supported disk types: elastic ephemeral disk, ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Block Storage Overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Supports physical NICs.
- Delivers ultra-high network performance with a packet forwarding rate of up to 24 million PPS.
- Supports Elastic RDMA Interface (ERI), which enables RDMA pass-through for accelerated communication within a VPC network. You can attach two ERIs to an instance. If each ERI is connected to a different network card index, the instance achieves a network bandwidth of 160 Gbit/s. If all ERIs are connected to the same network card index, the instance reaches a network bandwidth of up to 100 Gbit/s. For more information, see AttachNetworkInterface.
  Note
  For instructions on how to use ERIs, see Enable eRDMA on an enterprise-level instance or Enable eRDMA on GPU instances.
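
The bandwidth rule above depends only on whether the two ERIs are spread across different network card indexes. The sketch below encodes that rule as a plain function so that a planned attachment layout can be checked before AttachNetworkInterface is called. The function name is illustrative, the constants simply restate the figures in this section, and no API call is made.

```python
def expected_eri_bandwidth_gbit(network_card_indexes: list[int]) -> int:
    """Return the expected ebmgn7ex bandwidth (Gbit/s) for a given ERI placement.

    network_card_indexes holds the network card index chosen for each ERI.
    """
    if not network_card_indexes:
        raise ValueError("at least one ERI is required")
    if len(network_card_indexes) > 2:
        raise ValueError("an instance supports at most two ERIs")
    # ERIs on two different network cards reach 160 Gbit/s;
    # ERIs that share one network card reach up to 100 Gbit/s.
    return 160 if len(set(network_card_indexes)) == 2 else 100


print(expected_eri_bandwidth_gbit([0, 1]))
print(expected_eri_bandwidth_gbit([0, 0]))
```

A check like this can be placed in provisioning scripts to catch a layout that would silently cap cluster interconnect bandwidth.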

The following table lists the instance types and specifications for the ebmgn7ex instance family.

Instance type	vCPUs	Memory (GiB)	GPU memory	Network bandwidth (Gbit/s)	Packet rate (PPS)	Private IPv4s per NIC	IPv6s per NIC	Physical NICs	Multi-queue (primary/secondary)	ENIs
ecs.ebmgn7ex.32xlarge	128	1024	80 GB × 8	160 (80 × 2)	24 million	30	30	2	32/32	16

Note

ebmgn7ex instances require an image that uses the UEFI boot mode. If you use a custom image, ensure that it supports the UEFI boot mode and that its boot mode property is set to UEFI. For detailed steps, see Instance boot mode.

Ebmgn7e instance family

Instance family description: The ebmgn7e is an instance family built on the X-Dragon architecture. It combines powerful hardware performance with software-defined flexibility and elasticity.
Use cases:
- Deep learning training and development.
- HPC-accelerated computing and simulations.
Important
When you run AI training workloads with high communication loads, such as those involving Transformer models, you must enable NVLink for GPU-to-GPU communication. Otherwise, large-scale data transfers over the PCIe link may cause unexpected failures and data corruption. If you are unsure about the communication link topology for your training workload, submit a ticket for support from Alibaba Cloud technical experts.
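
To see which link the GPUs actually use, the topology matrix printed by `nvidia-smi topo -m` can be inspected: GPU pairs connected through NVLink appear as `NV` followed by the number of links, while PCIe paths appear as entries such as `PIX`, `PXB`, `PHB`, `NODE`, or `SYS`. The sketch below is a simplified parser written under those assumptions. The helper name is illustrative, and the exact output layout may vary between driver versions.

```python
import re
import shutil
import subprocess


def gpu_pairs_without_nvlink(topo_text: str) -> list[tuple[str, str]]:
    """Return GPU pairs whose topology entry is not an NVLink (NV#) connection."""
    # Matrix rows start with the GPU name; the header row starts with whitespace.
    rows = [line.split() for line in topo_text.splitlines() if re.match(r"GPU\d+\s", line)]
    gpus = [row[0] for row in rows]
    missing = []
    for i, row in enumerate(rows):
        cells = row[1:1 + len(gpus)]  # keep only the GPU-to-GPU columns
        for j, cell in enumerate(cells):
            # Check each pair once (upper triangle of the matrix).
            if j > i and not re.fullmatch(r"NV\d+", cell):
                missing.append((gpus[i], gpus[j]))
    return missing


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found; run this on the GPU instance")
    else:
        out = subprocess.run(["nvidia-smi", "topo", "-m"],
                             check=True, capture_output=True, text=True).stdout
        print(gpu_pairs_without_nvlink(out) or "all GPU pairs report NVLink")
```

An empty result suggests every GPU pair is connected over NVLink; any listed pair is worth investigating before a communication-heavy training job is started.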
Compute:
- Processor: Intel® Xeon® Scalable processor with a base frequency of 2.9 GHz, an all-core turbo frequency of 3.5 GHz, and support for PCIe 4.0.
Storage:
- These are I/O optimized instances.
- Supported cloud disk types: ESSD cloud disk, ESSD AutoPL cloud disk, and ESSD Zone-redundant cloud disk. For more information about cloud disks, see Block storage overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Achieves a packet forwarding rate of up to 24 million PPS.

The following table lists the instance types and specifications for the ebmgn7e instance family.

Instance type	vCPU	Memory (GiB)	GPU memory	Network bandwidth (Gbit/s)	Network PPS	Multi-queue (primary/secondary)	ENIs	Private IPv4 addresses	IPv6 addresses
ecs.ebmgn7e.32xlarge	128	1024	80 GB × 8	64	24 million	32/12	32	10	1

After you start an ebmgn7e instance, check whether the Multi-Instance GPU (MIG) feature is enabled, because its default state is not guaranteed. For more information about MIG, see NVIDIA Multi-Instance GPU User Guide.

The following table describes whether ebmgn7e instances support the MIG feature.

Instance type	MIG feature support	Description
ecs.ebmgn7e.32xlarge	Yes	You can enable the MIG feature on ebmgn7e bare metal instances.
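
One way to perform the MIG check described above is to query the current MIG mode of each GPU with `nvidia-smi`. The sketch below assumes that the NVIDIA driver is installed and that `nvidia-smi` is on the `PATH`. It uses the `mig.mode.current` query field, and the helper names are illustrative.

```python
import shutil
import subprocess


def parse_mig_modes(csv_text: str) -> dict[int, str]:
    """Parse 'index, mode' lines into a {gpu_index: mig_mode} mapping."""
    modes = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, mode = (field.strip() for field in line.split(","))
        modes[int(index)] = mode
    return modes


def query_mig_modes() -> dict[int, str]:
    """Ask nvidia-smi for the current MIG mode of every GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,mig.mode.current", "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_mig_modes(out)


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found; run this on the GPU instance")
    else:
        for gpu, mode in query_mig_modes().items():
            print(f"GPU {gpu}: MIG {mode}")
```

If the reported mode is not the one the workload expects, it can be changed per GPU with `nvidia-smi -i <GPU index> -mig 1` (enable) or `-mig 0` (disable); the change may require a GPU reset to take effect.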

GPU-accelerated elastic bare metal instance family ebmgn7ix

Introduction:
- ebmgn7ix is a new elastic bare metal instance family from Alibaba Cloud, launched to support the rapid growth of AI-generated content (AIGC) workloads. Each instance is a bare metal host equipped with eight NVIDIA A10 GPUs.
- Powered by the latest CIPU 1.0 cloud processor, this instance family decouples compute and storage, allowing you to flexibly select storage resources. Compared to the previous generation, the inter-instance bandwidth is increased to 160 Gbit/s, enabling faster data transfer and processing for small-scale, multi-machine training workloads.
- This instance family provides bare metal capabilities. Unlike traditional virtualized instances, it supports peer-to-peer (P2P) communication between GPUs, significantly improving multi-GPU computing efficiency.
Use cases:
- Use GRID images from Alibaba Cloud Marketplace to activate the graphics capabilities of A10 GPUs. This provides efficient graphics processing for animation, film and television VFX, and rendering.
- You can combine this instance family with ACK for container management to efficiently and cost-effectively support AIGC graphics generation and LLM inference (up to 130 billion parameters).
- Other general AI workloads, such as image and speech recognition.
Compute:
- Powered by NVIDIA A10 GPUs:
  - Innovative Ampere architecture.
  - Supports common acceleration features such as vGPU, RTX, and TensorRT.
- Processor: Intel® Xeon® Scalable processors (Ice Lake) with a base frequency of 2.9 GHz and an all-core turbo frequency of 3.5 GHz.
Storage:
- These are I/O optimized instances.
- Supported cloud disk types: ESSD, ESSD AutoPL, and ESSD Zone-redundant. For more information about cloud disks, see Block storage overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Delivers ultra-high network performance with a packet forwarding rate of up to 24 million PPS.
- Supports Elastic RDMA Interface (ERI). This interface enables RDMA pass-through for accelerated interconnectivity within a VPC network, increasing bandwidth to 160 Gbit/s.
  Note
  To use ERI, see Enable eRDMA on enterprise-level instances or Enable eRDMA on GPU instances.

The following table describes the instance types and specifications of the ebmgn7ix instance type family.

Instance type	vCPU	Memory (GiB)	GPU	Network bandwidth (Gbit/s)	Packet rate (PPS)	Private IPv4 addresses	IPv6 addresses	Multi-queue	ENIs
ecs.ebmgn7ix.32xlarge	128	512	NVIDIA A10 × 8	160	24,000,000	30	30	32/32	16

Note

Launch ebmgn7ix instances from images that use the UEFI boot mode. If you use a custom image, ensure that it supports UEFI boot mode and that the boot mode property of the image is set to UEFI. For more information, see Instance boot mode.
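
The "up to 130 billion parameters" figure in the use cases can be sanity-checked with a rough weight-memory estimate. The sketch below is only a back-of-the-envelope calculation under assumptions that are not part of the specification: it counts model weights only (no KV cache or activations), treats 1 GB as 10^9 bytes, and uses 24 GB per A10, the value listed for the A10 in the ebmgn7i table. The function names are hypothetical.

```python
# Rough check: do the weights of a 130B-parameter model fit in the combined
# GPU memory of one ecs.ebmgn7ix.32xlarge (NVIDIA A10 x 8)?
# Assumptions: weights only, 1 GB = 10**9 bytes, 24 GB per A10.

BYTES_PER_PARAM = {"fp16": 2, "int8": 1}

def weight_memory_gb(params: float, precision: str) -> float:
    """Memory needed for the model weights alone, in GB."""
    return params * BYTES_PER_PARAM[precision] / 1e9

def fits(params: float, precision: str, gpus: int = 8, gb_per_gpu: int = 24) -> bool:
    """True if the weights fit in the combined memory of all GPUs."""
    return weight_memory_gb(params, precision) <= gpus * gb_per_gpu

for precision in BYTES_PER_PARAM:
    need = weight_memory_gb(130e9, precision)
    print(f"{precision}: {need:.0f} GB of weights, fits in 8 x 24 GB: {fits(130e9, precision)}")
```

Under these assumptions, a 130B-parameter model fits only with 8-bit (or smaller) weights, so real deployments at that scale would likely depend on quantization and on leaving headroom for runtime buffers.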

GPU-accelerated ebmgn7i instance family

Introduction: The ebmgn7i is an instance family built on the X-Dragon architecture. These instances provide software-defined hardware computing, delivering both elasticity and powerful performance.
Use cases:
- Equipped with high-performance CPUs, memory, and GPUs, these instances can handle high-volume concurrent AI inference workloads, making them ideal for image, speech, and behavior recognition.
- RTX support, combined with high-frequency CPUs, delivers high-performance 3D graphics virtualization ideal for intensive graphics processing workloads like remote graphic design and cloud gaming.
- RTX support, combined with high network and cloud disk bandwidth, makes these instances ideal for building high-performance rendering farms.
- With multiple GPUs and high network bandwidth, these instances support small-scale deep learning training workloads.
Compute:
- Powered by NVIDIA A10 GPUs:
  - Innovative Ampere architecture.
  - Supports common acceleration features such as vGPU, RTX, and TensorRT.
- Processor: Intel^® Xeon^® Scalable processors (Ice Lake) with a 2.9 GHz base frequency and a 3.5 GHz all-core turbo frequency.
Storage:
- These are I/O optimized instances.
- Supported cloud disk types: ESSD, ESSD AutoPL, and ESSD Zone-redundant. For more information on block storage, see Block storage overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Delivers ultra-high network performance with up to 24 million network PPS.

The following table lists the instance types and specifications for the ebmgn7i instance family.

Instance type	vCPU	Memory (GiB)	GPU	GPU memory	Network bandwidth (Gbit/s)	Network PPS	Multi-queue	Elastic network interfaces	Private IPv4 addresses	IPv6 addresses
ecs.ebmgn7i.32xlarge	128	768	NVIDIA A10 × 4	24 GB × 4	64	24 million	32	32	10	1

GPU-accelerated ebmgn7 instance family

Introduction: Built on the X-Dragon architecture, the ebmgn7 instance family provides software-defined hardware computing that combines flexibility with powerful performance.
Use cases:
- Deep learning training workloads, such as image classification, autonomous driving, and speech recognition.
- GPU-intensive scientific computing, such as computational fluid dynamics (CFD), computational finance, molecular dynamics, and environmental analysis.
Compute:
- Processor: 2.5 GHz Intel^® Xeon^® Platinum 8269CY (Cascade Lake) processors.
Storage:
- These are I/O optimized instances.
- Supported cloud disk types: ESSD, ESSD AutoPL, and ESSD Zone-redundant. For more information about cloud disks, see the Block Storage Overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Network performance scales with the instance type.

The following table lists the instance types and specifications for the ebmgn7 instance type family.

Instance type	vCPU	Memory (GiB)	GPU memory	Baseline bandwidth (Gbit/s)	PPS	Multi-queue	ENIs	IPv4s per ENI	IPv6s per ENI
ecs.ebmgn7.26xlarge	104	768	40 GB × 8	30	18,000,000	16	15	10	1

The Multi-Instance GPU (MIG) feature may be either enabled or disabled by default when an ebmgn7 instance starts. You must check its status and enable or disable it as needed. For more information about MIG, see the NVIDIA Multi-Instance GPU User Guide.

The following table describes MIG support for ebmgn7 instances.

Instance type	MIG supported	Description
ecs.ebmgn7.26xlarge	Yes	These instances support the MIG feature.
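
Because the MIG state at startup is not fixed, it helps to read it programmatically before scheduling work. The following is a minimal sketch, assuming the NVIDIA driver is installed and `nvidia-smi` is on the `PATH`; the helper names `parse_mig_modes` and `query_mig_modes` are hypothetical, and the sample text only illustrates the shape of the CSV output.

```python
import subprocess

def parse_mig_modes(csv_text: str) -> dict[int, str]:
    """Parse 'index, mode' CSV lines into {gpu_index: mode}."""
    modes = {}
    for line in csv_text.strip().splitlines():
        index, mode = (field.strip() for field in line.split(",", 1))
        modes[int(index)] = mode
    return modes

def query_mig_modes() -> dict[int, str]:
    """Ask nvidia-smi for the current MIG mode of every GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,mig.mode.current", "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    )
    return parse_mig_modes(result.stdout)

# Illustrative input in the shape nvidia-smi prints; not captured from a real instance.
sample = "0, Enabled\n1, Disabled\n"
print(parse_mig_modes(sample))
```

On an instance, call `query_mig_modes()` instead of parsing the sample. To change the mode, the NVIDIA guide documents `nvidia-smi -i <GPU index> -mig 1` (enable) and `-mig 0` (disable); these commands need root privileges and may require a GPU reset before they take effect.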

GPU-accelerated ebmgn6e instance family

Introduction:
- The ebmgn6e instance family, built on the X-Dragon architecture, delivers software-defined hardware computing that combines flexibility, elasticity, and powerful performance.
- These instances are powered by NVIDIA V100 (32 GB NVLink) GPU compute cards.
- The GPU accelerator is the V100 (SXM2 package), which has the following features:
  - Innovative Volta architecture.
  - 32 GB of HBM2 GPU memory per GPU, with a memory bandwidth of 900 GB/s.
  - 5,120 CUDA Cores per GPU.
  - 640 Tensor Cores per GPU.
  - Each GPU supports six bidirectional NVLink links. Each unidirectional link provides 25 GB/s of bandwidth, totaling 300 GB/s (6 × 25 × 2).
Use cases:
- Deep learning, such as training and inference for AI algorithms like image classification, autonomous driving, and speech recognition.
- Scientific computing, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis.
Compute:
- vCPU-to-memory ratio of 1:8.
- Processors: Intel^® Xeon^® Platinum 8163 (Skylake) with a base frequency of 2.5 GHz.
Storage:
- These are I/O optimized instances.
- Supported cloud disk types: ESSDs, ESSD AutoPLs, Zone-redundant ESSDs, SSD cloud disks, and ultra disks. For more information, see Block storage overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Network performance scales with the instance type.

The following table lists the instance types and specifications for the ebmgn6e instance family.

Instance type	vCPU	Memory (GiB)	GPU	GPU memory	Baseline bandwidth (Gbit/s)	PPS	Multi-queue	ENIs	Private IPv4 addresses	IPv6 addresses
ecs.ebmgn6e.24xlarge	96	768	NVIDIA V100 × 8	32 GB × 8	32	4,800,000	16	15	10	1
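
The NVLink total quoted in the introduction (6 × 25 × 2) can be reproduced with a small helper. This is only a sketch of the arithmetic; the function name is hypothetical.

```python
def nvlink_total_gb_s(links: int, per_direction_gb_s: int) -> int:
    """Aggregate NVLink bandwidth of one GPU in GB/s, counting both directions."""
    directions = 2  # each link carries traffic both ways
    return links * per_direction_gb_s * directions

# V100 (SXM2): six links at 25 GB/s per direction.
print(nvlink_total_gb_s(links=6, per_direction_gb_s=25))  # prints 300
```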

GPU-accelerated ebmgn6v instance family

Introduction:
- The ebmgn6v instance family is built on the X-Dragon architecture. These instances offer high-performance, software-defined hardware computing with cloud elasticity.
- Powered by NVIDIA V100 GPUs.
- The V100 (SXM2 package) GPU accelerator features the following:
  - Innovative Volta architecture
  - 16 GB of HBM2 GPU memory per GPU, with a memory bandwidth of 900 GB/s
  - 5,120 CUDA Cores per GPU
  - 640 Tensor Cores per GPU
  - Each GPU supports six NVLink bidirectional links, with a unidirectional bandwidth of 25 GB/s per link, for a total bandwidth of 300 GB/s (6 × 25 × 2 = 300 GB/s)
Use cases:
- Deep learning training and inference workloads, such as image classification, autonomous driving, and speech recognition
- Scientific computing workloads, such as computational fluid dynamics, computational finance, molecular dynamics, and environmental analysis
Compute:
- vCPU-to-memory ratio of 1:4.
- Processor: Intel^® Xeon^® Platinum 8163 (Skylake) with a 2.5 GHz base frequency.
Storage:
- These are I/O optimized instances.
- Supported cloud disk types: ESSD, ESSD AutoPL, Zone-redundant ESSD, standard SSD, and Ultra Disk. For more information, see Block Storage overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Network performance scales with the instance type.

The following table lists the instance types and specifications for the ebmgn6v instance family.

Instance type	vCPU	Memory (GiB)	GPU	GPU memory	Network bandwidth (Gbit/s)	Network PPS	Multi-queue	ENIs	IPv4s per ENI	IPv6s per ENI
ecs.ebmgn6v.24xlarge	96	384	NVIDIA V100 × 8	16 GB × 8	30	4,500,000	8	32	10	1

GPU-accelerated ebmgn6i instance family

Introduction:
- The ebmgn6i instance family is built on the X-Dragon architecture. It delivers software-defined hardware computing that combines elasticity with powerful performance.
- These instances are powered by NVIDIA T4 GPU accelerators with the following features:
  - Innovative Turing architecture
  - 16 GB of GPU memory per GPU with a memory bandwidth of 320 GB/s
  - 2,560 CUDA Cores per GPU
  - Up to 320 Turing Tensor Cores per GPU
  - Variable-precision Tensor Cores deliver 65 TFLOPS of FP16, 130 TOPS of INT8, and 260 TOPS of INT4 performance.
Use cases:
- AI (DL/ML) inference, including applications such as computer vision, speech recognition, speech synthesis, NLP, machine translation, and recommendation systems.
- Real-time cloud rendering for cloud gaming.
- Real-time cloud rendering for AR/VR.
- Compute-intensive graphics workloads or graphics workstations.
- GPU-accelerated databases.
- High-performance computing.
Compute:
- vCPU-to-memory ratio of 1:4.
- Processor: Intel^® Xeon^® Platinum 8163 (Skylake) with a base frequency of 2.5 GHz.
Storage:
- These are I/O optimized instances.
- Supported cloud disk types: ESSD, ESSD AutoPL, ESSD Zone-redundant, standard SSD, and ultra disk. For more information, see Block storage overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Network performance scales with the instance type.

The following table lists the instance types and specifications for the ebmgn6i instance family.

Instance type	vCPU	Memory (GiB)	GPU	GPU memory	Network bandwidth (Gbit/s)	Forwarding rate (PPS)	Multiqueue	ENIs	Private IPv4 addresses	IPv6 addresses
ecs.ebmgn6i.24xlarge	96	384	NVIDIA T4 × 4	16 GB × 4	30	4.5 million	8	32	10	1
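
The vCPU-to-memory ratios stated for the sixth-generation bare metal families follow directly from the specification tables. The snippet below is a small sketch that reduces each pair to its simplest form; the `SPECS` dictionary and the function name are hypothetical, and the numbers are copied from the tables in this document.

```python
from fractions import Fraction

# (vCPU, memory in GiB) copied from the specification tables.
SPECS = {
    "ecs.ebmgn6e.24xlarge": (96, 768),
    "ecs.ebmgn6v.24xlarge": (96, 384),
    "ecs.ebmgn6i.24xlarge": (96, 384),
}

def vcpu_memory_ratio(vcpu: int, memory_gib: int) -> str:
    """Return the vCPU-to-memory ratio as a reduced 'a:b' string."""
    ratio = Fraction(vcpu, memory_gib)
    return f"{ratio.numerator}:{ratio.denominator}"

for name, (vcpu, memory_gib) in SPECS.items():
    print(name, vcpu_memory_ratio(vcpu, memory_gib))
```

The result is 1:8 for ebmgn6e and 1:4 for ebmgn6v and ebmgn6i, which matches the ratios given in the Compute sections.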

sccgn7ex, GPU-accelerated compute-optimized SCC instance family

Introduction: The sccgn7ex family provides high-bandwidth Super Computing Cluster (SCC) instances that Alibaba Cloud developed to meet the growing demand for large-scale AI training. Multiple bare metal servers are interconnected through a third-generation RDMA SCC network that supports an interconnection bandwidth of 800 Gbit/s. You can scale the cluster based on your training needs to support training of models with very large parameter counts.
Use cases: Ultra-large-scale AI training.
Compute:
- These instances support NVSwitch and provide up to 312 TFLOPS of computing power (TF32).
- vCPU-to-memory ratio of 1:8.
- Processor: Third-generation Intel^® Xeon^® 8369 scalable processors (Ice Lake). These processors have a base frequency of 2.9 GHz, an all-core turbo frequency of 3.5 GHz, and support the PCIe 4.0 interface.
Storage:
- These are I/O optimized instances.
- Supported disk types: ESSDs, ESSD AutoPL disks, and Regional ESSDs. For more information, see Elastic Block Storage Overview.
Network:
- These instances support IPv4 and IPv6. For information about IPv6 communication, see IPv6 communication.
- Supports only VPC.
- High network performance with a packet forwarding rate of 24 million PPS.
- sccgn7ex instances support an interconnection bandwidth of 800 Gbit/s (4 × dual-port 100 Gbit/s RDMA). The instances support GPUDirect. Each GPU is directly connected to a 100 Gbit/s network port.

The following table lists the instance types and specifications for the sccgn7ex family.

Instance type	vCPU	Memory (GiB)	GPU memory	Base network bandwidth (Gbit/s)	Packet forwarding rate (PPS)	RoCE network (Gbit/s)	Elastic Network Interfaces (ENIs)
ecs.sccgn7ex.32xlarge	128	1024	80 GB × 8	64	24 million	800	15
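
The 800 Gbit/s interconnect figure can be broken down as follows. This sketch reads "4 × dual-port 100 Gbit/s RDMA" as four network adapters with two 100 Gbit/s ports each, which is an interpretation rather than something the specification spells out; the function name is hypothetical.

```python
def rdma_total_gbit_s(adapters: int, ports_per_adapter: int, port_gbit_s: int) -> int:
    """Aggregate RDMA line rate of one instance in Gbit/s."""
    return adapters * ports_per_adapter * port_gbit_s

total = rdma_total_gbit_s(adapters=4, ports_per_adapter=2, port_gbit_s=100)
gpus = 8  # from the "80 GB x 8" GPU memory entry in the table

print(total)          # aggregate line rate in Gbit/s
print(total // gpus)  # share per GPU in Gbit/s, one port per GPU
print(total / 8)      # aggregate line rate in GB/s (8 bits per byte)
```

Under this reading, the eight ports map one-to-one onto the eight GPUs, which is consistent with the statement that each GPU is directly connected to a 100 Gbit/s network port.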

Type	Related links
GPU-accelerated compute-optimized (gn series)	gn9gc, gn8v/gn8v-tee, gn8is, gn7e, gn7i, gn7, gn7r, gn6i, gn6e, gn6v
ECS Bare Metal Instance	ebmgn9g, ebmgn9ge, ebmgn9gc, ebmgn8v, ebmgn8ia, ebmgn8is, ebmgn7ex, ebmgn7e, ebmgn7ix, ebmgn7i, ebmgn7, ebmgn6e, ebmgn6v, ebmgn6i
Super Computing Cluster (SCC)	sccgn7ex
Not recommended (If this instance family is unavailable, we recommend one of the families listed above.)	gn7s