Features and specifications of vGPU-accelerated instance families - Elastic GPU Service (EGS) - Alibaba Cloud Help Center

This page describes the features and instance types of the following vGPU-accelerated instance families in Elastic GPU Service (EGS):

sgn8ia: Latest-generation vGPU instances powered by NVIDIA Lovelace GPUs and AMD Genoa processors
sgn7i-vws: Cost-efficient vGPU instances with shared CPUs and NVIDIA A10 GPUs
vgn7i-vws: Dedicated-CPU vGPU instances with NVIDIA A10 GPUs
vgn6i-vws: Previous-generation vGPU instances with NVIDIA T4 GPUs (upgraded from vgn6i)

All families run on the third-generation SHENLONG architecture with fast path acceleration, which improves storage, network, and compute stability by an order of magnitude over traditional virtualization. Each family includes an NVIDIA GRID virtual workstation (vWS) license, which provides certified graphics acceleration for computer-aided design (CAD) software and other professional graphics applications.

Family comparison

Family	GPU	CPU	CPU allocation	vGPU range	Storage
sgn8ia	NVIDIA Lovelace	AMD Genoa, 3.4–3.75 GHz	Shared (~1:1.5 overcommit)	2 GB to 48 GB GPU memory	ESSDs, ESSD AutoPL disks
sgn7i-vws	NVIDIA A10 (Ampere)	Intel Xeon Ice Lake, 2.9/3.5 GHz	Shared	1/12 to 1/3 of A10 (2–8 GB GPU memory)	ESSDs, ESSD AutoPL disks
vgn7i-vws	NVIDIA A10 (Ampere)	Intel Xeon Ice Lake, 2.9/3.5 GHz	Dedicated	1/6 to full A10 (4–24 GB GPU memory)	ESSDs, ESSD AutoPL disks
vgn6i-vws	NVIDIA T4	Intel Xeon Platinum 8163 (Skylake), 2.5 GHz	Dedicated	1/4 to full T4 (4–16 GB GPU memory)	Standard SSDs, ultra disks

How vGPU slicing works

Each physical GPU is sliced into multiple GPU partitions. Each partition is allocated as a vGPU to a single instance. The GPUs column in the instance type tables uses the format <GPU model> * <fraction> to show both the GPU model and the partition size allocated to each instance. For example, NVIDIA A10 * 1/6 means the instance receives one-sixth of an NVIDIA A10 GPU.
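
The notation can be read mechanically. The following is a minimal sketch, not an official tool: the function names parse_gpu_spec and vgpu_memory_gb are hypothetical, and the 24 GB total for the A10 is taken from the GPU memory column (24GB * 1/6) in the instance type tables.

```python
from fractions import Fraction


def parse_gpu_spec(spec: str) -> tuple[str, Fraction]:
    """Split a GPUs-column value such as 'NVIDIA A10 * 1/6' into (model, share)."""
    model, _, share = spec.rpartition("*")
    return model.strip(), Fraction(share.strip())


def vgpu_memory_gb(total_gb: int, share: Fraction) -> Fraction:
    """GPU memory of one partition, given the physical GPU's total memory in GB."""
    return total_gb * share


model, share = parse_gpu_spec("NVIDIA A10 * 1/6")
# One-sixth of a 24 GB A10 is 4 GB of GPU memory.
print(model, share, vgpu_memory_gb(24, share))
```

Exact fractions are used instead of floats so that values such as 1/12 and 1/3 do not pick up rounding error.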

Use cases

Use case	Description	Recommended families
Remote graphics and virtual workstations	Graphic design, CAD, animation, film production, mechanical design — accessed remotely with near-native GPU performance	sgn8ia, sgn7i-vws, vgn7i-vws
AI inference at scale	Concurrent inference for image recognition, speech recognition, and behavior identification	sgn8ia, sgn7i-vws, vgn7i-vws
Cloud gaming	Real-time GPU rendering for interactive cloud gaming and AR/VR applications	All families
3D visualization	Professional-grade GPU rendering for graphics-intensive workloads	sgn8ia, sgn7i-vws, vgn7i-vws
Deep learning environments	Educational and experimental deep learning environments requiring GPU acceleration	vgn6i-vws

sgn8ia

sgn8ia instances use NVIDIA Lovelace GPUs with large GPU memory and multiple GPU slicing options, paired with AMD Genoa processors running at 3.4 GHz to 3.75 GHz. CPUs are shared resources with an overcommit ratio of approximately 1:1.5. Memory and GPU memory are exclusive to each instance. Available GPU memory ranges from 2 GB to 48 GB (full GPU).

For workloads that require dedicated CPUs, use gn7i GPU-accelerated compute-optimized instances instead.

GPU: NVIDIA Lovelace — supports vGPU, RTX, and TensorRT

CPU: AMD Genoa, 3.4 GHz to 3.75 GHz (shared, ~1:1.5 overcommit)

Storage: I/O optimized; supports Enterprise SSDs (ESSDs) and ESSD AutoPL disks

Network: Supports IPv4 and IPv6

Use cases:

Concurrent AI inference — image recognition, speech recognition, behavior identification
Compute-intensive graphics processing — remote graphic design, cloud gaming
3D modeling — animation, film production, cloud gaming, mechanical design

Instance types

Instance type	vCPUs	Memory (GiB)	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4/IPv6 addresses per ENI	Maximum disks	Disk baseline IOPS	Disk baseline BPS (MB/s)
ecs.sgn8ia-m2.xlarge	4	16	2 GB	2.5	1,000,000	4	4	15/15	9	30,000	244
ecs.sgn8ia-m4.2xlarge	8	32	4 GB	4	1,600,000	8	4	15/15	9	45,000	305
ecs.sgn8ia-m8.4xlarge	16	64	8 GB	7	2,000,000	16	8	30/30	17	60,000	427
ecs.sgn8ia-m16.8xlarge	32	128	16 GB	10	3,000,000	32	8	30/30	33	80,000	610
ecs.sgn8ia-m24.12xlarge	48	192	24 GB	16	4,500,000	48	8	30/30	33	120,000	1,000
ecs.sgn8ia-m48.24xlarge	96	384	48 GB	32	9,000,000	64	15	30/30	33	240,000	2,000

GPU memory values represent vGPU memory allocated using vGPU slicing technology. CPUs are shared with an overcommit ratio of approximately 1:1.5. Memory and GPU memory are exclusive to each instance.
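
As an illustration of how the sgn8ia table might be used for sizing, the sketch below picks the smallest instance type that meets a GPU memory requirement. The vCPU, memory, and GPU memory values are copied from the table; the InstanceType class and the smallest_fit helper are hypothetical names, not part of any Alibaba Cloud SDK.

```python
from typing import NamedTuple, Optional


class InstanceType(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int
    gpu_memory_gb: int


# Values copied from the sgn8ia instance type table.
SGN8IA = [
    InstanceType("ecs.sgn8ia-m2.xlarge", 4, 16, 2),
    InstanceType("ecs.sgn8ia-m4.2xlarge", 8, 32, 4),
    InstanceType("ecs.sgn8ia-m8.4xlarge", 16, 64, 8),
    InstanceType("ecs.sgn8ia-m16.8xlarge", 32, 128, 16),
    InstanceType("ecs.sgn8ia-m24.12xlarge", 48, 192, 24),
    InstanceType("ecs.sgn8ia-m48.24xlarge", 96, 384, 48),
]


def smallest_fit(min_gpu_gb: int, min_vcpus: int = 0,
                 min_memory_gib: int = 0) -> Optional[InstanceType]:
    """Return the smallest type meeting all minimums, or None if none does."""
    candidates = [
        t for t in SGN8IA
        if t.gpu_memory_gb >= min_gpu_gb
        and t.vcpus >= min_vcpus
        and t.memory_gib >= min_memory_gib
    ]
    return min(candidates, key=lambda t: t.gpu_memory_gb, default=None)


# A workload needing at least 6 GB of GPU memory lands on the 8 GB type.
print(smallest_fit(6))
```

Because sgn8ia vCPUs are shared, a vCPU minimum chosen this way is only a rough guide; the sketch does not model the overcommit ratio.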

sgn7i-vws

sgn7i-vws instances use NVIDIA A10 GPUs (NVIDIA Ampere architecture) paired with Intel Xeon Scalable processors (Ice Lake, 2.9 GHz base, 3.5 GHz all-core turbo). CPU and network resources are shared to maximize utilization. Memory and GPU memory are exclusive to each instance. Available GPU memory ranges from 1/12 of an A10 GPU (2 GB) to 1/3 of an A10 GPU (8 GB).

For workloads that require dedicated CPUs, use vgn7i-vws instead.

GPU: NVIDIA A10 (Ampere architecture) — supports vGPU, RTX, and TensorRT

CPU: Intel Xeon Scalable (Ice Lake), 2.9 GHz base / 3.5 GHz all-core turbo (shared)

Storage: I/O optimized; supports ESSDs and ESSD AutoPL disks

Network: Supports IPv4 and IPv6; network resources are shared

Use cases:

Concurrent AI inference — image recognition, speech recognition, behavior identification
Compute-intensive graphics processing — remote graphic design, cloud gaming
3D modeling — animation, film production, cloud gaming, mechanical design

Instance types

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline/burst bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.sgn7i-vws-m2.xlarge	4	15.5	NVIDIA A10 * 1/12	24GB * 1/12	1.5/5	500,000	4	2	2	1
ecs.sgn7i-vws-m4.2xlarge	8	31	NVIDIA A10 * 1/6	24GB * 1/6	2.5/10	1,000,000	4	4	6	1
ecs.sgn7i-vws-m8.4xlarge	16	62	NVIDIA A10 * 1/3	24GB * 1/3	5/20	2,000,000	8	4	10	1
ecs.sgn7i-vws-m2s.xlarge	4	8	NVIDIA A10 * 1/12	24GB * 1/12	1.5/5	500,000	4	2	2	1
ecs.sgn7i-vws-m4s.2xlarge	8	16	NVIDIA A10 * 1/6	24GB * 1/6	2.5/10	1,000,000	4	4	6	1
ecs.sgn7i-vws-m8s.4xlarge	16	32	NVIDIA A10 * 1/3	24GB * 1/3	5/20	2,000,000	8	4	10	1

The GPUs column shows the GPU model and partition allocated per instance. NVIDIA A10 * 1/12 means each instance receives one-twelfth of an NVIDIA A10 GPU as a vGPU. CPU and network resources are shared; memory and GPU memory are exclusive to each instance.
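
The network column in the sgn7i-vws table packs two numbers into one cell, baseline and burst bandwidth in Gbit/s. A minimal sketch for splitting such cells is shown here; parse_bandwidth is a hypothetical helper, and the cell strings are copied from the table.

```python
def parse_bandwidth(cell: str) -> tuple[float, float]:
    """Split a 'baseline/burst' cell such as '1.5/5' into two floats (Gbit/s)."""
    baseline, burst = (float(part) for part in cell.split("/"))
    return baseline, burst


# Cells copied from the sgn7i-vws instance type table.
BANDWIDTH_CELLS = {
    "ecs.sgn7i-vws-m2.xlarge": "1.5/5",
    "ecs.sgn7i-vws-m4.2xlarge": "2.5/10",
    "ecs.sgn7i-vws-m8.4xlarge": "5/20",
}

for name, cell in BANDWIDTH_CELLS.items():
    baseline, burst = parse_bandwidth(cell)
    print(f"{name}: baseline {baseline} Gbit/s, burst up to {burst} Gbit/s")
```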

vgn7i-vws

vgn7i-vws instances use NVIDIA A10 GPUs (NVIDIA Ampere architecture) paired with Intel Xeon Scalable processors (Ice Lake, 2.9 GHz base, 3.5 GHz all-core turbo). Unlike sgn7i-vws, CPU resources are dedicated to each instance. Available GPU memory ranges from 1/6 of an A10 GPU (4 GB) to a full A10 GPU (24 GB).

GPU: NVIDIA A10 (Ampere architecture) — supports vGPU, RTX, and TensorRT

CPU: Intel Xeon Scalable (Ice Lake), 2.9 GHz base / 3.5 GHz all-core turbo (dedicated)

Storage: I/O optimized; supports ESSDs and ESSD AutoPL disks

Network: Supports IPv4 and IPv6

Use cases:

Concurrent AI inference — image recognition, speech recognition, behavior identification
Compute-intensive graphics processing — remote graphic design, cloud gaming
3D modeling — animation, film production, cloud gaming, mechanical design

Instance types

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.vgn7i-vws-m4.xlarge	4	30	NVIDIA A10 * 1/6	24GB * 1/6	3	1,000,000	4	4	10	1
ecs.vgn7i-vws-m8.2xlarge	10	62	NVIDIA A10 * 1/3	24GB * 1/3	5	2,000,000	8	6	10	1
ecs.vgn7i-vws-m12.3xlarge	14	93	NVIDIA A10 * 1/2	24GB * 1/2	8	3,000,000	8	6	15	1
ecs.vgn7i-vws-m24.7xlarge	30	186	NVIDIA A10 * 1	24GB * 1	16	6,000,000	12	8	30	1

The GPUs column shows the GPU model and partition allocated per instance. NVIDIA A10 * 1/6 means each instance receives one-sixth of an NVIDIA A10 GPU as a vGPU. CPU resources are dedicated; memory and GPU memory are exclusive to each instance.
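
The partition fraction also gives an upper bound on how many instances of one type can share a physical GPU. The sketch below is arithmetic only and rests on an assumption: that a GPU is divided into equal-size partitions. This page does not say whether partitions of different sizes can be mixed on one GPU. The helper name max_partitions_per_gpu is hypothetical.

```python
from fractions import Fraction

# GPU shares copied from the vgn7i-vws instance type table.
VGN7I_VWS_SHARES = {
    "ecs.vgn7i-vws-m4.xlarge": Fraction(1, 6),
    "ecs.vgn7i-vws-m8.2xlarge": Fraction(1, 3),
    "ecs.vgn7i-vws-m12.3xlarge": Fraction(1, 2),
    "ecs.vgn7i-vws-m24.7xlarge": Fraction(1, 1),
}


def max_partitions_per_gpu(share: Fraction) -> int:
    """Upper bound on equal-size partitions that fit in one physical GPU."""
    return 1 // share


for name, share in VGN7I_VWS_SHARES.items():
    print(name, max_partitions_per_gpu(share))
```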

vgn6i-vws

Important

vgn6i-vws is the upgraded version of vgn6i, updated to use the latest NVIDIA GRID driver with an NVIDIA GRID vWS license.

Free images with the driver pre-installed: Submit a ticket to request a pre-installed image.
Custom images without the driver: Submit a ticket to apply for the driver file. Alibaba Cloud does not charge additional license fees.

vgn6i-vws instances use NVIDIA T4 GPUs paired with Intel Xeon Platinum 8163 processors (Skylake, 2.5 GHz). The CPU-to-memory ratio is approximately 1:5. Each instance receives 1/4, 1/2, or all of an NVIDIA T4 GPU, with 4 GB, 8 GB, or 16 GB of GPU memory respectively.

GPU: NVIDIA T4, available as 1/4, 1/2, or a full GPU (4 GB, 8 GB, or 16 GB of GPU memory per instance)

CPU: Intel Xeon Platinum 8163 (Skylake), 2.5 GHz

Storage: I/O optimized; supports standard SSDs and ultra disks

Network: Supports IPv4 and IPv6

Use cases:

Real-time rendering for cloud gaming
Real-time rendering for Augmented Reality (AR) and Virtual Reality (VR) applications
AI inference — deep learning and machine learning for elastic internet service deployment
Deep learning educational and experimental environments

Instance types

Instance type	vCPUs	Memory (GiB)	GPUs	GPU memory	Network baseline bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs	Private IPv4 addresses per ENI	IPv6 addresses per ENI
ecs.vgn6i-m4-vws.xlarge	4	23	NVIDIA T4 * 1/4	16GB * 1/4	2	500,000	4/2	3	10	1
ecs.vgn6i-m8-vws.2xlarge	10	46	NVIDIA T4 * 1/2	16GB * 1/2	4	800,000	8/2	4	10	1
ecs.vgn6i-m16-vws.5xlarge	20	92	NVIDIA T4 * 1	16GB * 1	7.5	1,200,000	6	4	10	1

The GPUs column shows the GPU model and partition allocated per instance. NVIDIA T4 * 1/4 means each instance receives one-quarter of an NVIDIA T4 GPU as a vGPU.