vGPU-accelerated instance families (vgn and sgn series)

This page describes the features and instance types of the following vGPU-accelerated instance families in Elastic GPU Service (EGS):

  • sgn8ia: Latest-generation vGPU instances powered by NVIDIA Lovelace GPUs and AMD Genoa processors

  • sgn7i-vws: Cost-efficient vGPU instances with shared CPUs and NVIDIA A10 GPUs

  • vgn7i-vws: Dedicated-CPU vGPU instances with NVIDIA A10 GPUs

  • vgn6i-vws: Previous-generation vGPU instances with NVIDIA T4 GPUs (upgraded from vgn6i)

All families run on the third-generation SHENLONG architecture with fast path acceleration, delivering storage, network, and compute stability improvements of an order of magnitude over traditional virtualization. Each family includes an NVIDIA GRID virtual workstation (vWS) license, which provides certified graphics acceleration for computer-aided design (CAD) software and other professional graphics applications.

Family comparison

| Family | GPU | CPU | CPU allocation | vGPU range | Storage |
| --- | --- | --- | --- | --- | --- |
| sgn8ia | NVIDIA Lovelace | AMD Genoa, 3.4–3.75 GHz | Shared (~1:1.5 overcommit) | 2 GB to 48 GB GPU memory | ESSDs, ESSD AutoPL disks |
| sgn7i-vws | NVIDIA A10 (Ampere) | Intel Xeon Ice Lake, 2.9/3.5 GHz | Shared | 1/12 to 1/3 of A10 (2–8 GB GPU memory) | ESSDs, ESSD AutoPL disks |
| vgn7i-vws | NVIDIA A10 (Ampere) | Intel Xeon Ice Lake, 2.9/3.5 GHz | Dedicated | 1/6 to full A10 (4–24 GB GPU memory) | ESSDs, ESSD AutoPL disks |
| vgn6i-vws | NVIDIA T4 | Intel Xeon Platinum 8163 (Skylake), 2.5 GHz | Dedicated | 1/4 to full T4 (4–16 GB GPU memory) | Standard SSDs, ultra disks |
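
The comparison above can be used as a quick shortlist filter. The following is a minimal sketch, not an official tool: the `Family` class and the `candidate_families` helper are hypothetical names introduced here for illustration, and the values are transcribed from the family comparison table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    name: str
    gpu: str
    dedicated_cpu: bool      # True = dedicated CPUs, False = shared CPUs
    max_gpu_mem_gb: int      # largest vGPU size offered by the family


# Values transcribed from the family comparison table (illustrative only).
FAMILIES = [
    Family("sgn8ia", "NVIDIA Lovelace", False, 48),
    Family("sgn7i-vws", "NVIDIA A10", False, 8),
    Family("vgn7i-vws", "NVIDIA A10", True, 24),
    Family("vgn6i-vws", "NVIDIA T4", True, 16),
]


def candidate_families(gpu_mem_gb, need_dedicated_cpu=False):
    """Return names of families that offer at least gpu_mem_gb of GPU memory.

    If need_dedicated_cpu is True, families with shared CPUs are excluded.
    """
    return [
        f.name
        for f in FAMILIES
        if f.max_gpu_mem_gb >= gpu_mem_gb
        and (f.dedicated_cpu or not need_dedicated_cpu)
    ]


if __name__ == "__main__":
    print(candidate_families(16, need_dedicated_cpu=True))
    print(candidate_families(32))
```

The filter only narrows the choice by GPU memory and CPU allocation. Storage type, network bandwidth, and regional availability still need to be checked against the instance type tables.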

How vGPU slicing works

Each physical GPU is sliced into multiple GPU partitions. Each partition is allocated as a vGPU to a single instance. The GPUs column in the instance type tables uses the format <GPU model> * <fraction> to show both the GPU model and the partition size allocated to each instance. For example, NVIDIA A10 * 1/6 means the instance receives one-sixth of an NVIDIA A10 GPU.
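
To make the `<GPU model> * <fraction>` notation concrete, here is a small sketch that parses a table cell and works out the resulting vGPU memory and the number of equal partitions per physical GPU. The function names `parse_partition`, `vgpu_memory_gb`, and `partitions_per_gpu` are hypothetical and exist only for this illustration.

```python
import re
from fractions import Fraction

# Matches "<label> * <numerator>[/<denominator>]", e.g. "NVIDIA A10 * 1/6".
_CELL = re.compile(r"^\s*(?P<label>.+?)\s*\*\s*(?P<num>\d+)(?:/(?P<den>\d+))?\s*$")


def parse_partition(cell):
    """Split a table cell into (label, share of the physical GPU)."""
    match = _CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognized partition notation: {cell!r}")
    share = Fraction(int(match["num"]), int(match["den"] or 1))
    return match["label"], share


def vgpu_memory_gb(cell):
    """Compute vGPU memory in GB from a cell such as '24GB * 1/6'."""
    label, share = parse_partition(cell)
    size = re.fullmatch(r"(\d+)\s*GB", label, flags=re.IGNORECASE)
    if size is None:
        raise ValueError(f"expected a size such as '24GB', got {label!r}")
    return int(size.group(1)) * share


def partitions_per_gpu(cell):
    """Number of equal partitions of this size that make up one physical GPU."""
    _, share = parse_partition(cell)
    return 1 / share


if __name__ == "__main__":
    print(parse_partition("NVIDIA A10 * 1/6"))   # model and share
    print(vgpu_memory_gb("24GB * 1/6"))          # memory for a 1/6 slice
    print(partitions_per_gpu("NVIDIA A10 * 1/6"))
```

`Fraction` is used instead of floating-point division so that values such as 1/12 of 24 GB come out as exact integers.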

Use cases

| Use case | Description | Recommended families |
| --- | --- | --- |
| Remote graphics and virtual workstations | Graphic design, CAD, animation, film production, and mechanical design, accessed remotely with near-native GPU performance | sgn8ia, sgn7i-vws, vgn7i-vws |
| AI inference at scale | Concurrent inference for image recognition, speech recognition, and behavior identification | sgn8ia, sgn7i-vws, vgn7i-vws |
| Cloud gaming | Real-time GPU rendering for interactive cloud gaming and AR/VR applications | All families |
| 3D visualization | Professional-grade GPU rendering for graphics-intensive workloads | sgn8ia, sgn7i-vws, vgn7i-vws |
| Deep learning environments | Educational and experimental deep learning environments requiring GPU acceleration | vgn6i-vws |

sgn8ia

sgn8ia instances use NVIDIA Lovelace GPUs with large GPU memory and multiple GPU slicing options, paired with AMD Genoa processors running at 3.4 GHz to 3.75 GHz. CPUs are shared resources with an overcommit ratio of approximately 1:1.5. Memory and GPU memory are exclusive to each instance. Available GPU memory ranges from 2 GB to 48 GB (full GPU).

For workloads that require dedicated CPUs, use gn7i GPU-accelerated compute-optimized instances instead.

GPU: NVIDIA Lovelace — supports vGPU, RTX, and TensorRT

CPU: AMD Genoa, 3.4 GHz to 3.75 GHz (shared, ~1:1.5 overcommit)

Storage: I/O optimized; supports Enterprise SSDs (ESSDs) and ESSD AutoPL disks

Network: Supports IPv4 and IPv6

Use cases:

  • Concurrent AI inference — image recognition, speech recognition, behavior identification

  • Compute-intensive graphics processing — remote graphic design, cloud gaming

  • 3D modeling — animation, film production, cloud gaming, mechanical design

Instance types

| Instance type | vCPUs | Memory (GiB) | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4/IPv6 addresses per ENI | Maximum disks | Disk baseline IOPS | Disk baseline BPS (MB/s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.sgn8ia-m2.xlarge | 4 | 16 | 2 GB | 2.5 | 1,000,000 | 4 | 4 | 15/15 | 9 | 30,000 | 244 |
| ecs.sgn8ia-m4.2xlarge | 8 | 32 | 4 GB | 4 | 1,600,000 | 8 | 4 | 15/15 | 9 | 45,000 | 305 |
| ecs.sgn8ia-m8.4xlarge | 16 | 64 | 8 GB | 7 | 2,000,000 | 16 | 8 | 30/30 | 17 | 60,000 | 427 |
| ecs.sgn8ia-m16.8xlarge | 32 | 128 | 16 GB | 10 | 3,000,000 | 32 | 8 | 30/30 | 33 | 80,000 | 610 |
| ecs.sgn8ia-m24.12xlarge | 48 | 192 | 24 GB | 16 | 4,500,000 | 48 | 8 | 30/30 | 33 | 120,000 | 1,000 |
| ecs.sgn8ia-m48.24xlarge | 96 | 384 | 48 GB | 32 | 9,000,000 | 64 | 15 | 30/30 | 33 | 24,000 | 2,000 |

GPU memory values represent vGPU memory allocated using vGPU slicing technology. CPUs are shared with an overcommit ratio of approximately 1:1.5. Memory and GPU memory are exclusive to each instance.
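
When sizing a workload, the table can be scanned from the smallest type upward until every requirement is met. The sketch below shows one way to do that for sgn8ia. The `smallest_fit` helper is a hypothetical name, and only the vCPU, memory, and GPU memory columns are transcribed from the table above.

```python
# (instance type, vCPUs, memory in GiB, GPU memory in GB), smallest first.
# Transcribed from the sgn8ia instance type table; other columns omitted.
SGN8IA_TYPES = [
    ("ecs.sgn8ia-m2.xlarge", 4, 16, 2),
    ("ecs.sgn8ia-m4.2xlarge", 8, 32, 4),
    ("ecs.sgn8ia-m8.4xlarge", 16, 64, 8),
    ("ecs.sgn8ia-m16.8xlarge", 32, 128, 16),
    ("ecs.sgn8ia-m24.12xlarge", 48, 192, 24),
    ("ecs.sgn8ia-m48.24xlarge", 96, 384, 48),
]


def smallest_fit(vcpus, memory_gib, gpu_mem_gb):
    """Return the first (smallest) type meeting all three minimums, or None."""
    for name, type_vcpus, type_mem, type_gpu_mem in SGN8IA_TYPES:
        if (
            type_vcpus >= vcpus
            and type_mem >= memory_gib
            and type_gpu_mem >= gpu_mem_gb
        ):
            return name
    return None


if __name__ == "__main__":
    # A workload needing 12 GB of GPU memory must round up to the 16 GB type.
    print(smallest_fit(vcpus=4, memory_gib=16, gpu_mem_gb=12))
```

Because sgn8ia vCPUs are shared, the vCPU count here is a nominal figure. Workloads that are sensitive to CPU contention may need additional headroom or a dedicated-CPU family.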

sgn7i-vws

sgn7i-vws instances use NVIDIA A10 GPUs (NVIDIA Ampere architecture) paired with Intel Xeon Scalable processors (Ice Lake, 2.9 GHz base, 3.5 GHz all-core turbo). CPU and network resources are shared to maximize utilization. Memory and GPU memory are exclusive to each instance. Available GPU memory ranges from 1/12 of an A10 GPU (2 GB) to 1/3 of an A10 GPU (8 GB).

For workloads that require dedicated CPUs, use vgn7i-vws instead.

GPU: NVIDIA A10 (Ampere architecture) — supports vGPU, RTX, and TensorRT

CPU: Intel Xeon Scalable (Ice Lake), 2.9 GHz base / 3.5 GHz all-core turbo (shared)

Storage: I/O optimized; supports ESSDs and ESSD AutoPL disks

Network: Supports IPv4 and IPv6 (shared)

Use cases:

  • Concurrent AI inference — image recognition, speech recognition, behavior identification

  • Compute-intensive graphics processing — remote graphic design, cloud gaming

  • 3D modeling — animation, film production, cloud gaming, mechanical design

Instance types

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline/burst bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.sgn7i-vws-m2.xlarge | 4 | 15.5 | NVIDIA A10 * 1/12 | 24 GB * 1/12 | 1.5/5 | 500,000 | 4 | 2 | 2 | 1 |
| ecs.sgn7i-vws-m4.2xlarge | 8 | 31 | NVIDIA A10 * 1/6 | 24 GB * 1/6 | 2.5/10 | 1,000,000 | 4 | 4 | 6 | 1 |
| ecs.sgn7i-vws-m8.4xlarge | 16 | 62 | NVIDIA A10 * 1/3 | 24 GB * 1/3 | 5/20 | 2,000,000 | 8 | 4 | 10 | 1 |
| ecs.sgn7i-vws-m2s.xlarge | 4 | 8 | NVIDIA A10 * 1/12 | 24 GB * 1/12 | 1.5/5 | 500,000 | 4 | 2 | 2 | 1 |
| ecs.sgn7i-vws-m4s.2xlarge | 8 | 16 | NVIDIA A10 * 1/6 | 24 GB * 1/6 | 2.5/10 | 1,000,000 | 4 | 4 | 6 | 1 |
| ecs.sgn7i-vws-m8s.4xlarge | 16 | 32 | NVIDIA A10 * 1/3 | 24 GB * 1/3 | 5/20 | 2,000,000 | 8 | 4 | 10 | 1 |

The GPUs column shows the GPU model and partition allocated per instance. NVIDIA A10 * 1/12 means each instance receives one-twelfth of an NVIDIA A10 GPU as a vGPU. CPU and network resources are shared; memory and GPU memory are exclusive to each instance.

vgn7i-vws

vgn7i-vws instances use NVIDIA A10 GPUs (NVIDIA Ampere architecture) paired with Intel Xeon Scalable processors (Ice Lake, 2.9 GHz base, 3.5 GHz all-core turbo). Unlike sgn7i-vws, CPU resources are dedicated to each instance. Available GPU memory ranges from 1/6 of an A10 GPU (4 GB) to a full A10 GPU (24 GB).

GPU: NVIDIA A10 (Ampere architecture) — supports vGPU, RTX, and TensorRT

CPU: Intel Xeon Scalable (Ice Lake), 2.9 GHz base / 3.5 GHz all-core turbo (dedicated)

Storage: I/O optimized; supports ESSDs and ESSD AutoPL disks

Network: Supports IPv4 and IPv6

Use cases:

  • Concurrent AI inference — image recognition, speech recognition, behavior identification

  • Compute-intensive graphics processing — remote graphic design, cloud gaming

  • 3D modeling — animation, film production, cloud gaming, mechanical design

Instance types

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.vgn7i-vws-m4.xlarge | 4 | 30 | NVIDIA A10 * 1/6 | 24 GB * 1/6 | 3 | 1,000,000 | 4 | 4 | 10 | 1 |
| ecs.vgn7i-vws-m8.2xlarge | 10 | 62 | NVIDIA A10 * 1/3 | 24 GB * 1/3 | 5 | 2,000,000 | 8 | 6 | 10 | 1 |
| ecs.vgn7i-vws-m12.3xlarge | 14 | 93 | NVIDIA A10 * 1/2 | 24 GB * 1/2 | 8 | 3,000,000 | 8 | 6 | 15 | 1 |
| ecs.vgn7i-vws-m24.7xlarge | 30 | 186 | NVIDIA A10 * 1 | 24 GB * 1 | 16 | 6,000,000 | 12 | 8 | 30 | 1 |

The GPUs column shows the GPU model and partition allocated per instance. NVIDIA A10 * 1/6 means each instance receives one-sixth of an NVIDIA A10 GPU as a vGPU. CPU resources are dedicated; memory and GPU memory are exclusive to each instance.

vgn6i-vws

Important

vgn6i-vws is the upgraded version of vgn6i, updated to use the latest NVIDIA GRID driver with an NVIDIA GRID vWS license.

  • Free images with the driver pre-installed: Submit a ticket to request a pre-installed image.

  • Custom images without the driver: Submit a ticket to apply for the driver file. Alibaba Cloud does not charge additional license fees.

vgn6i-vws instances use NVIDIA T4 GPUs paired with Intel Xeon Platinum 8163 processors (Skylake, 2.5 GHz). The CPU-to-memory ratio is approximately 1:5. Each vGPU provides 1/4, 1/2, or the full compute capacity of an NVIDIA T4 GPU, with 4 GB, 8 GB, or 16 GB of GPU memory respectively.

GPU: NVIDIA T4, available as 1/4, 1/2, or full GPU partitions with 4 GB, 8 GB, or 16 GB of GPU memory per vGPU instance

CPU: Intel Xeon Platinum 8163 (Skylake), 2.5 GHz

Storage: I/O optimized; supports standard SSDs and ultra disks

Network: Supports IPv4 and IPv6

Use cases:

  • Real-time rendering for cloud gaming

  • Real-time rendering for Augmented Reality (AR) and Virtual Reality (VR) applications

  • AI inference — deep learning and machine learning for elastic internet service deployment

  • Deep learning educational and experimental environments

Instance types

| Instance type | vCPUs | Memory (GiB) | GPUs | GPU memory | Network baseline bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IPv4 addresses per ENI | IPv6 addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.vgn6i-m4-vws.xlarge | 4 | 23 | NVIDIA T4 * 1/4 | 16 GB * 1/4 | 2 | 500,000 | 4/2 | 3 | 10 | 1 |
| ecs.vgn6i-m8-vws.2xlarge | 10 | 46 | NVIDIA T4 * 1/2 | 16 GB * 1/2 | 4 | 800,000 | 8/2 | 4 | 10 | 1 |
| ecs.vgn6i-m16-vws.5xlarge | 20 | 92 | NVIDIA T4 * 1 | 16 GB * 1 | 7.5 | 1,200,000 | 6 | 4 | 10 | 1 |

The GPUs column shows the GPU model and partition allocated per instance. NVIDIA T4 * 1/4 means each instance receives one-quarter of an NVIDIA T4 GPU as a vGPU.