Elastic GPU Service is a GPU-accelerated computing service that lets you use and scale GPU computing resources on demand. As a part of the Alibaba Cloud elastic computing family, it combines the power of both GPUs and CPUs. This combination meets your needs in scenarios such as artificial intelligence, high-performance computing, and professional graphics and image editing. For example, Elastic GPU Service can significantly improve the computational efficiency of parallel computing.
Why choose Elastic GPU Service
Alibaba Cloud Elastic GPU Service provides compute instances that support both GPU and CPU applications. GPUs have unique advantages in complex mathematical and geometric calculations. For floating-point and parallel operations in particular, they can deliver up to hundreds of times the computing power of CPUs. GPUs have the following features:
They have many arithmetic logic units (ALUs) that excel at handling large-scale concurrent computing.
They support high-throughput, multi-threaded parallel operations.
Their logic control units are relatively simple.
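As a rough illustration of why a large number of ALUs helps with large-scale concurrent computing, the following Python sketch uses a deliberately simplified model: every ALU processes one element per pass, and all elements are independent of each other. The function names and the ALU counts (8 and 4,096) are hypothetical values chosen for this example, not specifications of any CPU or GPU. The model ignores memory bandwidth, clock speed, and scheduling overhead.

```python
def passes_needed(num_elements: int, num_alus: int) -> int:
    """Return how many lock-step passes are needed to process all elements
    when each ALU handles exactly one element per pass (toy model)."""
    if num_elements < 0 or num_alus <= 0:
        raise ValueError("num_elements must be >= 0 and num_alus must be > 0")
    # Integer ceiling division: the last pass may be only partially filled.
    return (num_elements + num_alus - 1) // num_alus


def relative_speedup(num_elements: int, few_alus: int, many_alus: int) -> float:
    """Ratio of passes needed with few ALUs to passes needed with many ALUs."""
    return passes_needed(num_elements, few_alus) / passes_needed(num_elements, many_alus)


if __name__ == "__main__":
    elements = 1_000_000   # hypothetical workload size
    cpu_like_alus = 8      # hypothetical "few wide units" device
    gpu_like_alus = 4096   # hypothetical "many simple units" device
    print("passes with few ALUs: ", passes_needed(elements, cpu_like_alus))
    print("passes with many ALUs:", passes_needed(elements, gpu_like_alus))
    print("relative speedup:     ", round(relative_speedup(elements, cpu_like_alus, gpu_like_alus), 1))
```

Under this model the benefit comes only from the number of independent elements that can be processed at once, which is why workloads with little parallelism gain much less from a GPU.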
The following table compares Elastic GPU Service with self-managed GPU servers.
Comparison Item | Elastic GPU Service | Self-managed GPU server |
Flexibility | | |
Ease of use | | |
Disaster recovery and backup | | |
Security | | |
Cost | | |
GPU-accelerated instance families
An instance is the smallest unit that provides computing services for your business. Different instance types offer different computing capabilities. Elastic Compute Service (ECS) instances are grouped into instance families based on business and usage scenarios. GPU-accelerated instances are a type of ECS instance: they provide GPU acceleration while offering the same user experience as regular ECS instances. When you create an ECS instance, you can select a GPU instance type from the enterprise-level heterogeneous computing instance family, the ECS Bare Metal Instance family, or the Super Computing Cluster (SCC) instance family.
For more information about GPU instance types, see Instance families.
Benefits
- Broad Coverage: Alibaba Cloud Elastic GPU Service is deployed across multiple regions worldwide. Combined with auto provisioning and Auto Scaling, it can absorb sudden bursts in business demand.
- Powerful Computing Capabilities: Alibaba Cloud Elastic GPU Service uses industry-leading GPU cards and high-performance CPU platforms. A single instance delivers up to 1000 TFLOPS of mixed precision computing performance.
- Excellent Network Performance: Elastic GPU Service instances use the VPC network, which supports up to 4.5 million PPS and 32 Gbit/s of internal bandwidth. In Super Computing Cluster (SCC) products, nodes provide an additional RDMA network of up to 50 Gbit/s, which meets the low-latency, high-bandwidth requirements of inter-node data transmission.
- Flexible Purchasing Options: Elastic GPU Service supports subscription, pay-as-you-go, spot instances, reserved instances, and Storage Capacity Units. You can purchase only the resources you need and avoid waste.
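To put the bandwidth figures above in perspective, the following Python sketch estimates how long a payload would take to cross a link at the stated peak rates of 32 Gbit/s (VPC internal bandwidth) and 50 Gbit/s (SCC RDMA network). The function name, the 4 GB payload, and the `efficiency` parameter are assumptions made for this illustration. Real transfers rarely sustain the peak rate, so treat the result as a lower bound rather than a measurement.

```python
def transfer_seconds(payload_gb: float, link_gbit_per_s: float, efficiency: float = 1.0) -> float:
    """Estimate the time to move `payload_gb` gigabytes (10**9 bytes each)
    over a link with the given peak rate in Gbit/s.

    `efficiency` (0 < efficiency <= 1) is a hypothetical factor for protocol
    overhead; 1.0 means the link runs at its full peak rate.
    """
    if payload_gb < 0 or link_gbit_per_s <= 0 or not (0 < efficiency <= 1):
        raise ValueError("invalid payload, link rate, or efficiency")
    payload_gbit = payload_gb * 8  # 1 byte = 8 bits
    return payload_gbit / (link_gbit_per_s * efficiency)


if __name__ == "__main__":
    payload = 4.0  # hypothetical payload size in GB
    for label, rate in (("VPC internal, 32 Gbit/s", 32.0), ("SCC RDMA, 50 Gbit/s", 50.0)):
        print(f"{label}: {transfer_seconds(payload, rate):.2f} s")
```

The estimate covers bandwidth only. The latency advantage of RDMA is not modeled here.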
Alibaba Cloud also provides the DeepGPU toolkit to use with Elastic GPU Service. The DeepGPU toolkit enhances GPU computing services and helps you use GPU resources on Alibaba Cloud more easily and efficiently. For more information, see Benefits of DeepGPU.
Billing
Elastic GPU Service is billed in the same way as Elastic Compute Service (ECS). You are billed for compute resources (vCPUs, memory, and GPUs), images, block storage, public bandwidth, and snapshots.
Common billing methods are as follows:
Subscription: Purchase resources for a specific period. You pay before you use the resources.
Pay-as-you-go: Create and release resources on demand. You pay after you use the resources.
Spot instance: Bid for idle compute resources. Spot instances are available at a discount compared to pay-as-you-go instances, but they can be reclaimed by the system.
Reserved instance: A voucher that you can use with pay-as-you-go instances. You commit to using an instance with a specific configuration, including instance type, region, and zone. In return, you receive a discount on your compute resource bills.
Savings plan: A discount plan that you can use with pay-as-you-go instances. You commit to a consistent amount of resource usage, measured in CNY per hour. In return, you receive a discount on your bills for resources such as compute resources and system disks.
Storage Capacity Unit: A resource plan that you can use with pay-as-you-go storage products. You commit to a specific amount of storage capacity. In return, you receive a discount on your bills for resources such as Elastic Block Storage, NAS, and OSS.
For more information about Elastic GPU Service billing, see Elastic GPU Service billing.
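The trade-off between paying upfront for a period (subscription) and paying per use (pay-as-you-go) can be sketched as a simple break-even calculation. The Python example below is a minimal illustration only: the prices are invented placeholders in arbitrary currency units, the function names are hypothetical, and the sketch ignores discounts from reserved instances, savings plans, and spot pricing as well as charges for storage and bandwidth.

```python
def pay_as_you_go_cost(hours_used: float, hourly_price: float) -> float:
    """Cost when you pay only for the hours actually used."""
    return hours_used * hourly_price


def break_even_hours(subscription_price: float, hourly_price: float) -> float:
    """Hours of use per period at which both billing methods cost the same."""
    if hourly_price <= 0:
        raise ValueError("hourly_price must be > 0")
    return subscription_price / hourly_price


def cheaper_option(hours_used: float, subscription_price: float, hourly_price: float) -> str:
    """Name the cheaper billing method for the expected usage in one period."""
    if pay_as_you_go_cost(hours_used, hourly_price) < subscription_price:
        return "pay-as-you-go"
    return "subscription"


if __name__ == "__main__":
    monthly_subscription = 3000.0  # hypothetical price per month
    hourly = 10.0                  # hypothetical pay-as-you-go price per hour
    print("break-even hours per month:", break_even_hours(monthly_subscription, hourly))
    for hours in (100, 500):
        print(hours, "hours ->", cheaper_option(hours, monthly_subscription, hourly))
```

With these placeholder prices, short or irregular workloads favor pay-as-you-go, while instances that run for most of the month favor subscription. Check the actual prices for your region and instance type before you decide.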
Related toolkits
Alibaba Cloud provides the DeepGPU toolkit to help you use GPU resources more efficiently. The DeepGPU toolkit includes the following main tools:
Tool Name | Description |
 | An AI accelerator developed by Alibaba Cloud. It provides training and inference acceleration for generative AI and large language model (LLM) scenarios. |
 | An AI communication acceleration library developed by Alibaba Cloud for multi-GPU interconnection. It improves communication efficiency in distributed AI training and multi-card inference tasks. |
 | An LLM inference engine developed by Alibaba Cloud. It provides high-performance inference for LLM tasks. |
 | A container sharing technology from Alibaba Cloud that is based on kernel-level virtual GPU isolation. It isolates GPU resources so that multiple containers can share a single GPU card. |
 | A tool from Alibaba Cloud for building artificial intelligence (AI) computing tasks. It provides convenient interfaces and command lines for building AI computing tasks on Alibaba Cloud IaaS resources. |
For more information about the DeepGPU toolkit, see What is DeepGPU?