What is Elastic GPU Service


Elastic GPU Service is a GPU-accelerated computing service that lets you use and scale GPU computing resources on demand. As part of the Alibaba Cloud elastic computing family, it combines the computing power of GPUs and CPUs to serve scenarios such as artificial intelligence, high-performance computing, and professional graphics and image editing. For example, Elastic GPU Service can significantly improve the efficiency of parallel computing workloads.

Why choose Elastic GPU Service

Alibaba Cloud Elastic GPU Service is a computing service that supports both GPU and CPU applications. GPUs have unique advantages in complex mathematical and geometric calculations. For floating-point and parallel operations in particular, they can provide up to hundreds of times the computing power of CPUs. GPUs have the following characteristics:

  • They have many arithmetic logic units (ALUs) that excel at handling large-scale concurrent computing.

  • They support high-throughput, multi-threaded parallel operations.

  • Their logic control units are relatively simple.

The following comparison shows how Elastic GPU Service differs from self-managed GPU servers for each comparison item.

Comparison item: Flexibility

Elastic GPU Service:

  • You can quickly create one or more Elastic GPU Service instances.

  • You can flexibly change instance types (vCPU, memory, and GPU). Online upgrades and downgrades are supported.

  • You can freely upgrade or downgrade bandwidth.

Self-managed GPU server:

  • Server purchase cycles are long.

  • Server specifications are fixed and cannot be changed flexibly.

  • Bandwidth is purchased upfront and cannot be changed.

Comparison item: Ease of use

Elastic GPU Service:

  • Simple and convenient web-based management.

  • Mainstream operating systems are pre-installed. Windows images are activated with genuine licenses. You can change the operating system online.

  • You can have the GPU driver installed when you purchase an instance.

Self-managed GPU server:

  • No online management tools, which makes maintenance difficult.

  • You must provide, install, and change the operating system yourself.

  • You must purchase and install the GPU driver yourself.

Comparison item: Disaster recovery and backup

Elastic GPU Service:

  • Data is stored in three copies. If one copy is corrupted, it can be quickly recovered.

  • Recovery from hardware failures is automatic and fast.

Self-managed GPU server:

  • You must build your own disaster recovery setup, and ordinary storage devices are expensive.

  • You must repair any data corruption yourself.

Comparison item: Security

Elastic GPU Service:

  • Effectively prevents MAC spoofing and ARP attacks.

  • Protects against DDoS attacks through traffic scrubbing and null routing.

  • Provides additional services such as port intrusion scanning, trojan scanning, and vulnerability scanning.

Self-managed GPU server:

  • MAC spoofing and ARP attacks are difficult to prevent.

  • Traffic scrubbing and null-routing devices must be purchased separately and are expensive.

  • Servers are commonly vulnerable to trojans and port scanning.

Comparison item: Cost

Elastic GPU Service:

  • Supports subscription and pay-as-you-go billing methods. You can choose the method that best suits your business needs.

  • You can purchase resources as needed without a large upfront investment.

Self-managed GPU server:

  • Resources cannot be purchased on demand. You must provision for peak business loads.

  • A large initial investment is required, and idle resources lead to significant waste.

GPU-accelerated instance families

An instance is the smallest unit that provides computing services for your business. Different instance types offer different computing capabilities. ECS instances are categorized into various instance families based on business and usage scenarios. GPU-accelerated instances are a type of ECS instance. They provide GPU acceleration while offering the same user experience as regular ECS instances. When you create an ECS instance, you can select a GPU instance type from the enterprise-level heterogeneous computing instance family, ECS Bare Metal Instance family, or Super Computing Cluster (SCC) instance family.

For more information about GPU instance types, see Instance families.

Benefits

  • Broad Coverage

    Alibaba Cloud Elastic GPU Service is deployed across multiple regions worldwide. Combined with auto provisioning and Auto Scaling, it effectively meets your business’s burst demands.

  • Powerful Computing Capabilities

    Alibaba Cloud Elastic GPU Service uses industry-leading GPU cards and high-performance CPU platforms. A single instance delivers up to 1000 TFLOPS of mixed-precision computing performance.

  • Excellent Network Performance

    Elastic GPU Service instances use the VPC network, which supports up to 4.5 million PPS and 32 Gbit/s of internal bandwidth. Within Super Computing Cluster (SCC) products, nodes provide an additional RDMA network of up to 50 Gbit/s, meeting the low-latency, high-bandwidth requirements for inter-node data transmission.

  • Flexible Purchasing Options

    Elastic GPU Service supports subscription and pay-as-you-go billing, spot instances, reserved instances, and Storage Capacity Units. You can purchase resources as needed to avoid waste.
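
As a rough illustration of what the performance and bandwidth figures above mean in practice, the following Python sketch estimates ideal data transfer and compute times. It uses the peak values quoted above (32 Gbit/s VPC bandwidth, 50 Gbit/s RDMA, 1000 TFLOPS). The workload sizes and the 40% utilization figure are hypothetical values chosen only for illustration, and real results depend on protocol overhead, the instance type, and how well the workload uses the hardware.

```python
def transfer_seconds(size_gib: float, link_gbit_per_s: float) -> float:
    """Ideal time to move size_gib gibibytes over a link, ignoring protocol overhead."""
    bits = size_gib * 1024**3 * 8
    return bits / (link_gbit_per_s * 1e9)


def compute_seconds(total_flop: float, peak_tflops: float, utilization: float = 1.0) -> float:
    """Ideal time to execute total_flop operations at a fraction of peak throughput."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return total_flop / (peak_tflops * 1e12 * utilization)


if __name__ == "__main__":
    dataset_gib = 100      # hypothetical dataset size
    workload_flop = 1e18   # hypothetical number of floating-point operations

    print(f"VPC  (32 Gbit/s): {transfer_seconds(dataset_gib, 32):.1f} s")
    print(f"RDMA (50 Gbit/s): {transfer_seconds(dataset_gib, 50):.1f} s")
    # Assumed 40% utilization; sustained throughput is usually well below peak.
    print(f"Compute at 40% of 1000 TFLOPS: {compute_seconds(workload_flop, 1000, 0.4):.0f} s")
```

These are upper-bound estimates: the quoted figures are maximums, so the numbers are best used to compare options rather than to predict actual run times.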

Alibaba Cloud also provides the DeepGPU toolkit to use with Elastic GPU Service. The DeepGPU toolkit enhances GPU computing services and helps you use GPU resources on Alibaba Cloud more easily and efficiently. For more information, see Benefits of DeepGPU.

Billing

The billing of Elastic GPU Service is the same as that of Elastic Compute Service (ECS). Billing applies to resources such as compute resources (vCPUs, memory, and GPUs), images, block storage, public bandwidth, and snapshots.

Common billing methods are as follows:

  • Subscription: Purchase resources for a specific period. You pay before you use the resources.

  • Pay-as-you-go: Create and release resources on demand. You pay after you use the resources.

  • Spot instance: Bid for idle compute resources. Spot instances are available at a discount compared to pay-as-you-go instances, but they can be reclaimed by the system.

  • Reserved instance: A voucher that you can use with pay-as-you-go instances. You commit to using an instance with a specific configuration, including instance type, region, and zone. In return, you receive a discount on your compute resource bills.

  • Savings plan: A discount plan that you can use with pay-as-you-go instances. You commit to a consistent amount of resource usage, measured in CNY per hour. In return, you receive a discount on your bills for resources such as compute resources and system disks.

  • Storage Capacity Unit: A resource plan that you can use with pay-as-you-go storage products. You commit to a specific amount of storage capacity. In return, you receive a discount on your bills for resources such as Elastic Block Storage, NAS, and OSS.
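
To make the trade-off between the first two billing methods concrete, the following Python sketch compares pay-as-you-go and subscription costs for one billing period. The prices are hypothetical placeholders, not actual Elastic GPU Service prices, and the sketch ignores discounts from spot instances, reserved instances, and savings plans.

```python
def pay_as_you_go_cost(hourly_price: float, hours_used: float) -> float:
    """Cost of running an instance on demand for hours_used hours."""
    return hourly_price * hours_used


def break_even_hours(hourly_price: float, subscription_price: float) -> float:
    """Hours of use per period above which a subscription becomes cheaper."""
    if hourly_price <= 0:
        raise ValueError("hourly_price must be positive")
    return subscription_price / hourly_price


def cheaper_option(hourly_price: float, subscription_price: float, hours_used: float) -> str:
    """Return the cheaper billing method; a tie goes to pay-as-you-go."""
    on_demand = pay_as_you_go_cost(hourly_price, hours_used)
    return "subscription" if subscription_price < on_demand else "pay-as-you-go"


if __name__ == "__main__":
    hourly = 10.0      # hypothetical pay-as-you-go price per hour
    monthly = 3600.0   # hypothetical subscription price per month

    print("Break-even hours per month:", break_even_hours(hourly, monthly))
    for hours in (200, 500):
        print(hours, "hours ->", cheaper_option(hourly, monthly, hours))
```

Under these assumed prices, workloads that run only part of the month favor pay-as-you-go, while instances that run most of the month favor a subscription.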

For more information about Elastic GPU Service billing, see Elastic GPU Service billing.

Related toolkits

Alibaba Cloud provides the DeepGPU toolkit to help you use GPU resources more efficiently. The main components of the DeepGPU toolkit include the following tools:

Note: For more information about the DeepGPU toolkit, see What is DeepGPU?

  • AI Accelerator Deepytorch: An AI accelerator developed by Alibaba Cloud. It provides training and inference acceleration for generative AI and large language model (LLM) scenarios.

  • AI Communication Acceleration Library DeepNCCL: An AI communication acceleration library developed by Alibaba Cloud for multi-GPU interconnection. It improves communication efficiency in distributed AI training and multi-GPU inference tasks.

  • LLM Inference Engine DeepGPU-LLM: An LLM inference engine developed by Alibaba Cloud. It provides high-performance inference services for LLM tasks.

  • GPU Container Sharing Technology cGPU: A container sharing technology from Alibaba Cloud that is based on kernel-level virtual GPU isolation. It isolates GPU resources so that multiple containers can share a single GPU card.

  • Cluster Rapid Deployment Tool FastGPU: A tool from Alibaba Cloud for building artificial intelligence (AI) computing tasks. It provides convenient interfaces and command-line tools for building these tasks on Alibaba Cloud IaaS resources.