An Elastic High Performance Computing (E-HPC) cluster is a group of Elastic Compute Service (ECS) instances that together deliver high-performance computing capabilities. Compared with typical ECS instances, an E-HPC cluster offers more computing power and higher reliability, along with better scalability and availability.
Cluster types
E-HPC offers two cluster types on Alibaba Cloud public cloud. Both use ECS instances as compute nodes.
| Edition | How it works |
|---|---|
| Standard Edition | E-HPC provisions the cluster and installs the scheduler (Slurm or OpenPBS) and service components. You maintain service availability. |
| Managed Edition | E-HPC provisions the cluster and takes over management node operations. The scheduler is Slurm. |
Choose a cluster type
| If you want to... | Use |
|---|---|
| Build an HPC cluster from scratch | Standard Edition |
| Have E-HPC operate the management node for you | Managed Edition |
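The two tables above can be read as a small decision rule. The following is a minimal sketch of that rule, not an E-HPC API: the function name `choose_edition` and its parameters are hypothetical, and the only facts it encodes are the scheduler options and the management responsibility listed in the tables.

```python
# Schedulers offered by each edition, as listed in the edition table.
SUPPORTED_SCHEDULERS = {
    "Standard Edition": {"slurm", "openpbs"},
    "Managed Edition": {"slurm"},
}


def choose_edition(scheduler: str, ehpc_operates_management_node: bool) -> str:
    """Return the edition that matches the inputs (hypothetical helper).

    scheduler: "Slurm" or "OpenPBS" (case-insensitive).
    ehpc_operates_management_node: True if E-HPC should take over
        management node operations, False if you maintain them yourself.
    """
    if ehpc_operates_management_node:
        edition = "Managed Edition"
    else:
        edition = "Standard Edition"

    # Reject combinations that the edition table does not list.
    if scheduler.lower() not in SUPPORTED_SCHEDULERS[edition]:
        raise ValueError(f"{edition} does not offer the {scheduler} scheduler")
    return edition
```

For example, `choose_edition("OpenPBS", False)` returns `"Standard Edition"`, while asking for OpenPBS together with a managed management node raises `ValueError`, because the Managed Edition row lists Slurm only.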
Manage clusters
After creating a cluster, use the following capabilities to operate and scale it.
Cluster configuration: View and update cluster settings, installed software, custom services, and shared storage.
Resource scaling: Add or remove nodes, set auto scaling policies, and define preset node pools to reduce job wait times.
Queue management: Create and configure job queues to control resource allocation across workloads.
User management: Add cluster users and configure access permissions.
Job scheduling: Submit and monitor jobs, or stop a running job.
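As an illustration of job scheduling on a cluster that runs Slurm, the sketch below generates the text of a Slurm batch script. It assumes you have shell access to a node where the standard Slurm client commands are available. The helper `render_sbatch_script`, the queue name, and the program path are hypothetical, and E-HPC may offer other ways to submit jobs that are not shown here. Only standard `sbatch` options are used.

```python
import shlex
from typing import Optional, Sequence


def render_sbatch_script(
    job_name: str,
    nodes: int,
    tasks_per_node: int,
    time_limit: str,
    command: Sequence[str],
    partition: Optional[str] = None,
) -> str:
    """Build the text of a Slurm batch script (hypothetical helper)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={time_limit}",  # format HH:MM:SS
    ]
    if partition is not None:
        # A Slurm partition plays the role of a job queue.
        lines.append(f"#SBATCH --partition={partition}")
    lines.append("")
    # Launch the program on the allocated resources, quoting arguments safely.
    lines.append("srun " + shlex.join(list(command)))
    return "\n".join(lines) + "\n"
```

A possible workflow under these assumptions: write the output of `render_sbatch_script("demo", 2, 4, "00:10:00", ["./a.out"], partition="comp")` to a file such as `job.sh`, submit it with `sbatch job.sh`, check its state with `squeue`, and stop it with `scancel` followed by the job ID.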