Cluster overview


An Elastic High Performance Computing (E-HPC) cluster is a group of Elastic Compute Service (ECS) instances that together deliver high-performance computing capabilities. Compared with standalone ECS instances, an E-HPC cluster offers more compute power and higher reliability, along with better scalability and availability.

Cluster types

E-HPC offers two cluster types on Alibaba Cloud public cloud. Both use ECS instances as compute nodes.

Edition | How it works
Standard Edition | E-HPC provisions the cluster and installs the scheduler (Slurm or OpenPBS) and service components. You maintain service availability.
Managed Edition | E-HPC provisions the cluster and takes over management node operations. The scheduler is Slurm.

Choose a cluster type

If you want to... | Use
Build an HPC cluster from scratch | Standard Edition
Have E-HPC manage the nodes | Managed Edition

Manage clusters

After creating a cluster, use the following capabilities to operate and scale it.

Cluster configuration: View and update cluster settings, installed software, custom services, and shared storage.

Resource scaling: Add or remove nodes, set auto scaling policies, and define preset node pools to reduce job wait times.

Queue management: Create and configure job queues to control resource allocation across workloads.

User management: Add cluster users and configure access permissions.

Job scheduling: Submit and monitor jobs, or stop a running job.
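
On a cluster that uses Slurm, the job scheduling operations listed above correspond to the standard Slurm commands `sbatch` (submit), `squeue` (monitor), and `scancel` (stop). The following Python sketch only builds the batch script text and the command lines; it does not call E-HPC or Slurm. The helper names, the job parameters, and the queue name `comp` are hypothetical, and the sketch assumes that an E-HPC queue corresponds to a Slurm partition.

```python
def build_batch_script(job_name, nodes, tasks_per_node, time_limit, command, partition=None):
    """Return the text of a Slurm batch script (hypothetical helper)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={time_limit}",
    ]
    if partition is not None:
        # Assumption: the E-HPC queue name is used as the Slurm partition name.
        lines.append(f"#SBATCH --partition={partition}")
    # srun launches the command on the resources allocated to the job.
    lines.append(f"srun {command}")
    return "\n".join(lines) + "\n"


def submit_command(script_path):
    """Command line that submits the script; --parsable prints only the job ID."""
    return ["sbatch", "--parsable", script_path]


def monitor_command(job_id):
    """Command line that shows the queue state of one job."""
    return ["squeue", f"--jobs={job_id}"]


def cancel_command(job_id):
    """Command line that stops a pending or running job."""
    return ["scancel", str(job_id)]


if __name__ == "__main__":
    # Illustrative values only; adjust to the cluster's actual queues and software.
    print(build_batch_script("demo", 2, 4, "00:10:00", "./my_app", partition="comp"))
    print(" ".join(submit_command("job.sh")))
    print(" ".join(monitor_command(12345)))
    print(" ".join(cancel_command(12345)))
```

On a login node, the returned lists could be passed to `subprocess.run` to execute the commands. Keeping command construction separate from execution makes the sketch easy to check without a cluster.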