E-HPC and Workbench each depend on a specific set of network ports to function. This page lists the required security group rules for both, along with security recommendations and troubleshooting guidance.
E-HPC security group rules
E-HPC uses the following ports for cluster communication, job scheduling, file sharing, monitoring, and web access. Configure these rules in the security group attached to your cluster nodes.
| Service | Port/Protocol | Description | Security recommendation |
|---|---|---|---|
| SSH | 22/TCP | Remote access to E-HPC instances | Restrict source to specific IP addresses or CIDR blocks |
| NFS | 111/TCP, 2049/TCP, 20048/UDP, 32765–32768/UDP | Shared file system for data sharing between nodes | Restrict source to internal networks or trusted external networks |
| Slurm (HPC scheduler) | 6817–6819/TCP | Cluster management and job scheduling | Restrict to cluster nodes only |
| PBS (HPC scheduler) | 15001–15003/TCP, 17001/TCP | Communication with PBS servers and job queue management | Restrict to cluster nodes only |
| Monitoring and logging (Prometheus, Fluentd) | 9090/TCP, 24224/TCP | Collecting and analyzing system performance data | Restrict to monitoring servers only |
| Web portal | 12011/TCP | Daily job management: job submission, job query, and data management | Confirm that the cluster's security group allows inbound access on this port |
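As a rough illustration of how the table maps to actual rules, the sketch below prints one Alibaba Cloud CLI command per TCP port range. It is a dry run: nothing is executed against your account. The region, security group ID, and source CIDR are hypothetical placeholders, and the flag names are assumed to mirror the parameters of the ECS `AuthorizeSecurityGroup` API, so verify them against the CLI reference before running any printed command. The list covers a Slurm cluster; for PBS, swap in the PBS port ranges from the table.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not execute) one aliyun CLI command per E-HPC TCP rule.
# Assumptions: the aliyun CLI is installed and configured, and the three values
# below are placeholders that you replace with your own.

REGION="cn-hangzhou"            # placeholder region
SG_ID="sg-example"              # hypothetical security group ID
SOURCE_CIDR="192.168.0.0/16"    # placeholder internal CIDR block

# Print an AuthorizeSecurityGroup call for one TCP port range ("start/end").
print_rule_cmd() {
  local port_range="$1" source_cidr="$2"
  echo "aliyun ecs AuthorizeSecurityGroup --RegionId ${REGION}" \
       "--SecurityGroupId ${SG_ID} --IpProtocol tcp" \
       "--PortRange ${port_range} --SourceCidrIp ${source_cidr}"
}

# TCP port ranges taken from the E-HPC table above (Slurm variant).
for range in 22/22 111/111 2049/2049 6817/6819 9090/9090 24224/24224 12011/12011; do
  print_rule_cmd "$range" "$SOURCE_CIDR"
done
```

Printing the commands first makes it easy to review each source range before anything changes. In practice SSH and the monitoring ports would usually get a narrower source than the shared placeholder used here.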
Workbench security group rules
Workbench is a browser-based remote connection tool provided by Alibaba Cloud. It lets you connect to Elastic Compute Service (ECS) instances without installing additional software. For background on Workbench and its network requirements, see Security group settings related to Workbench.
The following table lists the ports required by tools available through Workbench.
| Service | Port/Protocol | Description | Security recommendation |
|---|---|---|---|
| RDP (Windows) | 3389/TCP | Remote desktop access and control for Windows instances | Restrict to users who need remote access; review permissions regularly |
| Jupyter Notebook | 8888/TCP | Interactive data analysis and visualization | Restrict to specific IP addresses or CIDR blocks |
| VS Code Server | 3000/TCP | Remote code editing and development | Restrict to internal access or trusted external networks |
| Git | 22/TCP | Version control and code repository management | Restrict to authorized users to protect code integrity |
| Docker | 2375/TCP, 2376/TCP | Containerized application management and deployment | Restrict to internal access only |
Best practices
Apply least privilege. Open only the ports required for your workload. Start with the minimum set listed above and expand only when a specific service requires it.
Restrict SSH to known sources. Limit SSH (port 22) to specific IP addresses or CIDR blocks rather than allowing access from 0.0.0.0/0. This is the highest-impact rule to lock down on any internet-facing cluster node.
Scope scheduler ports to cluster nodes. Slurm (6817–6819) and PBS (15001–15003, 17001) ports should be reachable only from other nodes in the same cluster. Where possible, authorize the cluster's security group as the source instead of a CIDR block: a security group reference matches only the instances in that group, whereas an IP range can also cover unrelated hosts.
Isolate monitoring access. Restrict Prometheus (9090) and Fluentd (24224) to your monitoring server's IP or security group. These ports expose internal performance data and must not be internet-accessible.
Restrict Docker daemon ports. Docker ports 2375 and 2376 expose the Docker daemon API. Keep access strictly internal; exposing these ports to the internet creates a critical security risk.
Review Workbench permissions regularly. RDP (3389) and VS Code Server (3000) grant direct access to running instances. Audit who has access to these ports on a scheduled basis and revoke permissions when no longer needed.
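One way to act on these practices is to scan your existing rules for sensitive ports that are open to 0.0.0.0/0. The following is a minimal sketch: the `is_risky_rule` helper and the sample rules are hypothetical, and the port list simply mirrors the ports discussed on this page. Adapt the input to however you export your security group rules.

```shell
#!/usr/bin/env bash
# Sketch: flag rules that open a sensitive port to the whole internet.
# The port list mirrors the tables on this page; the rule data is hypothetical.

SENSITIVE_PORTS="22 2375 2376 3000 3389 6817 6818 6819 9090 15001 15002 15003 17001 24224"

# Return 0 (risky) if the port is sensitive and the source is 0.0.0.0/0.
is_risky_rule() {
  local port="$1" source_cidr="$2" p
  [ "$source_cidr" = "0.0.0.0/0" ] || return 1
  for p in $SENSITIVE_PORTS; do
    [ "$p" = "$port" ] && return 0
  done
  return 1
}

# Hypothetical rules in "port source" form, one per line.
while read -r port source_cidr; do
  if is_risky_rule "$port" "$source_cidr"; then
    echo "RISKY: port ${port} is open to ${source_cidr}"
  fi
done <<'EOF'
22 0.0.0.0/0
22 203.0.113.10/32
2375 0.0.0.0/0
12011 0.0.0.0/0
EOF
```

With the sample input, only the first and third rules are reported. The check handles single ports only; a rule that covers a port range would need to be expanded first.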
Troubleshooting
If a cluster service is unreachable, the most common cause is a missing or misconfigured security group rule. To verify connectivity on a specific port, run the following from a node that should have access:
# Test TCP connectivity to <target-ip> on <port>
nc -zv <target-ip> <port>
Replace <target-ip> with the IP address of the target node and <port> with the port number to test (for example, 6817 for Slurm).
If the connection times out or is refused, check:
- The security group attached to the target instance has an inbound rule for that port.
- The source IP or security group matches the rule's allowed source.
- No network ACLs or firewall rules at the VPC level are blocking the traffic.
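When several ports need checking, the single `nc` test can be wrapped in a loop. The sketch below assumes `nc` is installed and supports the `-w` timeout option; the `check_port` helper, the `TARGET` variable, and the Slurm-oriented port list are illustrative choices, not part of E-HPC.

```shell
#!/usr/bin/env bash
# Sketch: test several TCP ports on one target with a per-port timeout.
# Assumes nc (netcat) is installed, as in the command above.

# Print "open" or "unreachable" for one host/port pair.
check_port() {
  local host="$1" port="$2"
  if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
    echo "open"
  else
    echo "unreachable"
  fi
}

# Placeholder target: set TARGET to the IP address of the node to test.
TARGET="${TARGET:-127.0.0.1}"
for port in 22 6817 6818 6819; do
  echo "${TARGET}:${port} $(check_port "$TARGET" "$port")"
done
```

For example, `TARGET=<target-ip> bash check_ports.sh` tests SSH and the Slurm ports on one node, where `check_ports.sh` is a hypothetical file name for the script. The helper does not distinguish a timeout from a refused connection; as a rule of thumb, a timeout more often points to filtered traffic, while a refusal often means the host was reached but nothing is listening on that port.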
What's next
To add, modify, query, delete, import, or export security group rules, see Manage security group rules.