Security group

E-HPC and Workbench each depend on a specific set of network ports to function. This page lists the required security group rules for both, along with security recommendations and troubleshooting guidance.

E-HPC security group rules

E-HPC uses the following ports for cluster communication, job scheduling, file sharing, monitoring, and web access. Configure these rules in the security group attached to your cluster nodes.

| Service | Port/Protocol | Description | Security recommendation |
| --- | --- | --- | --- |
| SSH | 22/TCP | Remote access to E-HPC instances | Restrict source to specific IP addresses or CIDR blocks |
| NFS | 111/TCP, 2049/TCP, 20048/UDP, 32765–32768/UDP | Shared file system for data sharing between nodes | Restrict source to internal networks or trusted external networks |
| Slurm (HPC scheduler) | 6817–6819/TCP | Cluster management and job scheduling | Restrict to cluster nodes only |
| PBS (HPC scheduler) | 15001–15003/TCP, 17001/TCP | Communication with PBS servers and job queue management | Restrict to within the cluster |
| Monitoring and logging (Prometheus, Fluentd) | 9090/TCP, 24224/TCP | Collecting and analyzing system performance data | Restrict to monitoring servers only |
| Web portal | 12011/TCP | Daily job management: job submission, job query, and data management | Confirm that the cluster's security group allows inbound access on this port |

Workbench security group rules

Workbench is a browser-based remote connection tool provided by Alibaba Cloud. It lets you connect to Elastic Compute Service (ECS) instances without installing additional software. For background on Workbench and its network requirements, see Security group settings related to Workbench.

The following table lists the ports required by tools available through Workbench.

| Service | Port/Protocol | Description | Security recommendation |
| --- | --- | --- | --- |
| RDP (Windows) | 3389/TCP | Remote desktop access and control for Windows instances | Restrict to users who need remote access; review permissions regularly |
| Jupyter Notebook | 8888/TCP | Interactive data analysis and visualization | Restrict to specific IP addresses or CIDR blocks |
| VS Code Server | 3000/TCP | Remote code editing and development | Restrict to internal access or trusted external networks |
| Git (over SSH) | 22/TCP | Version control and code repository management | Restrict to authorized users to protect code integrity |
| Docker | 2375/TCP, 2376/TCP | Containerized application management and deployment | Restrict to internal access only |

Best practices

Apply least privilege. Open only the ports required for your workload. Start with the minimum set listed above and expand only when a specific service requires it.

Restrict SSH to known sources. Limit SSH (port 22) to specific IP addresses or CIDR blocks rather than allowing access from 0.0.0.0/0. This is the highest-impact rule to lock down on any internet-facing cluster node.
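As a rough sketch of the SSH recommendation, the following shell function wraps the ECS AuthorizeSecurityGroup operation through the aliyun CLI. The region, security group ID, and CIDR block are placeholders, and the `ALIYUN_CLI` override is a hypothetical convenience for dry runs. Check the parameter names against the current ECS API reference before relying on them.

```shell
#!/bin/sh
# Sketch only: allow inbound SSH (22/TCP) from a single trusted CIDR block.
# Assumptions: the aliyun CLI is installed and configured with credentials;
# the region and security group ID below are placeholders, not real values.
REGION_ID="${REGION_ID:-cn-hangzhou}"
SECURITY_GROUP_ID="${SECURITY_GROUP_ID:-sg-example}"

# ALIYUN_CLI is a hypothetical override: set ALIYUN_CLI=echo to print the
# request instead of sending it.
allow_ssh_from() {
  # $1: trusted source CIDR block, for example 203.0.113.0/24
  ${ALIYUN_CLI:-aliyun} ecs AuthorizeSecurityGroup \
    --RegionId "$REGION_ID" \
    --SecurityGroupId "$SECURITY_GROUP_ID" \
    --IpProtocol tcp \
    --PortRange 22/22 \
    --SourceCidrIp "$1"
}
```

Running `ALIYUN_CLI=echo allow_ssh_from 203.0.113.0/24` prints the request that would be sent, which is a convenient way to review the rule before applying it.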

Scope scheduler ports to cluster nodes. Slurm (6817–6819) and PBS (15001–15003, 17001) ports should be reachable only from other nodes in the same cluster. Where possible, use another security group rather than a CIDR block as the rule's source: the rule then applies to exactly the instances in that group and stays accurate as nodes are added or removed, whereas an IP range can also admit unrelated hosts in the same subnet.
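A rule that uses a security group as its source might look like the sketch below, which again goes through the aliyun CLI and the ECS AuthorizeSecurityGroup operation. The group IDs, the region, and the `ALIYUN_CLI` dry-run override are assumptions for illustration, and whether group-to-group rules are available can depend on the security group type, so verify this for your environment.

```shell
#!/bin/sh
# Sketch only: open a scheduler port range to members of another security
# group instead of to a CIDR block.
# Assumptions: the aliyun CLI is installed and configured; all IDs below are
# placeholders.
REGION_ID="${REGION_ID:-cn-hangzhou}"
TARGET_SG_ID="${TARGET_SG_ID:-sg-target}"   # group attached to the scheduler nodes
SOURCE_SG_ID="${SOURCE_SG_ID:-sg-source}"   # group whose members may connect

# ALIYUN_CLI is a hypothetical override: set ALIYUN_CLI=echo for a dry run.
allow_from_group() {
  # $1: port range in start/end form, for example 6817/6819
  ${ALIYUN_CLI:-aliyun} ecs AuthorizeSecurityGroup \
    --RegionId "$REGION_ID" \
    --SecurityGroupId "$TARGET_SG_ID" \
    --IpProtocol tcp \
    --PortRange "$1" \
    --SourceGroupId "$SOURCE_SG_ID"
}
```

For Slurm this would be called as `allow_from_group 6817/6819`. For PBS it would be called twice, once with `15001/15003` and once with `17001/17001`.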

Isolate monitoring access. Restrict Prometheus (9090) and Fluentd (24224) to your monitoring server's IP or security group. These ports expose internal performance data and must not be internet-accessible.

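Restrict Docker daemon ports. Ports 2375 and 2376 expose the Docker daemon API, by convention unencrypted on 2375 and TLS-protected on 2376. Anyone who can reach the API can start containers and effectively control the host, so keep access strictly internal and never expose these ports to the internet.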

Review Workbench permissions regularly. RDP (3389) and VS Code Server (3000) grant direct access to running instances. Audit on a regular schedule who can reach these ports, and revoke access that is no longer needed.
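The recommendations in this section can be spot-checked with a small script. The sketch below flags any rule that opens one of the sensitive ports named above to 0.0.0.0/0. The input format (one `<port> <source-cidr>` pair per line) is hypothetical. In practice you would first export your rules and reduce them to this shape.

```shell
#!/bin/sh
# Sketch only: flag rules that open a sensitive port to the whole internet.
# Input (hypothetical format): one rule per line as "<port> <source-cidr>".
# The port list mirrors the ports named in the best practices above.
SENSITIVE_PORTS="22 2375 2376 3000 3389 9090 24224"

flag_open_rules() {
  while read -r port src; do
    # Only rules whose source is the entire IPv4 internet are of interest.
    [ "$src" = "0.0.0.0/0" ] || continue
    for p in $SENSITIVE_PORTS; do
      if [ "$port" = "$p" ]; then
        echo "WARN: port $port is open to 0.0.0.0/0"
      fi
    done
  done
}
```

For example, `printf '22 0.0.0.0/0\n22 10.0.0.0/8\n' | flag_open_rules` warns about the first rule only, because the second is limited to a private range.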

Troubleshooting

If a cluster service is unreachable, the most common cause is a missing or misconfigured security group rule. To verify connectivity on a specific port, run the following from a node that should have access:

# Test TCP connectivity to <target-ip> on <port>
nc -zv <target-ip> <port>

Replace <target-ip> with the IP address of the target node and <port> with the port number to test (for example, 6817 for Slurm).

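A timeout usually means that a security group or network ACL is silently dropping the traffic, whereas a refused connection usually means that the packets reached the node but no service is listening on that port. If the connection times out or is refused, check: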

  1. The security group attached to the target instance has an inbound rule for that port.

  2. The source IP or security group matches the rule's allowed source.

  3. No network ACLs or firewall rules at the VPC level are blocking the traffic.
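When several ports need checking at once, the single `nc` probe can be wrapped in a loop. The following is a minimal sketch that assumes an `nc` build supporting the `-z` and `-w` options. The function name and the `NC_CMD` override are hypothetical, and the probe covers TCP ports only.

```shell
#!/bin/sh
# Sketch only: probe several TCP ports on one target and report each result.
# NC_CMD is a hypothetical override for the probe command (defaults to nc).
check_ports() {
  # $1: target IP address; remaining arguments: TCP ports to test
  target="$1"
  shift
  for port in "$@"; do
    # -z: connect without sending data; -w 3: give up after 3 seconds
    if ${NC_CMD:-nc} -z -w 3 "$target" "$port" >/dev/null 2>&1; then
      echo "$port reachable"
    else
      echo "$port unreachable"
    fi
  done
}
```

For example, `check_ports <target-ip> 6817 6818 6819` tests the three Slurm ports in one pass and prints one line per port.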

What's next

To add, modify, query, delete, import, or export security group rules, see Manage security group rules.