Container network and node planning

This topic describes how to plan network configurations and node specifications when you create a cluster.

Network plugins

When you create a cluster, you must select a network plugin in the network configuration section. The Flannel and Terway network plugins are supported.

  • Flannel: This option uses the simple and stable community Flannel Container Network Interface (CNI) plugin. It works with the high-speed Alibaba Cloud Virtual Private Cloud (VPC) network to provide a high-performance and stable container network. However, it provides only basic features and does not support standard Kubernetes network policies.

  • Terway: This is a network plugin developed by the ACK team. It assigns Alibaba Cloud Elastic Network Interfaces (ENIs) to containers. It supports standard Kubernetes network policies to define access policies between containers and also supports bandwidth throttling for individual containers.

Choose a plugin based on your business needs. If you do not need network policies, select the Flannel network plugin. Otherwise, select the Terway network plugin. For more information, see How to use the Terway network plugin.

Terway pod addresses

Size the Terway pod address space for the number of pods you expect to run. The pod address range must not overlap with node addresses or with any other private network address range.
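The overlap and sizing checks described above can be sketched with Python's standard `ipaddress` module. This is a minimal illustration, not an ACK tool: the function names and the CIDR values are hypothetical, and `num_addresses` counts every address in the block, so the number actually usable for pods may be lower.

```python
import ipaddress


def find_conflicts(pod_cidr, other_cidrs):
    """Return the entries of other_cidrs that overlap the pod CIDR block."""
    pod_net = ipaddress.ip_network(pod_cidr)
    return [c for c in other_cidrs if pod_net.overlaps(ipaddress.ip_network(c))]


def has_capacity(pod_cidr, expected_pods):
    """Check that the block holds at least expected_pods addresses.

    Counts all addresses in the block; reserved addresses are not subtracted.
    """
    return ipaddress.ip_network(pod_cidr).num_addresses >= expected_pods


# Hypothetical ranges: one node range and one other private range.
in_use = ["192.168.0.0/24", "172.20.128.0/20"]
print(find_conflicts("172.20.0.0/16", in_use))  # ['172.20.128.0/20']
print(has_capacity("172.20.0.0/16", 5000))      # True
```

A non-empty result from `find_conflicts` means the candidate pod range should be changed before the cluster is created.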

Node specifications

In practice, the ratio of pod size to node size is typically between 1:4 and 1:16. For example, a typical 4-core Java application would run on a node with 16 to 64 cores. If you use the Terway network plugin, select newer-generation Elastic Compute Service (ECS) instance types with higher specifications, because the maximum number of pods that a single node can support depends on the number of ENIs on that node.

  • Maximum number of pods for shared ENIs = (Number of ENIs - 1) × Number of private IP addresses per ENI

  • Maximum number of pods for exclusive ENIs = Number of ENIs - 1

In most cases, a node can support more than 20 pods, so this limit is rarely a concern.
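The two formulas above translate directly into code. The sketch below uses hypothetical function names and a made-up instance with 4 ENIs and 10 private IP addresses per ENI. Check the real limits for your ECS instance type before relying on the result.

```python
def max_pods_shared_eni(eni_count, ips_per_eni):
    """Shared ENI mode: (number of ENIs - 1) x private IP addresses per ENI."""
    if eni_count < 1 or ips_per_eni < 1:
        raise ValueError("eni_count and ips_per_eni must be at least 1")
    return (eni_count - 1) * ips_per_eni


def max_pods_exclusive_eni(eni_count):
    """Exclusive ENI mode: number of ENIs - 1."""
    if eni_count < 1:
        raise ValueError("eni_count must be at least 1")
    return eni_count - 1


# Hypothetical instance type: 4 ENIs, 10 private IP addresses per ENI.
print(max_pods_shared_eni(4, 10))   # 30
print(max_pods_exclusive_eni(4))    # 3
```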

Overall, consider two main factors when you select worker node specifications:

  • Determine the total number of cores your cluster requires for daily use and the share of capacity you can afford to lose to a failure. For example, assume your cluster requires 160 cores and can tolerate losing 10% of them. In this case, select at least ten 16-core ECS instances and ensure the peak load does not exceed 160 × 90% = 144 cores. If you can tolerate losing 20%, select at least five 32-core ECS instances and ensure the peak load does not exceed 160 × 80% = 128 cores. With either configuration, if one ECS instance fails, the remaining instances can still support your services.

  • Determine the required CPU-to-memory ratio. For memory-intensive applications, such as Java applications, consider using instance types with a 1:8 ratio.
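The sizing arithmetic in the first factor can be written as a small helper. This is a sketch under stated assumptions: `plan_workers` is a hypothetical name, the tolerated loss is given as a whole-number percentage, and the result is the largest instance size whose failure stays within that percentage. Real ECS instance types come in fixed sizes, so round down to the nearest available one.

```python
import math


def plan_workers(required_cores, tolerated_loss_pct):
    """Size worker nodes so that losing one instance stays within tolerance."""
    # Largest instance whose failure removes at most tolerated_loss_pct of capacity.
    max_instance_cores = required_cores * tolerated_loss_pct // 100
    if max_instance_cores < 1:
        raise ValueError("tolerance is too small for this core count")
    # Minimum number of instances of that size needed to reach required_cores.
    instance_count = math.ceil(required_cores / max_instance_cores)
    # Peak load that the remaining capacity can still carry after one failure.
    peak_load_cores = required_cores * (100 - tolerated_loss_pct) // 100
    return {
        "instance_cores": max_instance_cores,
        "instance_count": instance_count,
        "peak_load_cores": peak_load_cores,
    }


print(plan_workers(160, 10))
# {'instance_cores': 16, 'instance_count': 10, 'peak_load_cores': 144}
print(plan_workers(160, 20))
# {'instance_cores': 32, 'instance_count': 5, 'peak_load_cores': 128}
```

Both calls reproduce the figures in the example above: ten 16-core instances with a 144-core peak, and five 32-core instances with a 128-core peak.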

Large ECS instances offer the following advantages:

  • Higher network bandwidth improves resource utilization for high-bandwidth applications.

  • More containers can communicate within a single ECS instance, which reduces network traffic.

  • Image pulling is more efficient. An image is pulled only once and can be used by multiple containers on the same large instance. With smaller ECS instances, the same image must be pulled on more nodes. When the cluster needs to scale out, these repeated pulls take more time and slow the response.

For more information about selecting instance types, see Select ECS instance types and Instance families.