Using a custom image with pre-installed software packages can significantly reduce the startup time for cloud nodes by minimizing package download time.
Prerequisites
-
You have created a registered cluster and connected a self-managed Kubernetes cluster from your on-premises data center to the registered cluster over a private network. For more information, see Create an ACK One registered cluster.
-
The network of your self-managed Kubernetes cluster in an on-premises data center is connected to the Virtual Private Cloud (VPC) that the registered cluster uses. For more information, see Use scenario-based networking to connect multiple VPCs.
-
You have activated Object Storage Service (OSS) and created a bucket. For more information, see Activate OSS and Create a bucket.
-
You have connected to the registered cluster by using kubectl. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Procedure
-
This topic uses CentOS 7.9 and a Kubernetes v1.28.3 cluster deployed from binaries as an example of how to create a custom image for an elastic node pool.
-
If you already have a custom image, you can skip to Step 3.
Step 1: Create a node pool and add a node
-
Select an OSS bucket, create a file named join-ecs-node.sh with the following content, and upload the file to the bucket.
    echo "The node providerid is $ALIBABA_CLOUD_PROVIDER_ID"
    echo "The node name is $ALIBABA_CLOUD_NODE_NAME"
    echo "The node labels are $ALIBABA_CLOUD_LABELS"
    echo "The node taints are $ALIBABA_CLOUD_TAINTS"
-
Obtain the URL of the join-ecs-node.sh file (you can use a signed URL), and then modify the custom script configuration in the cluster.
-
Run the following command to edit the ack-agent-config ConfigMap:
    kubectl edit cm ack-agent-config -n kube-system
-
Modify the addNodeScriptPath field so that it points to the uploaded script. The updated configuration is as follows:
    apiVersion: v1
    data:
      addNodeScriptPath: https://kubelet-****.oss-cn-hangzhou-internal.aliyuncs.com/join-ecs-node.sh
    kind: ConfigMap
    metadata:
      name: ack-agent-config
      namespace: kube-system
-
-
Create a node pool named cloud-test and set Expected Nodes to 1. For more information, see Create and manage a node pool.
Important: The new node will have a Failed status because it lacks the required software packages. You must log on to this node to initialize it, so make sure that it is accessible over SSH.
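The ConfigMap change described above can also be applied without an interactive editor. The following is a minimal sketch, assuming that a merge patch on the addNodeScriptPath key is acceptable for your cluster. The build_patch helper and the example-bucket URL are hypothetical names used only for illustration, and the kubectl patch command is shown as a comment rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: build a JSON merge patch that sets addNodeScriptPath.
# build_patch and SCRIPT_URL are illustrative names, not part of ACK.

build_patch() {
  # Print a JSON merge patch for the given script URL.
  # Assumes the URL contains no double quotes or backslashes.
  local url="$1"
  printf '{"data":{"addNodeScriptPath":"%s"}}\n' "$url"
}

# Placeholder URL: replace it with the (signed) URL of your own object.
SCRIPT_URL="https://example-bucket.oss-cn-hangzhou-internal.aliyuncs.com/join-ecs-node.sh"
build_patch "$SCRIPT_URL"

# To apply the patch (not executed here):
#   kubectl patch configmap ack-agent-config -n kube-system \
#     --type merge -p "$(build_patch "$SCRIPT_URL")"
```

Printing the patch first makes it easy to review the URL before the cluster configuration changes.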
Step 2: Configure node and export custom image
-
Log on to the node and run the following command to view the node information:
    cat /var/log/acs/init.log
Expected output:
    The node providerid is cn-zhangjiakou.i-xxxxx
    The node name is cn-zhangjiakou.192.168.66.xx
    The node labels are alibabacloud.com/nodepool-id=npf9fbxxxxxx,ack.aliyun.com=c22b1a2e122ff4fde85117de4xxxxxx,alibabacloud.com/instance-id=i-8vb7m7nt3dxxxxxxx,alibabacloud.com/external=true
    The node taints are
This output confirms that the custom script can obtain the Alibaba Cloud node information. Record this information for the kubelet startup parameters.
-
Run the following commands to configure the base environment:
    # Install tool packages.
    yum update -y && yum -y install wget psmisc vim net-tools nfs-utils telnet yum-utils device-mapper-persistent-data lvm2 git tar curl
    # Disable the firewall.
    systemctl disable --now firewalld
    # Disable SELinux.
    setenforce 0
    sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
    # Disable swap partitions.
    sed -ri 's/.*swap.*/#&/' /etc/fstab
    swapoff -a && sysctl -w vm.swappiness=0
    # Configure the network.
    systemctl disable --now NetworkManager
    systemctl start network && systemctl enable network
    # Synchronize the time.
    ln -svf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
    yum install ntpdate -y
    ntpdate ntp.aliyun.com
    # Configure ulimit.
    ulimit -SHn 65535
    cat >> /etc/security/limits.conf <<EOF
    * soft nofile 655360
    * hard nofile 131072
    * soft nproc 655350
    * hard nproc 655350
    * soft memlock unlimited
    * hard memlock unlimited
    EOF
Note: After you complete the preceding environment configuration, upgrade the kernel to version 4.18 or later and install ipvsadm.
-
Install containerd.
-
Run the following commands to download and unpack the CNI plugins and containerd packages:
    wget https://github.com/containernetworking/plugins/releases/download/v1.3.0/cni-plugins-linux-amd64-v1.3.0.tgz
    mkdir -p /etc/cni/net.d /opt/cni/bin
    # Decompress the CNI binary package.
    tar xf cni-plugins-linux-amd64-v*.tgz -C /opt/cni/bin/
    wget https://github.com/containerd/containerd/releases/download/v1.7.8/containerd-1.7.8-linux-amd64.tar.gz
    # The release archive unpacks to bin/, so extract it under /usr/local
    # to match the /usr/local/bin/containerd path used by the service file.
    tar -xzf containerd-*-linux-amd64.tar.gz -C /usr/local
-
Run the following command to create the service startup configuration:
    cat > /etc/systemd/system/containerd.service <<EOF
    [Unit]
    Description=containerd container runtime
    Documentation=https://containerd.io
    After=network.target local-fs.target
    [Service]
    ExecStartPre=-/sbin/modprobe overlay
    ExecStart=/usr/local/bin/containerd
    Type=notify
    Delegate=yes
    KillMode=process
    Restart=always
    RestartSec=5
    LimitNPROC=infinity
    LimitCORE=infinity
    LimitNOFILE=infinity
    TasksMax=infinity
    OOMScoreAdjust=-999
    [Install]
    WantedBy=multi-user.target
    EOF
-
Run the following command to configure the modules required by containerd:
    cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
    overlay
    br_netfilter
    EOF
    systemctl restart systemd-modules-load.service
-
Run the following command to configure the kernel parameters required by containerd:
    cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    EOF
    # Load the kernel parameters.
    sysctl --system
-
Run the following commands to create a configuration file for containerd:
    mkdir -p /etc/containerd
    containerd config default | tee /etc/containerd/config.toml
    # Modify the containerd configuration file.
    sed -i "s#SystemdCgroup\ \=\ false#SystemdCgroup\ \=\ true#g" /etc/containerd/config.toml
    cat /etc/containerd/config.toml | grep SystemdCgroup
    sed -i "s#registry.k8s.io#m.daocloud.io/registry.k8s.io#g" /etc/containerd/config.toml
    cat /etc/containerd/config.toml | grep sandbox_image
    sed -i "s#config_path\ \=\ \"\"#config_path\ \=\ \"/etc/containerd/certs.d\"#g" /etc/containerd/config.toml
    cat /etc/containerd/config.toml | grep certs.d
    # Configure a registry mirror.
    mkdir /etc/containerd/certs.d/docker.io -pv
    cat > /etc/containerd/certs.d/docker.io/hosts.toml << EOF
    server = "https://docker.io"
    [host."https://hub-mirror.c.163.com"]
      capabilities = ["pull", "resolve"]
    EOF
-
Run the following commands to enable containerd to start on system startup:
    # Reload systemd units. Required after creating or modifying unit files (for example, .service or .socket).
    systemctl daemon-reload
    systemctl enable --now containerd.service
    systemctl start containerd.service
    systemctl status containerd.service
-
Run the following commands to configure crictl:
    wget https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.28.0/crictl-v1.28.0-linux-amd64.tar.gz
    tar xf crictl-v*-linux-amd64.tar.gz -C /usr/bin/
    # Generate the configuration file.
    cat > /etc/crictl.yaml <<EOF
    runtime-endpoint: unix:///run/containerd/containerd.sock
    image-endpoint: unix:///run/containerd/containerd.sock
    timeout: 10
    debug: false
    EOF
    # Test the configuration.
    systemctl restart containerd
    crictl info
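The sed commands that edit config.toml succeed silently even when a pattern does not match, for example if a different containerd version generates a different default file. The following is a minimal sketch of a check that fails loudly instead of relying on reading the grep output by eye. The check_containerd_config function name is hypothetical, and the demonstration runs against a temporary sample file rather than the real configuration.

```shell
#!/usr/bin/env bash
# Sketch: verify the two config.toml edits that the sed commands are meant to make.
# check_containerd_config is an illustrative helper name.

check_containerd_config() {
  # $1: path to a containerd config.toml. Returns non-zero if a setting is missing.
  local file="$1"
  local status=0
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$file" \
    || { echo "SystemdCgroup is not set to true in $file" >&2; status=1; }
  grep -Eq '^[[:space:]]*config_path[[:space:]]*=[[:space:]]*"/etc/containerd/certs\.d"' "$file" \
    || { echo "config_path does not point to /etc/containerd/certs.d in $file" >&2; status=1; }
  return "$status"
}

# Demonstration on a temporary sample file.
sample="$(mktemp)"
printf '%s\n' '    SystemdCgroup = true' '  config_path = "/etc/containerd/certs.d"' > "$sample"
if check_containerd_config "$sample"; then
  echo "sample config passes both checks"
fi
rm -f "$sample"

# On the node (not executed here):
#   check_containerd_config /etc/containerd/config.toml
```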
-
-
Install kubelet and kube-proxy.
-
Obtain the binary files. Log on to the master node and copy the binary files to the current node.
    scp /usr/local/bin/kube{let,-proxy} $NODEIP:/usr/local/bin/
-
Obtain the certificates. Run the following command to create a certificate storage directory on the local machine:
    mkdir -p /etc/kubernetes/pki
Log on to the master node and copy the certificates to the current node.
    for FILE in pki/ca.pem pki/ca-key.pem pki/front-proxy-ca.pem bootstrap-kubelet.kubeconfig kube-proxy.kubeconfig; do scp /etc/kubernetes/$FILE $NODE:/etc/kubernetes/${FILE}; done
-
Run the following command to configure the kubelet service. Use the Alibaba Cloud node pool values that you recorded from the init.log output earlier in this step. The heredoc expands the ${ALIBABA_CLOUD_*} variables when the file is written, so they must be set in the current shell.
    mkdir -p /var/lib/kubelet /var/log/kubernetes /etc/systemd/system/kubelet.service.d /etc/kubernetes/manifests/
    # Configure the kubelet service on all Kubernetes nodes.
    cat > /usr/lib/systemd/system/kubelet.service << EOF
    [Unit]
    Description=Kubernetes Kubelet
    Documentation=https://github.com/kubernetes/kubernetes
    After=network-online.target firewalld.service containerd.service
    Wants=network-online.target
    Requires=containerd.service
    [Service]
    ExecStart=/usr/local/bin/kubelet \\
        --node-ip=${ALIBABA_CLOUD_NODE_NAME} \\
        --hostname-override=${ALIBABA_CLOUD_NODE_NAME} \\
        --node-labels=${ALIBABA_CLOUD_LABELS} \\
        --provider-id=${ALIBABA_CLOUD_PROVIDER_ID} \\
        --register-with-taints=${ALIBABA_CLOUD_TAINTS} \\
        --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.kubeconfig \\
        --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
        --config=/etc/kubernetes/kubelet-conf.yml \\
        --container-runtime-endpoint=unix:///run/containerd/containerd.sock
    [Install]
    WantedBy=multi-user.target
    EOF
-
Run the following command to create the kubelet startup configuration file:
    cat > /etc/kubernetes/kubelet-conf.yml <<EOF
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    address: 0.0.0.0
    port: 10250
    readOnlyPort: 10255
    authentication:
      anonymous:
        enabled: false
      webhook:
        cacheTTL: 2m0s
        enabled: true
      x509:
        clientCAFile: /etc/kubernetes/pki/ca.pem
    authorization:
      mode: Webhook
      webhook:
        cacheAuthorizedTTL: 5m0s
        cacheUnauthorizedTTL: 30s
    cgroupDriver: systemd
    cgroupsPerQOS: true
    clusterDNS:
    - 10.96.0.10
    clusterDomain: cluster.local
    containerLogMaxFiles: 5
    containerLogMaxSize: 10Mi
    contentType: application/vnd.kubernetes.protobuf
    cpuCFSQuota: true
    cpuManagerPolicy: none
    cpuManagerReconcilePeriod: 10s
    enableControllerAttachDetach: true
    enableDebuggingHandlers: true
    enforceNodeAllocatable:
    - pods
    eventBurst: 10
    eventRecordQPS: 5
    evictionHard:
      imagefs.available: 15%
      memory.available: 100Mi
      nodefs.available: 10%
      nodefs.inodesFree: 5%
    evictionPressureTransitionPeriod: 5m0s
    failSwapOn: true
    fileCheckFrequency: 20s
    hairpinMode: promiscuous-bridge
    healthzBindAddress: 127.0.0.1
    healthzPort: 10248
    httpCheckFrequency: 20s
    imageGCHighThresholdPercent: 85
    imageGCLowThresholdPercent: 80
    imageMinimumGCAge: 2m0s
    iptablesDropBit: 15
    iptablesMasqueradeBit: 14
    kubeAPIBurst: 10
    kubeAPIQPS: 5
    makeIPTablesUtilChains: true
    maxOpenFiles: 1000000
    maxPods: 110
    nodeStatusUpdateFrequency: 10s
    oomScoreAdj: -999
    podPidsLimit: -1
    registryBurst: 10
    registryPullQPS: 5
    resolvConf: /etc/resolv.conf
    rotateCertificates: true
    runtimeRequestTimeout: 2m0s
    serializeImagePulls: true
    staticPodPath: /etc/kubernetes/manifests
    streamingConnectionIdleTimeout: 4h0m0s
    syncFrequency: 1m0s
    volumeStatsAggPeriod: 1m0s
    EOF
-
Run the following commands to start the kubelet:
    # Reload systemd units. Required after creating or modifying unit files (for example, .service or .socket).
    systemctl daemon-reload
    systemctl enable --now kubelet.service
    systemctl start kubelet.service
    systemctl status kubelet.service
-
Run the following command to view cluster information:
    kubectl get node
-
Log on to the master node and get the kubeconfig file required by kube-proxy.
    scp /etc/kubernetes/kube-proxy.kubeconfig $NODE:/etc/kubernetes/kube-proxy.kubeconfig
-
Run the following command to add the kube-proxy service configuration:
    cat > /usr/lib/systemd/system/kube-proxy.service << EOF
    [Unit]
    Description=Kubernetes Kube Proxy
    Documentation=https://github.com/kubernetes/kubernetes
    After=network.target
    [Service]
    ExecStart=/usr/local/bin/kube-proxy \\
        --config=/etc/kubernetes/kube-proxy.yaml \\
        --v=2
    Restart=always
    RestartSec=10s
    [Install]
    WantedBy=multi-user.target
    EOF
-
Run the following command to add the kube-proxy startup configuration:
    cat > /etc/kubernetes/kube-proxy.yaml << EOF
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 10
      contentType: application/vnd.kubernetes.protobuf
      kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
      qps: 5
    clusterCIDR: 172.16.0.0/12,fc00:2222::/112
    configSyncPeriod: 15m0s
    conntrack:
      max: null
      maxPerCore: 32768
      min: 131072
      tcpCloseWaitTimeout: 1h0m0s
      tcpEstablishedTimeout: 24h0m0s
    enableProfiling: false
    healthzBindAddress: 0.0.0.0:10256
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: 14
      minSyncPeriod: 0s
      syncPeriod: 30s
    ipvs:
      masqueradeAll: true
      minSyncPeriod: 5s
      scheduler: "rr"
      syncPeriod: 30s
    kind: KubeProxyConfiguration
    metricsBindAddress: 127.0.0.1:10249
    mode: "ipvs"
    nodePortAddresses: null
    oomScoreAdj: -999
    portRange: ""
    udpIdleTimeout: 250ms
    EOF
-
Run the following commands to start kube-proxy:
    # Reload systemd units. Required after creating or modifying unit files (for example, .service or .socket).
    systemctl daemon-reload
    systemctl enable --now kube-proxy.service
    systemctl restart kube-proxy.service
    systemctl status kube-proxy.service
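The kubelet unit file in this step is written through an unquoted heredoc, so the shell substitutes the ALIBABA_CLOUD_* variables at the moment the file is created. In an interactive SSH session those variables are not set unless you export them yourself. The following is a minimal sketch of one way to recover them from the init log, assuming each value is logged on its own line in the format shown in the expected output. The extract_node_field and load_node_vars function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: recover the node pool values from the init log and export them.
# Assumes lines of the form "The node <field> is|are <value>".

extract_node_field() {
  # $1: log file, $2: field keyword (providerid, name, labels, or taints)
  sed -nE "s/^The node $2 (is|are) ?//p" "$1" | tail -n 1
}

load_node_vars() {
  # $1: log file. Exports the four ALIBABA_CLOUD_* variables.
  export ALIBABA_CLOUD_PROVIDER_ID="$(extract_node_field "$1" providerid)"
  export ALIBABA_CLOUD_NODE_NAME="$(extract_node_field "$1" name)"
  export ALIBABA_CLOUD_LABELS="$(extract_node_field "$1" labels)"
  export ALIBABA_CLOUD_TAINTS="$(extract_node_field "$1" taints)"
}

# Demonstration on a sample log; on the node, pass /var/log/acs/init.log instead.
sample_log="$(mktemp)"
cat > "$sample_log" <<'EOF'
The node providerid is cn-zhangjiakou.i-xxxxx
The node name is cn-zhangjiakou.192.168.66.xx
The node labels are alibabacloud.com/external=true
The node taints are
EOF
load_node_vars "$sample_log"
echo "provider-id: $ALIBABA_CLOUD_PROVIDER_ID"
echo "node name:   $ALIBABA_CLOUD_NODE_NAME"
rm -f "$sample_log"
```

Run load_node_vars before the command that writes kubelet.service; otherwise the flags are written with empty values.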
-
-
Sync the node pool status.
-
Log on to the ACK console. In the left navigation pane, click Clusters.
-
On the Clusters page, click the name of your cluster. In the left navigation pane, open the Node Pools page.
-
On the Node Pools page, click Sync Node Pool. After the synchronization is complete, verify that no failure messages are displayed and the node pool is in a normal state.
-
-
Export the custom image.
-
Log in to the ECS console.
-
In the left-side navigation pane, open the Instances page.
-
Click the ID of the instance that you configured. On the Instance Details tab, click Create custom image.
-
In the left-side navigation pane, open the Images page.
-
On the Images page, confirm that the custom image that you created is listed and that its Status is Available.
-
Step 3: Modify or create node pool with custom image
If you already have a custom image and skipped Step 1 and Step 2, you must create a node pool using the custom image. For more information, see How do I create a custom image from an existing ECS instance and use the image to create nodes?.
-
On the Clusters page, click the name of your cluster. In the left navigation pane, open the Node Pools page.
-
On the Node Pools page, find the target node pool and click Edit in the Actions column. On the Advanced tab, set the node pool image to Custom Image. Then find the Custom Image setting, click Select custom image, and select the custom image that you created.
-
After the update is complete, the Operating System column for the node pool on the Node Pools page displays Custom Image and the image ID, which indicates that the image has been replaced.
Step 4: Update node init script for cloud parameters
-
The script must remove the residual kubelet certificates that are carried over in the custom image. The rm -rf /var/lib/kubelet/pki/* command in the following script does this.
-
For existing custom node pools, you must configure the download URL for the custom script as described in Step 1.
-
Create or update the join-ecs-node.sh file with the following content. Because the custom image already contains the required tools and dependencies, the custom script only needs to receive and update the Alibaba Cloud node pool parameters.
    echo "The node providerid is $ALIBABA_CLOUD_PROVIDER_ID"
    echo "The node name is $ALIBABA_CLOUD_NODE_NAME"
    echo "The node labels are $ALIBABA_CLOUD_LABELS"
    echo "The node taints are $ALIBABA_CLOUD_TAINTS"
    systemctl stop kubelet.service
    echo "Delete old kubelet pki"
    # The old node certificates must be deleted.
    rm -rf /var/lib/kubelet/pki/*
    echo "Add kubelet service config"
    # Configure the kubelet service.
    cat > /usr/lib/systemd/system/kubelet.service << EOF
    [Unit]
    Description=Kubernetes Kubelet
    Documentation=https://github.com/kubernetes/kubernetes
    After=network-online.target firewalld.service containerd.service
    Wants=network-online.target
    Requires=containerd.service
    [Service]
    ExecStart=/usr/local/bin/kubelet \\
        --node-ip=${ALIBABA_CLOUD_NODE_NAME} \\
        --hostname-override=${ALIBABA_CLOUD_NODE_NAME} \\
        --node-labels=${ALIBABA_CLOUD_LABELS} \\
        --provider-id=${ALIBABA_CLOUD_PROVIDER_ID} \\
        --register-with-taints=${ALIBABA_CLOUD_TAINTS} \\
        --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.kubeconfig \\
        --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
        --config=/etc/kubernetes/kubelet-conf.yml \\
        --container-runtime-endpoint=unix:///run/containerd/containerd.sock
    [Install]
    WantedBy=multi-user.target
    EOF
    systemctl daemon-reload
    # Start the kubelet service.
    systemctl start kubelet.service
-
Upload the updated join-ecs-node.sh script to OSS.
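Because the script writes the node pool parameters straight into the kubelet unit file, a missing variable produces a unit with an empty flag value rather than an obvious error. The following is a minimal sketch of a guard that could be placed near the top of join-ecs-node.sh. The require_env helper is a hypothetical name, and the choice to require only the provider ID and node name is an assumption based on the sample output, where the taints value is empty.

```shell
#!/usr/bin/env bash
# Sketch: fail early when required node pool variables are missing.
# require_env is an illustrative helper, not part of the ACK tooling.

require_env() {
  # Return non-zero if any named variable is unset or empty.
  local name val
  local missing=0
  for name in "$@"; do
    eval "val=\${$name:-}"
    if [ -z "$val" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Labels and taints are not checked because they may legitimately be empty.
if require_env ALIBABA_CLOUD_PROVIDER_ID ALIBABA_CLOUD_NODE_NAME; then
  echo "node pool variables are present"
else
  # In the real script, this branch would stop before kubelet.service is rewritten.
  echo "node pool variables are incomplete" >&2
fi
```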
Step 5: Scale out the node pool
-
On the Clusters page, click the name of your cluster. In the left navigation pane, open the Node Pools page.
-
On the Node Pools page, find the target node pool and use the scaling option in the Actions column to add a new node.
Verify that both nodes are in a normal state. This confirms the successful creation of the elastic node pool.
-
You can configure an auto scaling policy for the node pool. For more information, see Configure auto scaling.
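The check that both nodes are in a normal state can also be scripted. The following is a minimal sketch, assuming the default table output of kubectl get node, in which the second column is STATUS. The not_ready_nodes function name is hypothetical, and the demonstration uses sample text instead of a live cluster.

```shell
#!/usr/bin/env bash
# Sketch: list nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get node` table output on stdin.

not_ready_nodes() {
  # Skip the header row, then print the NAME of every node that is not Ready.
  # A status such as "Ready,SchedulingDisabled" is also reported.
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Demonstration with sample output; node names are placeholders.
cat <<'EOF' | not_ready_nodes
NAME                           STATUS     ROLES    AGE   VERSION
cn-zhangjiakou.192.168.66.10   Ready      <none>   2d    v1.28.3
cn-zhangjiakou.192.168.66.11   NotReady   <none>   1m    v1.28.3
EOF

# Against the cluster (not executed here):
#   kubectl get node | not_ready_nodes
```

An empty result means every listed node reports Ready.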