Rotate etcd certificates of ACK dedicated clusters

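To keep your services available and secure, and to avoid the risks of certificate leakage or key compromise, we recommend that you rotate the etcd certificates of the master nodes in an ACK dedicated cluster promptly when the system reminds you. This topic describes how to rotate the etcd certificates of the master nodes in an ACK dedicated cluster.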

Background information

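ACK dedicated clusters can be migrated to ACK Pro clusters. In an ACK Pro cluster, the etcd and Kubernetes control plane certificates are managed by Alibaba Cloud, so after an ACK dedicated cluster is migrated you no longer need to perform the rotation described below. For the migration procedure, see Hot-migrate ACK dedicated clusters to ACK Pro clusters.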

Usage notes

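  • Container Service for Kubernetes (ACK) sends expiration reminders by internal message and SMS two months before the etcd certificates expire, and displays Update ETCD Certificate on the cluster list page.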

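  • During rotation, the system restarts the control plane components on the master nodes (API server, etcd, kube-controller-manager (KCM), and kubelet) one node at a time. Long-lived connections to the API server are dropped during this period, so perform the rotation during off-peak hours. The rotation is expected to finish within 30 minutes.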

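  • If you have changed the default configuration directories of etcd or Kubernetes in the dedicated cluster, create symbolic links so that the original default paths still resolve before you rotate. Otherwise, the rotation fails.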

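  • After a manual rotation, the Container Service console still shows the Update ETCD Certificate expiration prompt. Submit a ticket to have the prompt cleared through backend configuration.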

  • If the rotation fails for any reason, submit a ticket.
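
The expiration reminder mentioned in the notes can be cross-checked directly on a master node. The following is a minimal sketch, not part of the official procedure. It assumes that openssl is installed and that the etcd server certificate is at the default path /var/lib/etcd/cert/etcd-server.pem used by the scripts in this topic. The function name cert_expires_within_days is hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) if the certificate expires within
# the given number of days. It relies on `openssl x509 -checkend <seconds>`,
# which exits non-zero when the certificate expires within that window.
# Note: an unreadable or invalid file is also reported as "expiring".
cert_expires_within_days() {
  local pem=$1 days=$2
  if openssl x509 -checkend "$((days * 86400))" -noout -in "$pem" >/dev/null 2>&1; then
    return 1   # still valid beyond the window
  fi
  return 0     # expires within the window, or is already expired
}

# Example (run on a master node):
#   if cert_expires_within_days /var/lib/etcd/cert/etcd-server.pem 60; then
#     echo "etcd server certificate expires within 60 days; plan a rotation"
#   fi
```

A 60-day window roughly matches the two-month reminder period described above.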

Scenario 1: Rotate etcd certificates before they expire

When the etcd certificates are about to expire and you are prompted to update them, you can rotate them in either of the following two ways.

Rotate etcd certificates automatically in the console

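  1. Log on to the Container Service management console. In the left-side navigation pane, click Clusters.

  2. Find the cluster whose etcd certificates are about to expire and click Update ETCD Certificate to the right of the cluster. On the Update Certificate page, click Update Certificate.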

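    Note

    Update ETCD Certificate appears to the right of a cluster only when the cluster's certificates are due to expire within about two months.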

  3. In the dialog box that appears, click OK.

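    After the certificates are updated, you can see the following:

    • The Update Certificate page shows that the update succeeded.

    • On the cluster list page, the Update ETCD Certificate prompt no longer appears to the right of the target cluster.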

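Rotate etcd certificates manually

Use scenarios

  • The etcd certificates of the dedicated cluster are about to expire.

  • The etcd certificates cannot be rotated automatically by deploying a template.

  • The etcd certificates cannot be updated in the console.

In any of these scenarios, a cluster administrator can log on to any master node and run the following scripts to rotate the etcd certificates manually.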

Note

The following scripts must be run as the root user.
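
The manual procedure depends on root privileges and on passwordless SSH between master nodes. A small preflight sketch, under stated assumptions, can confirm both before any script is run. The function names is_root and check_passwordless_ssh, the SSH_BIN variable, and the IP addresses in the example are hypothetical and not part of the official scripts.

```shell
# Hypothetical preflight helpers. SSH_BIN is an assumption that makes the
# ssh command replaceable, for example in a dry run.
SSH_BIN=${SSH_BIN:-ssh}

# Succeeds only when the current user is root.
is_root() {
  [ "$(id -u)" -eq 0 ]
}

# Prints every host that does not accept key-based root login and returns
# non-zero if any such host exists.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_passwordless_ssh() {
  local host failed=0
  for host in "$@"; do
    if ! "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 \
         -o StrictHostKeyChecking=no "root@${host}" true </dev/null >/dev/null 2>&1; then
      echo "passwordless login to ${host} is not configured"
      failed=1
    fi
  done
  return "$failed"
}

# Example (replace the addresses with the private IPs of the other master nodes):
#   is_root || echo "run the rotation scripts as root"
#   check_passwordless_ssh 192.168.0.10 192.168.0.11
```

If a host is reported, configure passwordless login for it before running the scripts. Otherwise the scripts prompt for the root password at each SSH call.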

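  1. Make sure that passwordless SSH login for the root user is configured between the master nodes of the cluster.

    From one master node, log on to any other master node over SSH. If you are prompted for a password, configure passwordless login between the master nodes as follows.

    # 1. Generate a key pair. Skip this step if a login key already exists on the node.
    ssh-keygen -t rsa
    
    # 2. Use ssh-copy-id to copy the public key to all other master nodes. Replace $(internal-ip) with the private IP address of each of the other master nodes.
    ssh-copy-id -i ~/.ssh/id_rsa.pub $(internal-ip)

    Note

    If you do not configure passwordless login, you must enter the root password when you run the scripts.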

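  2. Copy the following scripts, save them as restart-apiserver.sh and rotate-etcd.sh, and place both files in the same directory.

    Note

    The rotate-etcd.sh script tries to obtain the region from the node's metadata service and pulls the rotation image from that region. You can also pass --region xxxx when you run the script. The value is used when the metadata service cannot be reached. If the metadata service returns a different region, the script uses the metadata value.

    restart-apiserver.sh script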

    #! /bin/bash
    
    declare -x cmd
    
    k8s::wait_apiserver_ready() {
      set -e
      for i in $(seq 600); do
        if kubectl cluster-info &>/dev/null; then
          return 0
        else
          echo "wait apiserver to be ready, retry ${i}th after 1s"
          sleep 1
        fi
      done
      echo "failed to wait apiserver to be ready"
      return 1
    }
    
    function check_container_runtime() {
      if command -v dockerd &>/dev/null && ps aux | grep -q "[d]ockerd"; then
        cmd=docker
      elif command -v containerd &>/dev/null && ps aux | grep -q "[c]ontainerd"; then
        cmd=crictl
      else
        echo "Neither Dockerd nor Containerd is installed or running."
        exit 1
      fi
    }
    
    function restart_apiserver() {
      # Branch on the detected container runtime
      if [[ $cmd == "docker" ]]; then
        # Restart the kube-apiserver Pod with docker
        container_id=$(docker ps | grep kube-apiserver | awk '{print $1}' | head -n 1 )
        if [[ -n $container_id ]]; then
          echo "Restarting kube-apiserver pod using Docker: $container_id"
          docker restart "${container_id}"
        else
          echo "kube-apiserver pod not found."
        fi
      elif [[ $cmd == "crictl" ]]; then
        # Restart the kube-apiserver Pod with crictl (kubelet recreates the static Pod after the sandbox is stopped)
        pod_id=$(crictl pods --label component=kube-apiserver --latest --state=ready | grep -v "POD ID" | head -n 1 | awk '{print $1}')
        if [[ -n $pod_id ]]; then
          echo "Restarting kube-apiserver pod using crictl: $pod_id"
          crictl stopp "${pod_id}"
        else
          echo "kube-apiserver pod not found."
        fi
      else
        echo "Unsupported container runtime: $cmd"
      fi
      k8s::wait_apiserver_ready
    }
    
    check_container_runtime
    restart_apiserver
    echo "API Server restarted"

    rotate-etcd.sh script

    #!/bin/bash
    
    set -eo pipefail
    
    declare -x TARGET_TEAR
    declare -x cmd
    dir=/tmp/etcdcert
    KUBE_CERT_PATH=/etc/kubernetes/pki
    ETCD_CERT_DIR=/var/lib/etcd/cert
    ETCD_HOSTS=""
    currentDir="$PWD"
    
    # Derive the etcd member addresses from the peer certificate file names (<address>-name-<n>.pem) in $ETCD_CERT_DIR.
    function get_etcdhosts() {
      name1=$(find "$ETCD_CERT_DIR" -name '*-name-1.pem' -exec basename {} \; | sed 's/-name-1.pem//g')
      name2=$(find "$ETCD_CERT_DIR" -name '*-name-2.pem' -exec basename {} \; | sed 's/-name-2.pem//g')
      name3=$(find "$ETCD_CERT_DIR" -name '*-name-3.pem' -exec basename {} \; | sed 's/-name-3.pem//g')
    
      echo "hosts: $name1 $name2 $name3"
      ETCD_HOSTS="$name1 $name2 $name3"
    }
    
    function gencerts() {
      echo "generate ssl cert ..."
      rm -rf $dir
      mkdir -p "$dir"
    
      local hosts
      hosts=$(echo $ETCD_HOSTS | tr -s " " ",")
    
      echo "-----generate ca"
      echo '{"CN":"CA","key":{"algo":"rsa","size":2048}, "ca": {"expiry": "438000h"}}' |
        cfssl gencert -initca - | cfssljson -bare $dir/ca -
      echo '{"signing":{"default":{"expiry":"438000h","usages":["signing","key encipherment","server auth","client auth"]}}}' >$dir/ca-config.json
    
      echo "-----generate etcdserver"
      export ADDRESS=$hosts,ext1.example.com,coreos1.local,coreos1,127.0.0.1
      export NAME=etcd-server
      echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' |
        cfssl gencert -config=$dir/ca-config.json -ca=$dir/ca.pem -ca-key=$dir/ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $dir/$NAME
      export ADDRESS=
      export NAME=etcd-client
      echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' |
        cfssl gencert -config=$dir/ca-config.json -ca=$dir/ca.pem -ca-key=$dir/ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $dir/$NAME
    
      # gen peer-ca
      echo "-----generate peer certificates"
      echo '{"CN":"Peer-CA","key":{"algo":"rsa","size":2048}, "ca": {"expiry": "438000h"}}' | cfssl gencert -initca - | cfssljson -bare $dir/peer-ca -
      echo '{"signing":{"default":{"expiry":"438000h","usages":["signing","key encipherment","server auth","client auth"]}}}' >$dir/peer-ca-config.json
      i=0
      for host in $ETCD_HOSTS; do
        ((i = i + 1))
        export MEMBER=${host}-name-$i
        echo '{"CN":"'${MEMBER}'","hosts":[""],"key":{"algo":"rsa","size":2048}}' |
          cfssl gencert -ca=$dir/peer-ca.pem -ca-key=$dir/peer-ca-key.pem -config=$dir/peer-ca-config.json -profile=peer \
            -hostname="$hosts,${MEMBER}.local,${MEMBER}" - | cfssljson -bare $dir/${MEMBER}
      done
    
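      # Build the bundle CA (existing CAs plus the new CA) so that old and new certificates are both trusted during rotation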
      cat $KUBE_CERT_PATH/etcd/ca.pem >>$dir/bundle_ca.pem
      cat $ETCD_CERT_DIR/ca.pem >>$dir/bundle_ca.pem
      cat $dir/ca.pem >>$dir/bundle_ca.pem
    
      # Build the bundle peer CA (existing peer CA plus the new peer CA)
      cat $ETCD_CERT_DIR/peer-ca.pem >$dir/bundle_peer-ca.pem
      cat $dir/peer-ca.pem >>$dir/bundle_peer-ca.pem
    
      current_year=$(date +%Y)
      TARGET_TEAR=$((current_year + 50))
    
      # chown
      chown -R etcd:etcd $dir
      chmod 0644 $dir/*
    }
    
    function etcd_client_urls() {
      local etcd_hosts=()
      for ip in $ETCD_HOSTS; do
        etcd_hosts+=("https://$ip:2379")
      done
      local result=$(
        IFS=','
        echo "${etcd_hosts[*]}"
      )
      echo "$result"
    }
    
    function check_cert_files_exist() {
      REQUIRED_CERTS=("ca.pem" "etcd-server-key.pem" "etcd-server.pem" "peer-ca-key.pem" "peer-ca.pem")
      if [ ! -d "$ETCD_CERT_DIR" ]; then
        echo "Error: Directory $ETCD_CERT_DIR does not exist"
        exit 1
      fi
    
      for cert_file in "${REQUIRED_CERTS[@]}"; do
        if [ ! -f "$ETCD_CERT_DIR/$cert_file" ]; then
          echo "Error: File $ETCD_CERT_DIR/$cert_file does not exist"
          exit 1
        fi
      done
    
      echo "All required certificate files exist"
    }
    
    function check_etcd_cluster_ready() {
      local etcd_endpoints=()
      for ip in $ETCD_HOSTS; do
        etcd_endpoints+=("https://$ip:2379")
      done
      ready=0
      for i in $(seq 300); do
        for idx in "${!etcd_endpoints[@]}"; do
          endpoint="${etcd_endpoints[$idx]}"
          local health_output=$(ETCDCTL_API=3 etcdctl --cacert=/var/lib/etcd/cert/ca.pem --cert=/var/lib/etcd/cert/etcd-server.pem --key=/var/lib/etcd/cert/etcd-server-key.pem --endpoints "$endpoint" endpoint health --command-timeout=1s 2>&1)
          if echo "$health_output" | grep -q "successfully committed proposal"; then
              unset 'etcd_endpoints[$idx]'
          else
              echo "etcdctl result: ${health_output}"
              echo "$endpoint is not ready"
          fi
        done
        # shellcheck disable=SC2199
        if [[ -z "${etcd_endpoints[@]}" ]]; then
          echo "ETCD cluster is ready"
          ready=1
          break
        fi
        sleep 1
        printf "wait etcd cluster to be ready, retry %d after 1s,total 300s \n" "$i"
      done
      if [[ $ready -ne 1 ]]; then
        echo "ETCD cluster is not ready after 300 retries"
        return 1
      fi
    }
    
    function check_container_runtime() {
      if command -v dockerd &>/dev/null && ps aux | grep -q "[d]ockerd"; then
        cmd=docker
      elif command -v containerd &>/dev/null && ps aux | grep -q "[c]ontainerd"; then
        cmd=crictl
      else
        echo "Neither Dockerd nor Containerd is installed or running."
        exit 1
      fi
    }
    
    function rotate_etcd_ca() {
      for ADDR in $ETCD_HOSTS; do
        echo "update etcd CA on node $ADDR"
        scp -o StrictHostKeyChecking=no $dir/bundle_ca.pem root@$ADDR:$ETCD_CERT_DIR/ca.pem
        scp -o StrictHostKeyChecking=no $dir/bundle_ca.pem root@$ADDR:$KUBE_CERT_PATH/etcd/ca.pem
        scp -o StrictHostKeyChecking=no $dir/etcd-client.pem root@$ADDR:$KUBE_CERT_PATH/etcd/etcd-client.pem
        scp -o StrictHostKeyChecking=no $dir/etcd-client-key.pem root@$ADDR:$KUBE_CERT_PATH/etcd/etcd-client-key.pem
        scp -o StrictHostKeyChecking=no $dir/bundle_peer-ca.pem root@$ADDR:$ETCD_CERT_DIR/peer-ca.pem
    
        ssh -o StrictHostKeyChecking=no root@$ADDR chown -R etcd:etcd $ETCD_CERT_DIR
        ssh -o StrictHostKeyChecking=no root@$ADDR chmod 0644 $ETCD_CERT_DIR/*
        echo "restart etcd on node $ADDR"
        ssh -o StrictHostKeyChecking=no root@$ADDR systemctl restart etcd
        echo "etcd on node $ADDR restarted"
    
        # Verify that etcd started successfully and that the cluster is healthy
        echo "check connectivity for etcd nodes"
        check_etcd_cluster_ready
        echo "end to check connectivity for etcd nodes"
        restart_one_apiserver $ADDR
        echo "apiserver on node $ADDR restarted"
      done
    }
    
    function rotate_etcd_certs() {
      for ADDR in $ETCD_HOSTS; do
        echo "update etcd peer certs on node $ADDR"
        scp -o StrictHostKeyChecking=no \
          $dir/{peer-ca-key.pem,etcd-server.pem,etcd-server-key.pem,etcd-client.pem,etcd-client-key.pem,ca-key.pem,*-name*.pem} root@$ADDR:$ETCD_CERT_DIR/
    
        ssh -o StrictHostKeyChecking=no root@$ADDR chown -R etcd:etcd $ETCD_CERT_DIR
    
        ssh -o StrictHostKeyChecking=no root@$ADDR \
          chmod 0400 $ETCD_CERT_DIR/{peer-ca-key.pem,etcd-server.pem,etcd-server-key.pem,etcd-client.pem,etcd-client-key.pem,ca-key.pem,*-name*.pem}
    
        echo "restart etcd on node $ADDR"
        ssh -o StrictHostKeyChecking=no root@$ADDR systemctl restart etcd
        echo "etcd on node $ADDR restarted"
        echo "check connectivity for etcd nodes"
        check_etcd_cluster_ready
        echo "end to check connectivity for etcd nodes"
      done
    }
    
    function recover_etcd_ca() {
      # Update certs on etcd nodes.
      for ADDR in $ETCD_HOSTS; do
        echo "replace etcd CA on node $ADDR"
        scp -o StrictHostKeyChecking=no $dir/ca.pem root@$ADDR:$ETCD_CERT_DIR/ca.pem
        scp -o StrictHostKeyChecking=no $dir/ca.pem root@$ADDR:$KUBE_CERT_PATH/etcd/ca.pem
        scp -o StrictHostKeyChecking=no $dir/peer-ca.pem root@$ADDR:$ETCD_CERT_DIR/peer-ca.pem
        ssh -o StrictHostKeyChecking=no root@$ADDR chown -R etcd:etcd $ETCD_CERT_DIR
        echo "restart apiserver on node $ADDR"
        restart_one_apiserver $ADDR
        echo "apiserver on node $ADDR restarted"
        echo "restart etcd on node $ADDR"
        ssh -o StrictHostKeyChecking=no root@$ADDR systemctl restart etcd
        echo "etcd on node $ADDR restarted"
        echo "check connectivity for etcd nodes"
        check_etcd_cluster_ready
        echo "end to check connectivity for etcd nodes"
        sleep 5
      done
    }
    
    function recover_etcd_client_ca() {
      # Update certs on etcd nodes.
      for ADDR in $ETCD_HOSTS; do
        echo "replace etcd CA on node $ADDR"
        scp -o StrictHostKeyChecking=no $dir/ca.pem root@$ADDR:$KUBE_CERT_PATH/etcd/ca.pem
      done
    }
    
    function renew_k8s_certs() {
      # try to get region id from meta-server if not given in parameter
      META_REGION=$(get_region_id)
      if [[ -z "$REGION" ]]; then
        if [[ -z "$META_REGION" ]]; then
            echo "failed to get region id from ECS meta-server, please enter the region parameter."
            return 1
        fi
        REGION=$META_REGION
      elif [[ -n "${META_REGION}" && "$REGION" != "$META_REGION" ]] ; then
        echo "switch to use local region id $META_REGION"
        REGION=$META_REGION
      fi
      # Update certs for k8s components and kubeconfig
      for ADDR in $ETCD_HOSTS; do
        echo "renew k8s components cert on node $ADDR"
        #compatible containerd
        set +e
        IMAGE="registry.$REGION.aliyuncs.com/acs/etcd-rotate:v2.0.0"
        if is_vpc; then
          IMAGE="registry-vpc.$REGION.aliyuncs.com/acs/etcd-rotate:v2.0.0"
        fi
        echo "will pull rotate image $IMAGE"
        ssh -o StrictHostKeyChecking=no root@$ADDR docker run --privileged=true  -v /:/alicoud-k8s-host --pid host --net host \
                 $IMAGE /renew/upgrade-k8s.sh --role master
        ssh -o StrictHostKeyChecking=no root@$ADDR ctr image pull $IMAGE
        ssh -o StrictHostKeyChecking=no root@$ADDR ctr run --privileged=true --mount type=bind,src=/,dst=/alicoud-k8s-host,options=rbind:rw \
                --net-host $IMAGE cert-rotate /renew/upgrade-k8s.sh --role master
        set -e
        echo "finished renew k8s components cert on $ADDR"
      done
    }
    
    function get_region_id() {
        set +e; # close error out
        local path=100.100.100.200/latest/meta-data/region-id
        for (( i=0; i<3; i++));
        do
            response=$(curl --retry 1 --retry-delay 5 -sSL $path)
            if [[ $? -gt 0 || "x$response" == "x" ]];
            then
                sleep 2; continue
            fi
            if echo "$response"|grep -E "<title>.*</title>" >/dev/null;
            then
                sleep 3; continue
            fi
            echo "$response"
            # return from metadata succeed.
            set -e; return
        done
        set -e # open error out
        # function will return empty string when failed
    }
    
    function is_vpc() {
        # Execute the curl command and capture the network-type from ECS meta-server
        response=$(curl -s http://100.100.100.200/latest/meta-data/network-type)
        if [ "$response" = "vpc" ]; then
          return 0
        else
          return 1
        fi
    }
    
    function generate_cm() {
      echo "generate status configmap"
    
      cat <<-"EOF" >/tmp/ack-rotate-etcd-ca-cm.yaml.tpl
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-rotate-etcd-status
      namespace: kube-system
    data:
      status: "success"
      hosts: "$hosts"
    EOF
    
      sed -e "s#\$hosts#$ETCD_HOSTS#" /tmp/ack-rotate-etcd-ca-cm.yaml.tpl | kubectl apply -f -
    }
    
    function restart_one_apiserver() {
      ADDR=$1
      if [[ -z "${ADDR}" ]]; then
        printf "ADDR is empty,exit."
        exit 1
      fi
      printf "restart apiserver on node %s\n" "${ADDR}"
      scp -o StrictHostKeyChecking=no "${currentDir}"/restart-apiserver.sh root@"${ADDR}":/tmp/restart-apiserver.sh
      ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" chmod +x /tmp/restart-apiserver.sh
      ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" bash /tmp/restart-apiserver.sh
    }
    
    while
        [[ $# -gt 0 ]]
    do
        key="$1"
    
        case $key in
        --region)
          export REGION=$2
          shift
          ;;
        *)
          echo "unknown option [$key]"
          exit 1
          ;;
        esac
        shift
    done
    
    get_etcdhosts
    echo "${ETCD_HOSTS[@]}"
    
    check_container_runtime
    
    # Update certs on etcd nodes.
    echo "---restart runtime and kubelet on master nodes---"
    for ADDR in $ETCD_HOSTS; do
      if [ "$cmd" == "docker" ]; then
        echo "restart docker on node $ADDR"
        ssh -o StrictHostKeyChecking=no root@$ADDR systemctl restart docker
      fi
      ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" systemctl restart kubelet
    done
    sleep 5
    echo "---end to restart runtime and kubelet on master nodes---"
    
    echo "---renew k8s components certs---"
    renew_k8s_certs
    echo "---end to renew k8s components certs---"
    
    echo "---check cert files exist---"
    check_cert_files_exist
    echo "---end to check cert files exist---"
    
    echo "---check connectivity for etcd nodes---"
    check_etcd_cluster_ready
    echo "---end to check connectivity for etcd nodes---"
    
    # Update certs on etcd nodes.
    for ADDR in $ETCD_HOSTS; do
      scp -o StrictHostKeyChecking=no restart-apiserver.sh root@$ADDR:/tmp/restart-apiserver.sh
      ssh -o StrictHostKeyChecking=no root@$ADDR chmod +x /tmp/restart-apiserver.sh
    done
    
    gencerts
    
    echo "---rotate etcd ca and etcd client ca---"
    rotate_etcd_ca
    echo "---end to rotate etcd ca and etcd client ca---"
    
    echo "---rotate etcd peer and certs---"
    rotate_etcd_certs
    echo "---end to rotate etcd peer and certs---"
    
    echo "check etcd cluster ready"
    check_etcd_cluster_ready
    
    echo "---replace etcd ca---"
    recover_etcd_ca
    echo "---end to replace etcd ca---"
    
    generate_cm
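    echo "etcd CA and certs have successfully rotated!"

  3. Run bash rotate-etcd.sh on any master node.

    When the command output shows etcd CA and certs have successfully rotated!, the etcd certificates and the Kubernetes certificates on all master nodes have been rotated.

  4. Verify that the certificates have been updated.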

    cd /var/lib/etcd/cert
    for i in `ls | grep pem| grep -v key`;do openssl x509 -noout -text -in $i | grep -i after && echo "$i" ;done
    
    
    cd /etc/kubernetes/pki/etcd
    for i in `ls | grep pem| grep -v key`;do openssl x509 -noout -text -in $i | grep -i after && echo "$i" ;done
    
    
    cd /etc/kubernetes/pki/
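    for i in `ls | grep crt| grep -v key`;do openssl x509 -noout -text -in $i | grep -i after && echo "$i" ;done

    Note

    • If the expiration time in the output is about 50 years in the future, the rotation is complete.

    • After a successful manual rotation, the Container Service control plane cannot obtain the rotation result, so the cluster list in the console still shows the update button for the cluster. Submit a ticket to have the button cleared.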
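
The three verification loops in step 4 can be folded into one helper that flags any certificate whose expiry is not far enough in the future. This is only a sketch under stated assumptions: the function names cert_expiry_year and report_rotated_certs are hypothetical, openssl reads only the first certificate in each file, and the default threshold of 49 years is chosen because 438000h (the expiry set in rotate-etcd.sh) is slightly less than 50 calendar years.

```shell
# Hypothetical helper: print the expiry year of the first certificate in a PEM file.
# Example openssl output: "notAfter=Mar 10 12:00:00 2075 GMT" -> 2075
cert_expiry_year() {
  openssl x509 -noout -enddate -in "$1" 2>/dev/null | awk '{print $(NF-1)}'
}

# Hypothetical helper: list every *.pem and *.crt certificate in a directory
# (private keys are skipped by name) and return non-zero if any certificate
# expires earlier than <min_years> years from the current year.
report_rotated_certs() {
  local dir=$1 min_years=${2:-49} threshold f year rc=0
  threshold=$(( $(date +%Y) + min_years ))
  for f in "$dir"/*.pem "$dir"/*.crt; do
    [ -f "$f" ] || continue
    case "$(basename "$f")" in *key*) continue ;; esac
    year=$(cert_expiry_year "$f")
    case "$year" in
      ''|*[!0-9]*) echo "SKIP $f (not a readable certificate)"; rc=1; continue ;;
    esac
    if [ "$year" -ge "$threshold" ]; then
      echo "OK  $f expires in $year"
    else
      echo "OLD $f expires in $year"
      rc=1
    fi
  done
  return "$rc"
}

# Example (run on each master node):
#   for d in /var/lib/etcd/cert /etc/kubernetes/pki/etcd /etc/kubernetes/pki; do
#     report_rotated_certs "$d"
#   done
```

Any line that starts with OLD points to a certificate that was not replaced by the rotation.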

Scenario 2: Rotate etcd certificates after they have expired

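Use scenarios

  • The etcd certificates have expired.

  • The etcd certificates must be rotated while the API server is inaccessible.

  • The etcd certificates cannot be rotated automatically by deploying a template.

  • The etcd certificates cannot be updated in the console.

In any of these scenarios, a cluster administrator can log on to any master node and run the following scripts to rotate the etcd certificates manually.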

Note

The following scripts must be run as the root user.

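  1. Make sure that passwordless SSH login for the root user is configured between the master nodes of the cluster.

    From one master node, log on to any other master node over SSH. If you are prompted for a password, configure passwordless login between the master nodes as follows.

    # 1. Generate a key pair. Skip this step if a login key already exists on the node.
    ssh-keygen -t rsa
    
    # 2. Use ssh-copy-id to copy the public key to all other master nodes. Replace $(internal-ip) with the private IP address of each of the other master nodes.
    ssh-copy-id -i ~/.ssh/id_rsa.pub $(internal-ip)

    Note

    If you do not configure passwordless login, you must enter the root password when you run the scripts.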

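  2. Copy the following scripts, save them as restart-apiserver.sh and rotate-etcd.sh, and place both files in the same directory.

    Note

    The rotate-etcd.sh script tries to obtain the region from the node's metadata service and pulls the rotation image from that region. You can also pass --region xxxx when you run the script. The value is used when the metadata service cannot be reached. If the metadata service returns a different region, the script uses the metadata value.

    restart-apiserver.sh script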

    #! /bin/bash
    
    declare -x cmd
    
    k8s::wait_apiserver_ready() {
      set -e
      for i in $(seq 600); do
        if kubectl cluster-info &>/dev/null; then
          return 0
        else
          echo "wait apiserver to be ready, retry ${i}th after 1s"
          sleep 1
        fi
      done
      echo "failed to wait apiserver to be ready"
      return 1
    }
    
    function check_container_runtime() {
      if command -v dockerd &>/dev/null && ps aux | grep -q "[d]ockerd"; then
        cmd=docker
      elif command -v containerd &>/dev/null && ps aux | grep -q "[c]ontainerd"; then
        cmd=crictl
      else
        echo "Neither Dockerd nor Containerd is installed or running."
        exit 1
      fi
    }
    
    function restart_apiserver() {
      # Branch on the detected container runtime
      if [[ $cmd == "docker" ]]; then
        # Restart the kube-apiserver Pod with docker
        container_id=$(docker ps | grep kube-apiserver | awk '{print $1}' | head -n 1 )
        if [[ -n $container_id ]]; then
          echo "Restarting kube-apiserver pod using Docker: $container_id"
          docker restart "${container_id}"
        else
          echo "kube-apiserver pod not found."
        fi
      elif [[ $cmd == "crictl" ]]; then
        # Restart the kube-apiserver Pod with crictl (kubelet recreates the static Pod after the sandbox is stopped)
        pod_id=$(crictl pods --label component=kube-apiserver --latest --state=ready | grep -v "POD ID" | head -n 1 | awk '{print $1}')
        if [[ -n $pod_id ]]; then
          echo "Restarting kube-apiserver pod using crictl: $pod_id"
          crictl stopp "${pod_id}"
        else
          echo "kube-apiserver pod not found."
        fi
      else
        echo "Unsupported container runtime: $cmd"
      fi
      k8s::wait_apiserver_ready
    }
    
    check_container_runtime
    restart_apiserver
    echo "API Server restarted"

    rotate-etcd.sh script

    #!/bin/bash
    
    set -eo pipefail
    
    declare -x TARGET_TEAR
    declare -x cmd
    dir=/tmp/rollback/etcdcert
    KUBE_CERT_PATH=/etc/kubernetes/pki
    ETCD_CERT_DIR=/var/lib/etcd/cert
    ETCD_HOSTS=""
    currentDir="$PWD"
    
    # Derive the etcd member addresses from the peer certificate file names (<address>-name-<n>.pem) in $ETCD_CERT_DIR.
    function get_etcdhosts() {
      name1=$(find "$ETCD_CERT_DIR" -name '*-name-1.pem' -exec basename {} \; | sed 's/-name-1.pem//g')
      name2=$(find "$ETCD_CERT_DIR" -name '*-name-2.pem' -exec basename {} \; | sed 's/-name-2.pem//g')
      name3=$(find "$ETCD_CERT_DIR" -name '*-name-3.pem' -exec basename {} \; | sed 's/-name-3.pem//g')
    
      echo "hosts: $name1 $name2 $name3"
      ETCD_HOSTS="$name1 $name2 $name3"
    }
    
    function gencerts() {
      echo "generate ssl cert ..."
      rm -rf $dir
      mkdir -p "$dir"
      cd $dir
    
      local hosts
      hosts=$(echo $ETCD_HOSTS | tr -s " " ",")
    
      echo "generate ca"
      echo '{"CN":"CA","key":{"algo":"rsa","size":2048}, "ca": {"expiry": "438000h"}}' |
        cfssl gencert -initca - | cfssljson -bare $dir/ca -
      echo '{"signing":{"default":{"expiry":"438000h","usages":["signing","key encipherment","server auth","client auth"]}}}' >$dir/ca-config.json
    
      echo "generate etcd server certificates"
      export ADDRESS=$hosts,ext1.example.com,coreos1.local,coreos1,127.0.0.1
      export NAME=etcd-server
      echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' |
        cfssl gencert -config=$dir/ca-config.json -ca=$dir/ca.pem -ca-key=$dir/ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $dir/$NAME
      export ADDRESS=
      export NAME=etcd-client
      echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' |
        cfssl gencert -config=$dir/ca-config.json -ca=$dir/ca.pem -ca-key=$dir/ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $dir/$NAME
    
      # gen peer-ca
      echo "generate peer certificates"
      echo '{"CN":"Peer-CA","key":{"algo":"rsa","size":2048}, "ca": {"expiry": "438000h"}}' | cfssl gencert -initca - | cfssljson -bare $dir/peer-ca -
      echo '{"signing":{"default":{"expiry":"438000h","usages":["signing","key encipherment","server auth","client auth"]}}}' >$dir/peer-ca-config.json
      i=0
      for host in $ETCD_HOSTS; do
        ((i = i + 1))
        export MEMBER=${host}-name-$i
        echo '{"CN":"'${MEMBER}'","hosts":[""],"key":{"algo":"rsa","size":2048}}' |
          cfssl gencert -ca=$dir/peer-ca.pem -ca-key=$dir/peer-ca-key.pem -config=$dir/peer-ca-config.json -profile=peer \
            -hostname="$hosts,${MEMBER}.local,${MEMBER}" - | cfssljson -bare $dir/${MEMBER}
      done
    
      # chown
      chown -R etcd:etcd $dir
      chmod 0644 $dir/*
    
      for ADDR in $ETCD_HOSTS; do
        printf "sync the certificates of node %s\n" "${ADDR}"
        ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" mkdir -p "${dir}"
        scp -o StrictHostKeyChecking=no "${dir}"/* root@"${ADDR}":/var/lib/etcd/cert/
        scp -o StrictHostKeyChecking=no "${dir}"/ca.pem "${dir}"/etcd-client.pem "${dir}"/etcd-client-key.pem root@"${ADDR}":/etc/kubernetes/pki/etcd/
      done
    }
    
    function generate_cm() {
      echo "generate status configmap"
    
      cat <<-"EOF" >/tmp/ack-rotate-etcd-ca-cm.yaml.tpl
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-rotate-etcd-status
      namespace: kube-system
    data:
      status: "success"
      hosts: "$hosts"
    EOF
    
      sed -e "s#\$hosts#$ETCD_HOSTS#" /tmp/ack-rotate-etcd-ca-cm.yaml.tpl | kubectl apply -f -
    }
    
    function rotate_etcd() {
      for ADDR in $ETCD_HOSTS; do
        printf "rotate etcd's certificates on node %s\n" "${ADDR}"
          if [ "$cmd" == "docker" ]; then
            echo "restart docker on node $ADDR"
            ssh -e none -o StrictHostKeyChecking=no root@$ADDR systemctl restart docker
          fi
        ssh -e none -o StrictHostKeyChecking=no root@$ADDR systemctl restart etcd
      done
    }
    
    function rotate_apiserver() {
      echo "current dir: $currentDir"
      for ADDR in $ETCD_HOSTS; do
        printf "restart apiserver on node %s\n" "${ADDR}"
        scp -o StrictHostKeyChecking=no "${currentDir}"/restart-apiserver.sh root@"${ADDR}":/tmp/restart-apiserver.sh
        ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" systemctl restart kubelet
        ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" chmod +x /tmp/restart-apiserver.sh
        ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" bash /tmp/restart-apiserver.sh
      done
    }
    
    function check_etcd_cluster_ready() {
      local etcd_endpoints=()
      for ip in $ETCD_HOSTS; do
        etcd_endpoints+=("https://$ip:2379")
      done
    
      for i in $(seq 300); do
        for idx in "${!etcd_endpoints[@]}"; do
          endpoint="${etcd_endpoints[$idx]}"
          local health_output=$(ETCDCTL_API=3 etcdctl --cacert=/var/lib/etcd/cert/ca.pem --cert=/var/lib/etcd/cert/etcd-server.pem --key=/var/lib/etcd/cert/etcd-server-key.pem --endpoints "$endpoint" endpoint health --command-timeout=1s 2>&1)
          if echo "$health_output" | grep -q "successfully committed proposal"; then
              unset 'etcd_endpoints[$idx]'
          else
              echo "etcdctl result: ${health_output}"
              echo "$endpoint is not ready"
          fi
        done
        # shellcheck disable=SC2199
        if [[ -z "${etcd_endpoints[@]}" ]]; then
          echo "ETCD cluster is ready"
          break
        fi
        sleep 1
        printf "wait etcd cluster to be ready, retry %d after 1s,total 300s \n" "$i"
      done
    }
    
    function get_region_id() {
        set +e; # close error out
        local path=100.100.100.200/latest/meta-data/region-id
        for (( i=0; i<3; i++));
        do
            response=$(curl --retry 1 --retry-delay 5 -sSL $path)
            if [[ $? -gt 0 || "x$response" == "x" ]];
            then
                sleep 2; continue
            fi
            if echo "$response"|grep -E "<title>.*</title>" >/dev/null;
            then
                sleep 3; continue
            fi
            echo "$response"
            # return from metadata succeed.
            set -e; return
        done
        set -e # open error out
        # function will return empty string when failed
    }
    
    function is_vpc() {
        # Execute the curl command and capture the network-type from ECS meta-server
        response=$(curl -s http://100.100.100.200/latest/meta-data/network-type)
        if [ "$response" = "vpc" ]; then
          return 0
        else
          return 1
        fi
    }
    
    function renew_k8s_certs() {
      # try to get region id from meta-server if not given in parameter
      META_REGION=$(get_region_id)
      if [[ -z "$REGION" ]]; then
        if [[ -z "$META_REGION" ]]; then
            echo "failed to get region id from ECS meta-server, please enter the region parameter."
            return 1
        fi
        REGION=$META_REGION
      elif [[ -n "${META_REGION}" && "$REGION" != "$META_REGION" ]] ; then
        echo "switch to use local region id $META_REGION"
        REGION=$META_REGION
      fi
      # Update certs for k8s components and kubeconfig
      for ADDR in $ETCD_HOSTS; do
        echo "renew k8s components cert on node $ADDR"
        #compatible containerd
        set +e
        IMAGE="registry.$REGION.aliyuncs.com/acs/etcd-rotate:v2.0.0"
        if is_vpc; then
          IMAGE="registry-vpc.$REGION.aliyuncs.com/acs/etcd-rotate:v2.0.0"
        fi
        echo "will pull rotate image $IMAGE"
        ssh -o StrictHostKeyChecking=no root@$ADDR docker run --privileged=true  -v /:/alicoud-k8s-host --pid host --net host \
                 $IMAGE /renew/upgrade-k8s.sh --role master
        ssh -o StrictHostKeyChecking=no root@$ADDR ctr image pull $IMAGE
        ssh -o StrictHostKeyChecking=no root@$ADDR ctr run --privileged=true --mount type=bind,src=/,dst=/alicoud-k8s-host,options=rbind:rw \
                --net-host $IMAGE cert-rotate /renew/upgrade-k8s.sh --role master
        set -e
        echo "finished renew k8s components cert on $ADDR"
      done
    }
    
    function check_container_runtime() {
      if command -v dockerd &>/dev/null && ps aux | grep -q "[d]ockerd"; then
        cmd=docker
      elif command -v containerd &>/dev/null && ps aux | grep -q "[c]ontainerd"; then
        cmd=crictl
      else
        echo "Neither Dockerd nor Containerd is installed or running."
        exit 1
      fi
    }
    
    while
        [[ $# -gt 0 ]]
    do
        key="$1"
    
        case $key in
        --region)
          export REGION=$2
          shift
          ;;
        *)
          echo "unknown option [$key]"
          exit 1
          ;;
        esac
        shift
    done
    
    get_etcdhosts
    printf "ETCD_HOSTS: %s\n" "$ETCD_HOSTS"
    
    gencerts
    echo "---generate certificates successfully---"
    
    rotate_etcd
    echo "---rotate etcd successfully---"
    
    echo "---check etcd cluster ready---"
    check_etcd_cluster_ready
    
    rotate_apiserver
    echo "---restart apiserver successfully---"
    
    echo "---renew k8s components certs---"
    renew_k8s_certs
    echo "---end to renew k8s components certs---"
    
    generate_cm
    echo "etcd CA and certs have successfully rotated!"
    
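    rm -rf $dir

  3. Run bash rotate-etcd.sh on any master node. When the command output shows etcd CA and certs have successfully rotated!, verify that the certificates have been updated.

    cd /var/lib/etcd/cert
    for i in `ls | grep pem| grep -v key`;do openssl x509 -noout -text -in $i | grep -i after && echo "$i" ;done


    cd /etc/kubernetes/pki/etcd
    for i in `ls | grep pem| grep -v key`;do openssl x509 -noout -text -in $i | grep -i after && echo "$i" ;done


    cd /etc/kubernetes/pki/
    for i in `ls | grep crt| grep -v key`;do openssl x509 -noout -text -in $i | grep -i after && echo "$i" ;done

    Note

    • If the expiration time in the output is about 50 years in the future, the rotation is complete.

    • After a successful manual rotation, the Container Service control plane cannot obtain the rotation result, so the cluster list in the console still shows the cluster as expired. Submit a ticket to have the expired status cleared.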
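
Both rotate-etcd.sh variants finish by applying a ConfigMap named ack-rotate-etcd-status in the kube-system namespace with status set to success. As a rough sketch, assuming that kubectl on the master node can reach the API server again, that marker can be read back as an additional check. The function name rotation_succeeded is hypothetical.

```shell
# Hypothetical helper: succeeds only if the status ConfigMap written by
# rotate-etcd.sh exists and its "status" key is "success".
rotation_succeeded() {
  local status
  status=$(kubectl -n kube-system get configmap ack-rotate-etcd-status \
    -o jsonpath='{.data.status}' 2>/dev/null) || return 1
  [ "$status" = "success" ]
}

# Example:
#   if rotation_succeeded; then
#     echo "rotation status ConfigMap reports success"
#   else
#     echo "status ConfigMap missing or not successful; check the script output"
#   fi
```

The ConfigMap may be left over from an earlier rotation, so treat this as a complement to the certificate expiry check and not as a replacement for it.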

Roll back after a certificate rotation failure

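Use scenarios

  • Certificate rotation in the cloud console failed and the Kubernetes cluster must be restored.

  • Certificate rotation from the command line failed and the Kubernetes cluster must be restored.

In these scenarios, a cluster administrator can log on to any master node and run the following scripts to update the etcd certificates manually. Because the old certificates are about to expire, this operation generates a new set of etcd certificates and updates the etcd server certificates and the kube-apiserver client certificates.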

Note

The following scripts must be run as the root user.

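  1. Make sure that passwordless SSH login for the root user is configured between the master nodes of the cluster.

    From one master node, log on to any other master node over SSH. If you are prompted for a password, configure passwordless login between the master nodes as follows.

    # 1. Generate a key pair. Skip this step if a login key already exists on the node.
    ssh-keygen -t rsa
    
    # 2. Use ssh-copy-id to copy the public key to all other master nodes. Replace $(internal-ip) with the private IP address of each of the other master nodes.
    ssh-copy-id -i ~/.ssh/id_rsa.pub $(internal-ip)

    Note

    If you do not configure passwordless login, you must enter the root password when you run the scripts.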

  2. 分别复制以下脚本内容,保存并命名为restart-apiserver.shrollback-etcd.sh,然后将两者保存到同一个文件夹

    说明

    rollback-etcd.sh脚本会尝试通过访问节点的元数据服务获取Region信息并从该Region就近拉取轮转镜像,您也可以在执行该脚本时,输入参数--region xxxx指定Region信息。

    The restart-apiserver.sh script:

    #! /bin/bash
    
    declare -x cmd
    
    k8s::wait_apiserver_ready() {
      set -e
      for i in $(seq 600); do
        if kubectl cluster-info &>/dev/null; then
          return 0
        else
          echo "wait apiserver to be ready, retry ${i}th after 1s"
          sleep 1
        fi
      done
      echo "failed to wait apiserver to be ready"
      return 1
    }
    
    function check_container_runtime() {
      if command -v dockerd &>/dev/null && ps aux | grep -q "[d]ockerd"; then
        cmd=docker
      elif command -v containerd &>/dev/null && ps aux | grep -q "[c]ontainerd"; then
        cmd=crictl
      else
        echo "Neither Dockerd nor Containerd is installed or running."
        exit 1
      fi
    }
    
    function restart_apiserver() {
      # Determine the container runtime
      if [[ $cmd == "docker" ]]; then
        # Restart the kube-apiserver Pod with the docker command
        container_id=$(docker ps | grep kube-apiserver | awk '{print $1}' | head -n 1 )
        if [[ -n $container_id ]]; then
          echo "Restarting kube-apiserver pod using Docker: $container_id"
          docker restart "${container_id}"
        else
          echo "kube-apiserver pod not found."
        fi
      elif [[ $cmd == "crictl" ]]; then
        # Restart the kube-apiserver Pod with the crictl command
        pod_id=$(crictl pods --label component=kube-apiserver --latest --state=ready | grep -v "POD ID" | head -n 1 | awk '{print $1}')
        if [[ -n $pod_id ]]; then
          echo "Restarting kube-apiserver pod using crictl: $pod_id"
          crictl stopp "${pod_id}"
        else
          echo "kube-apiserver pod not found."
        fi
      else
        echo "Unsupported container runtime: $cmd"
      fi
      k8s::wait_apiserver_ready
    }
    
    check_container_runtime
    restart_apiserver
    echo "API Server restarted"

    The rollback-etcd.sh script:

    #!/bin/bash
    
    set -eo pipefail
    
    declare -x TARGET_TEAR
    declare -x cmd
    dir=/tmp/rollback/etcdcert
    KUBE_CERT_PATH=/etc/kubernetes/pki
    ETCD_CERT_DIR=/var/lib/etcd/cert
    ETCD_HOSTS=""
    currentDir="$PWD"
    
    # Renew the K8s certificates. Replace the default image region cn-hangzhou below according to the cluster's Region.
    function get_etcdhosts() {
      name1=$(find "$ETCD_CERT_DIR" -name '*-name-1.pem' -exec basename {} \; | sed 's/-name-1.pem//g')
      name2=$(find "$ETCD_CERT_DIR" -name '*-name-2.pem' -exec basename {} \; | sed 's/-name-2.pem//g')
      name3=$(find "$ETCD_CERT_DIR" -name '*-name-3.pem' -exec basename {} \; | sed 's/-name-3.pem//g')
    
      echo "hosts: $name1 $name2 $name3"
      ETCD_HOSTS="$name1 $name2 $name3"
    }
    
    function gencerts() {
      echo "generate ssl cert ..."
      rm -rf $dir
      mkdir -p "$dir"
      cd $dir
    
      local hosts
      hosts=$(echo $ETCD_HOSTS | tr -s " " ",")
    
      echo "generate ca"
      echo '{"CN":"CA","key":{"algo":"rsa","size":2048}, "ca": {"expiry": "438000h"}}' |
        cfssl gencert -initca - | cfssljson -bare $dir/ca -
      echo '{"signing":{"default":{"expiry":"438000h","usages":["signing","key encipherment","server auth","client auth"]}}}' >$dir/ca-config.json
    
      echo "generate etcd server certificates"
      export ADDRESS=$hosts,ext1.example.com,coreos1.local,coreos1,127.0.0.1
      export NAME=etcd-server
      echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' |
        cfssl gencert -config=$dir/ca-config.json -ca=$dir/ca.pem -ca-key=$dir/ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $dir/$NAME
      export ADDRESS=
      export NAME=etcd-client
      echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' |
        cfssl gencert -config=$dir/ca-config.json -ca=$dir/ca.pem -ca-key=$dir/ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $dir/$NAME
    
      # gen peer-ca
      echo "generate peer certificates"
      echo '{"CN":"Peer-CA","key":{"algo":"rsa","size":2048}, "ca": {"expiry": "438000h"}}' | cfssl gencert -initca - | cfssljson -bare $dir/peer-ca -
      echo '{"signing":{"default":{"expiry":"438000h","usages":["signing","key encipherment","server auth","client auth"]}}}' >$dir/peer-ca-config.json
      i=0
      for host in $ETCD_HOSTS; do
        ((i = i + 1))
        export MEMBER=${host}-name-$i
        echo '{"CN":"'${MEMBER}'","hosts":[""],"key":{"algo":"rsa","size":2048}}' |
          cfssl gencert -ca=$dir/peer-ca.pem -ca-key=$dir/peer-ca-key.pem -config=$dir/peer-ca-config.json -profile=peer \
            -hostname="$hosts,${MEMBER}.local,${MEMBER}" - | cfssljson -bare $dir/${MEMBER}
      done
    
      # chown
      chown -R etcd:etcd $dir
      chmod 0644 $dir/*
    
      for ADDR in $ETCD_HOSTS; do
        printf "sync the certificates of node %s\n" "${ADDR}"
        ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" mkdir -p "${dir}"
        scp -o StrictHostKeyChecking=no "${dir}"/* root@"${ADDR}":/var/lib/etcd/cert/
        scp -o StrictHostKeyChecking=no "${dir}"/ca.pem "${dir}"/etcd-client.pem "${dir}"/etcd-client-key.pem root@"${ADDR}":/etc/kubernetes/pki/etcd/
      done
    }
    
    function generate_cm() {
      echo "generate status configmap"
    
      cat <<-"EOF" >/tmp/ack-rotate-etcd-ca-cm.yaml.tpl
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-rotate-etcd-status
      namespace: kube-system
    data:
      status: "success"
      hosts: "$hosts"
    EOF
    
      sed -e "s#\$hosts#$ETCD_HOSTS#" /tmp/ack-rotate-etcd-ca-cm.yaml.tpl | kubectl apply -f -
    }
    
    function rotate_etcd() {
      for ADDR in $ETCD_HOSTS; do
        printf "rotate etcd's certificates on node %s\n" "${ADDR}"
        if [ "$cmd" == "docker" ]; then
          echo "restart docker on node $ADDR"
          ssh -e none -o StrictHostKeyChecking=no root@$ADDR systemctl restart docker
        fi
        ssh -e none -o StrictHostKeyChecking=no root@$ADDR systemctl restart etcd
      done
    }
    
    function rotate_apiserver() {
      echo "current dir: $currentDir"
      for ADDR in $ETCD_HOSTS; do
        printf "restart apiserver on node %s\n" "${ADDR}"
        scp -o StrictHostKeyChecking=no "${currentDir}"/restart-apiserver.sh root@"${ADDR}":/tmp/restart-apiserver.sh
        ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" systemctl restart kubelet
        ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" chmod +x /tmp/restart-apiserver.sh
        ssh -e none -o StrictHostKeyChecking=no root@"${ADDR}" bash /tmp/restart-apiserver.sh
      done
    }
    
    function check_etcd_cluster_ready() {
      local etcd_endpoints=()
      for ip in $ETCD_HOSTS; do
        etcd_endpoints+=("https://$ip:2379")
      done
    
      for i in $(seq 300); do
        for idx in "${!etcd_endpoints[@]}"; do
          endpoint="${etcd_endpoints[$idx]}"
          local health_output=$(ETCDCTL_API=3 etcdctl --cacert=/var/lib/etcd/cert/ca.pem --cert=/var/lib/etcd/cert/etcd-server.pem --key=/var/lib/etcd/cert/etcd-server-key.pem --endpoints "$endpoint" endpoint health --command-timeout=1s 2>&1)
          if echo "$health_output" | grep -q "successfully committed proposal"; then
              unset 'etcd_endpoints[$idx]'
          else
              echo "etcdctl result: ${health_output}"
              echo "$endpoint is not ready"
          fi
        done
        # shellcheck disable=SC2199
        if [[ -z "${etcd_endpoints[@]}" ]]; then
          echo "ETCD cluster is ready"
          break
        fi
        sleep 1
        printf "wait etcd cluster to be ready, retry %d after 1s,total 300s \n" "$i"
      done
    }
    
    function get_region_id() {
        set +e; # close error out
        local path=100.100.100.200/latest/meta-data/region-id
        for (( i=0; i<3; i++));
        do
            response=$(curl --retry 1 --retry-delay 5 -sSL $path)
            if [[ $? -gt 0 || "x$response" == "x" ]];
            then
                sleep 2; continue
            fi
            if echo "$response"|grep -E "<title>.*</title>" >/dev/null;
            then
                sleep 3; continue
            fi
            echo "$response"
            # return from metadata succeed.
            set -e; return
        done
        set -e # open error out
        # function will return empty string when failed
    }
    
    function is_vpc() {
        # Execute the curl command and capture the network-type from ECS meta-server
        response=$(curl -s http://100.100.100.200/latest/meta-data/network-type)
        if [ "$response" = "vpc" ]; then
          return 0
        else
          return 1
        fi
    }
    
    function renew_k8s_certs() {
      # try to get region id from meta-server if not given in parameter
      META_REGION=$(get_region_id)
      if [[ -z "$REGION" ]]; then
        if [[ -z "$META_REGION" ]]; then
            echo "failed to get region id from ECS meta-server, please enter the region parameter."
            return 1
        fi
        REGION=$META_REGION
      elif [[ -n "${META_REGION}" && "$REGION" != "$META_REGION" ]] ; then
        echo "switch to use local region id $META_REGION"
        REGION=$META_REGION
      fi
      # Update certs for k8s components and kubeconfig
      for ADDR in $ETCD_HOSTS; do
        echo "renew k8s components cert on node $ADDR"
        #compatible containerd
        set +e
        IMAGE="registry.$REGION.aliyuncs.com/acs/etcd-rotate:v2.0.0"
        if is_vpc; then
          IMAGE="registry-vpc.$REGION.aliyuncs.com/acs/etcd-rotate:v2.0.0"
        fi
        echo "will pull rotate image $IMAGE"
        ssh -o StrictHostKeyChecking=no root@$ADDR docker run --privileged=true  -v /:/alicoud-k8s-host --pid host --net host \
                 $IMAGE /renew/upgrade-k8s.sh --role master
        ssh -o StrictHostKeyChecking=no root@$ADDR ctr image pull $IMAGE
        ssh -o StrictHostKeyChecking=no root@$ADDR ctr run --privileged=true --mount type=bind,src=/,dst=/alicoud-k8s-host,options=rbind:rw \
                --net-host $IMAGE cert-rotate /renew/upgrade-k8s.sh --role master
        set -e
        echo "finished renew k8s components cert on $ADDR"
      done
    }
    
    function check_container_runtime() {
      if command -v dockerd &>/dev/null && ps aux | grep -q "[d]ockerd"; then
        cmd=docker
      elif command -v containerd &>/dev/null && ps aux | grep -q "[c]ontainerd"; then
        cmd=crictl
      else
        echo "Neither Dockerd nor Containerd is installed or running."
        exit 1
      fi
    }
    
    while
        [[ $# -gt 0 ]]
    do
        key="$1"
    
        case $key in
        --region)
          export REGION=$2
          shift
          ;;
        *)
          echo "unknown option [$key]"
          exit 1
          ;;
        esac
        shift
    done
    
    get_etcdhosts
    printf "ETCD_HOSTS: %s\n" "$ETCD_HOSTS"
    
    gencerts
    echo "---generate certificates successfully---"
    
    rotate_etcd
    echo "---rotate etcd successfully---"
    
    echo "---check etcd cluster ready---"
    check_etcd_cluster_ready
    
    rotate_apiserver
    echo "---restart apiserver successfully---"
    
    echo "---renew k8s components certs---"
    renew_k8s_certs
    echo "---end to renew k8s components certs---"
    
    generate_cm
    echo "etcd CA and certs have successfully rotated!"
    
    rm -rf $dir
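  3. On any Master node, run bash rollback-etcd.sh.

    When the command line prints etcd CA and certs have successfully rotated!, the etcd certificates and the K8s certificates on all Master nodes have been rotated.

  4. Verify that the certificates have been updated.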

    cd /var/lib/etcd/cert
    for i in `ls | grep pem | grep -v key`; do openssl x509 -noout -text -in $i | grep -i after && echo "$i"; done

    cd /etc/kubernetes/pki/etcd
    for i in `ls | grep pem | grep -v key`; do openssl x509 -noout -text -in $i | grep -i after && echo "$i"; done

    cd /etc/kubernetes/pki/
    for i in `ls | grep crt | grep -v key`; do openssl x509 -noout -text -in $i | grep -i after && echo "$i"; done
Note

If the expiration times printed by the commands above are about 50 years in the future, the rotation is complete.
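
The 50-year figure follows from the expiry value 438000h that the scripts above pass to cfssl for the CA and the signing profiles. As a quick arithmetic check, the sketch below converts that value to years, assuming 365 days per year and ignoring leap days. The function name expiry_hours_to_years is hypothetical.

```shell
#!/bin/bash

# Convert a cfssl-style expiry such as "438000h" into whole years,
# counting 365 days per year (leap days are ignored).
expiry_hours_to_years() {
  local expiry="$1"
  local hours="${expiry%h}"   # strip the trailing "h"
  echo $(( hours / (24 * 365) ))
}

expiry_hours_to_years 438000h   # prints 50
```

Because leap days are ignored, the actual notAfter date lands a few days earlier than exactly 50 calendar years after issuance, which is why the note says "about 50 years".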