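Implement graceful start and shutdown for microservice applications with MSE

To avoid losing traffic during a release, teams that run high-concurrency applications often deploy only during off-peak hours. This reduces the impact but does not eliminate it, and the operational cost caused by uncontrollable factors remains high. MSE optimizes both the startup and the shutdown flow of an application. During shutdown, MSE uses adaptive waiting and proactive notification so that an instance goes offline only after all pending requests have completed, which avoids abrupt service interruption. During startup, MSE uses readiness checks and aligns the microservice lifecycle with each stage of the release, which keeps the startup of the new version stable.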
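
The shutdown behavior described above (notify callers, then wait only as long as requests are still in flight) can be sketched in a few lines of Python. This is a toy illustration of the general idea, not MSE's implementation: the class `GracefulServer`, the `notify_consumers` callback, and the `max_wait_seconds` and `poll_seconds` parameters are all hypothetical names chosen for this example.

```python
import threading
import time


class GracefulServer:
    """Toy server that tracks in-flight requests and drains them before stopping."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = 0
        self._accepting = True

    def handle(self, work_seconds):
        """Process one request; return False if the server no longer accepts traffic."""
        with self._lock:
            if not self._accepting:
                return False
            self._in_flight += 1
        try:
            time.sleep(work_seconds)  # stands in for real request handling
            return True
        finally:
            with self._lock:
                self._in_flight -= 1

    def shutdown(self, notify_consumers, max_wait_seconds=5.0, poll_seconds=0.01):
        """Stop accepting, notify callers, then wait until in-flight requests finish.

        Returns True if every in-flight request completed before the deadline.
        """
        with self._lock:
            self._accepting = False
        notify_consumers()  # proactive notification (hypothetical callback)
        deadline = time.monotonic() + max_wait_seconds
        while time.monotonic() < deadline:
            with self._lock:
                if self._in_flight == 0:
                    return True  # adaptive wait: leave as soon as the server is idle
            time.sleep(poll_seconds)
        with self._lock:
            return self._in_flight == 0
```

The wait is adaptive in the sense that it ends as soon as the in-flight counter reaches zero instead of always sleeping for a fixed period; `max_wait_seconds` only bounds the worst case.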

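Prerequisites

Demo architecture

The demo application consists of a Zuul gateway and backend Spring Cloud microservice instances. The backend call chain is shopping cart application A, transaction center application B, and inventory center application C. These services register with and discover each other through a Nacos registry.

As shown in the following figure, the spring-cloud-zuul application calls the gray version and the base version of spring-cloud-a at the same time, each at a rate of 100 QPS.

Figure: application deployment and traffic architecture

Deploy the applications and connect them to microservices governance

Important

The demo includes CronHPA jobs, so install the ack-kubernetes-cronhpa-controller component in the cluster first. For details, see Step 1: Install the CronHPA component.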

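  1. Log on to the Container Service management console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the target cluster. Then, in the left-side navigation pane, choose Workloads > Deployments.

  3. On the Deployments page, click Create from YAML.

  4. For Sample Template, select Custom, paste the following YAML as the template content, and then click Create.

    In this example, the demo file is named mse-demo.yaml. It deploys the Zuul gateway and applications A, B, and C. Applications A and B each have a base version and a gray version. The base version of application B has graceful shutdown disabled, and its gray version has graceful shutdown enabled. Application C is the application for which service warm-up is enabled, with a warm-up duration of 120 seconds.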


    # Nacos Server
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: nacos-server
      name: nacos-server
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nacos-server
      template:
        metadata:
          labels:
            app: nacos-server
            msePilotCreateAppName: nacos-server
            msePilotAutoEnable: "on"
        spec:
          containers:
          - env:
            - name: MODE
              value: standalone
            image: registry.cn-shanghai.aliyuncs.com/yizhan/nacos-server:latest
            imagePullPolicy: Always
            name: nacos-server
            resources:
              requests:
                cpu: 250m
                memory: 512Mi
          dnsPolicy: ClusterFirst
          restartPolicy: Always
    
    # Nacos Server Service configuration
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nacos-server
    spec:
      ports:
      - port: 8848
        protocol: TCP
        targetPort: 8848
      selector:
        app: nacos-server
      type: ClusterIP
    
    # Zuul gateway (entry application)
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: spring-cloud-zuul
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: spring-cloud-zuul
      template:
        metadata:
          labels:
            app: spring-cloud-zuul
            msePilotCreateAppName: spring-cloud-zuul
            msePilotAutoEnable: "on"
        spec:
          containers:
            - env:
                - name: JAVA_HOME
                  value: /usr/lib/jvm/java-1.8-openjdk/jre
                - name: LANG
                  value: C.UTF-8
              image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-zuul:1.0.1
              imagePullPolicy: Always
              name: spring-cloud-zuul
              ports:
                - containerPort: 20000
    
    # Application A, base version: enables end-to-end tag pass-through at the machine dimension.
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: spring-cloud-a
      name: spring-cloud-a
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: spring-cloud-a
      template:
        metadata: 
          labels:
            app: spring-cloud-a
            msePilotCreateAppName: spring-cloud-a
            msePilotAutoEnable: "on"
        spec:
          containers:
          - env:
            - name: LANG
              value: C.UTF-8
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            - name: profiler.micro.service.tag.trace.enable
              value: "true"
            image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-a:0.1-SNAPSHOT
            imagePullPolicy: Always
            name: spring-cloud-a
            ports:
            - containerPort: 20001
              protocol: TCP
            resources:
              requests:
                cpu: 250m
                memory: 512Mi
            livenessProbe:
              tcpSocket:
                port: 20001
              initialDelaySeconds: 10
              periodSeconds: 30
    
    # Application A, gray version: enables end-to-end tag pass-through at the machine dimension.
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: spring-cloud-a-gray
      name: spring-cloud-a-gray
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: spring-cloud-a-gray
      template:
        metadata:
          labels:
            alicloud.service.tag: gray
            app: spring-cloud-a-gray
            msePilotCreateAppName: spring-cloud-a
            msePilotAutoEnable: "on"
        spec:
          containers:
          - env:
            - name: LANG
              value: C.UTF-8
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            - name: profiler.micro.service.tag.trace.enable
              value: "true"
            image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-a:0.1-SNAPSHOT
            imagePullPolicy: Always
            name: spring-cloud-a-gray
            ports:
            - containerPort: 20001
              protocol: TCP
            resources:
              requests:
                cpu: 250m
                memory: 512Mi
            livenessProbe:
              tcpSocket:
                port: 20001
              initialDelaySeconds: 10
              periodSeconds: 30
    
    # Application B, base version: graceful shutdown disabled.
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: spring-cloud-b
      name: spring-cloud-b
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: spring-cloud-b
      template:
        metadata:
          labels:
            app: spring-cloud-b
            msePilotCreateAppName: spring-cloud-b
            msePilotAutoEnable: "on"
        spec:
          containers:
          - env:
            - name: LANG
              value: C.UTF-8
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            - name: micro.service.shutdown.server.enable
              value: "false"
            - name: profiler.micro.service.http.server.enable
              value: "false"
            image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-b:0.1-SNAPSHOT
            imagePullPolicy: Always
            name: spring-cloud-b
            ports:
            - containerPort: 20002
              protocol: TCP
            resources:
              requests:
                cpu: 250m
                memory: 512Mi
            livenessProbe:
              tcpSocket:
                port: 20002
              initialDelaySeconds: 10
              periodSeconds: 30
    
    # Application B, gray version: graceful shutdown enabled by default.
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: spring-cloud-b-gray
      name: spring-cloud-b-gray
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: spring-cloud-b-gray
      template:
        metadata:
          labels:
            alicloud.service.tag: gray
            app: spring-cloud-b-gray
            msePilotCreateAppName: spring-cloud-b
            msePilotAutoEnable: "on"
        spec:
          containers:
          - env:
            - name: LANG
              value: C.UTF-8
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-b:0.1-SNAPSHOT
            imagePullPolicy: Always
            name: spring-cloud-b-gray
            ports:
            - containerPort: 20002
              protocol: TCP
            resources:
              requests:
                cpu: 250m
                memory: 512Mi
            livenessProbe:
              tcpSocket:
                port: 20002
              initialDelaySeconds: 10
              periodSeconds: 30
    
    # Application C, base version
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: spring-cloud-c
      name: spring-cloud-c
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: spring-cloud-c
      template:
        metadata:
          labels:
            app: spring-cloud-c
            msePilotCreateAppName: spring-cloud-c
            msePilotAutoEnable: "on"
        spec:
          containers:
          - env:
            - name: LANG
              value: C.UTF-8
            - name: JAVA_HOME
              value: /usr/lib/jvm/java-1.8-openjdk/jre
            image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-c:0.1-SNAPSHOT
            imagePullPolicy: Always
            name: spring-cloud-c
            ports:
            - containerPort: 20003
              protocol: TCP
            resources:
              requests:
                cpu: 250m
                memory: 512Mi
            livenessProbe:
              tcpSocket:
                port: 20003
              initialDelaySeconds: 10
              periodSeconds: 30
    
    # CronHPA configuration
    ---
    apiVersion: autoscaling.alibabacloud.com/v1beta1
    kind: CronHorizontalPodAutoscaler
    metadata:
      labels:
        controller-tools.k8s.io: "1.0"
      name: spring-cloud-b
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: spring-cloud-b
      jobs:
      # Schedules use six fields, seconds first: second minute hour day month weekday.
      - name: "scale-down"
        schedule: "0 0/5 * * * *"
        targetSize: 1
      - name: "scale-up"
        schedule: "10 0/5 * * * *"
        targetSize: 2
    ---
    apiVersion: autoscaling.alibabacloud.com/v1beta1
    kind: CronHorizontalPodAutoscaler
    metadata:
      labels:
        controller-tools.k8s.io: "1.0"
      name: spring-cloud-b-gray
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: spring-cloud-b-gray
      jobs:
      - name: "scale-down"
        schedule: "0 0/5 * * * *"
        targetSize: 1
      - name: "scale-up"
        schedule: "10 0/5 * * * *"
        targetSize: 2
    ---
    apiVersion: autoscaling.alibabacloud.com/v1beta1
    kind: CronHorizontalPodAutoscaler
    metadata:
      labels:
        controller-tools.k8s.io: "1.0"
      name: spring-cloud-c
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: spring-cloud-c
      jobs:
      - name: "scale-down"
        schedule: "0 2/5 * * * *"
        targetSize: 1
      - name: "scale-up"
        schedule: "10 2/5 * * * *"
        targetSize: 2
    
    
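    # Service for the Zuul gateway demo page. It is declared as ClusterIP here; set type to LoadBalancer to expose it through an SLB.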
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: zuul-slb
    spec:
      ports:
        - port: 80
          protocol: TCP
          targetPort: 20000
      selector:
        app: spring-cloud-zuul
      type: ClusterIP
    
    # Kubernetes Services for application A
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: spring-cloud-a-base
    spec:
      ports:
        - name: http
          port: 20001
          protocol: TCP
          targetPort: 20001
      selector:
        app: spring-cloud-a
    
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: spring-cloud-a-gray
    spec:
      ports:
        - name: http
          port: 20001
          protocol: TCP
          targetPort: 20001
      selector:
        app: spring-cloud-a-gray
    
    # Nacos Server SLB Service configuration
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nacos-slb
    spec:
      ports:
      - port: 8848
        protocol: TCP
        targetPort: 8848
      selector:
        app: nacos-server
      type: LoadBalancer
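
  5. Connect the deployed applications to microservices governance. For details, see Connect ACK and ACS microservice applications to the MSE Governance Center (Java).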

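View observability data

CronHPA is enabled for both the spring-cloud-b and spring-cloud-b-gray applications to simulate a scheduled scale-in and scale-out every 5 minutes. To view the jobs, click the application name, and then click the Pod Scaling tab.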

  • spring-cloud-b application:

    [Figure: CronHPA jobs of spring-cloud-b]

  • spring-cloud-b-gray application:

    [Figure: CronHPA jobs of spring-cloud-b-gray]

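  1. Log on to the MSE Governance Center console, and select a region in the top menu bar.

  2. In the left-side navigation pane, choose Governance Center > Application Governance, and then click the resource card of the spring-cloud-a application.

  3. The Application Overview page shows the observability data of the application.

    The data shows that:

    • For the spring-cloud-a-gray version, whose downstream spring-cloud-b-gray version has graceful shutdown enabled, the number of request errors during pod scaling is 0, so no traffic is lost.

    • For the spring-cloud-a base version, whose downstream spring-cloud-b base version has graceful shutdown disabled, 20 requests sent from spring-cloud-a to spring-cloud-b failed during pod scaling, so traffic was lost.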

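Enable graceful start

CronHPA is enabled for the spring-cloud-c application to simulate application startup. It scales once every 5 minutes: within each cycle, it scales in to 1 replica at minute 2, second 0, and scales out to 2 replicas at minute 2, second 10.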
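
To make the timing concrete, the following minimal Python sketch expands the two spring-cloud-c schedules from the demo YAML into the minute and second at which they fire within one hour. It assumes the six-field layout with seconds first and treats `N/S` as "start at N, then every S", which matches the behavior described above; the helper names `expand_field` and `fire_times_in_hour` are made up for this example, and only the first two fields are interpreted.

```python
def expand_field(field, lo, hi):
    """Expand one cron field ('*', 'N', 'N/S' or '*/S') into the values it matches."""
    base, _, step = field.partition("/")
    start = lo if base == "*" else int(base)
    if not step:
        # No step: '*' matches the whole range, a plain number matches itself.
        return list(range(lo, hi + 1)) if base == "*" else [start]
    # With a step: start at the base value and advance by the step up to the maximum.
    return list(range(start, hi + 1, int(step)))


def fire_times_in_hour(schedule):
    """Return (minute, second) pairs at which a six-field schedule fires in one hour."""
    second, minute = schedule.split()[:2]
    return [(m, s)
            for m in expand_field(minute, 0, 59)
            for s in expand_field(second, 0, 59)]


scale_down = fire_times_in_hour("0 2/5 * * * *")
scale_up = fire_times_in_hour("10 2/5 * * * *")
print(scale_down[:3])  # [(2, 0), (7, 0), (12, 0)]
print(scale_up[:3])    # [(2, 10), (7, 10), (12, 10)]
```

Each scale-out therefore follows its scale-in by 10 seconds, which gives a fresh spring-cloud-c pod a startup to observe in every 5-minute cycle.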

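  1. Log on to the MSE Governance Center console, and select a region in the top menu bar.

  2. In the left-side navigation pane, choose Governance Center > Application Governance, and then click the resource card of the spring-cloud-c application.

  3. In the left-side navigation pane, choose Traffic Governance, click the Graceful Start and Shutdown tab, turn on the Graceful Start switch, and then click OK in the message dialog box. The warm-up duration defaults to 120 seconds.

    After an application with warm-up enabled restarts, the traffic it receives increases gradually over time. In slow-start scenarios, where the application must pre-create connection pools, caches, and other resources during startup, service warm-up lets these resources be created in an orderly way, so that the application starts safely and no traffic is lost.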
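
The gradual traffic increase can be pictured as a load-balancing weight that grows with instance uptime. The sketch below assumes a simple linear ramp over the 120-second warm-up window; the source does not specify the curve MSE actually uses, and `warmup_weight`, `pick_instance`, and the weight scale of 100 are hypothetical choices for illustration only.

```python
import random


def warmup_weight(uptime_seconds, warmup_seconds=120, full_weight=100):
    """Return a routing weight that ramps linearly from 1 to full_weight (assumed curve)."""
    if uptime_seconds <= 0:
        return 1
    if uptime_seconds >= warmup_seconds:
        return full_weight
    # Keep at least weight 1 so a fresh instance still receives a trickle of traffic.
    return max(1, int(full_weight * uptime_seconds / warmup_seconds))


def pick_instance(instances, now, rng=random):
    """Pick one instance name at random, weighted by how far each is through warm-up.

    instances is a list of (name, started_at) pairs; now uses the same clock.
    """
    names = [name for name, _ in instances]
    weights = [warmup_weight(now - started_at) for _, started_at in instances]
    return rng.choices(names, weights=weights, k=1)[0]


print(warmup_weight(30))   # 25
print(warmup_weight(120))  # 100
```

With this ramp, an instance that started 30 seconds ago carries a quarter of the weight of a fully warmed instance, so its connection pools and caches fill under light load before it takes a full share of requests.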