Best Practices for Component Onboarding

Last updated: 2022-07-14 15:12:03

This topic describes best practices for onboarding components.

Version numbers

Versioning convention

The version format is MAJOR.MINOR.PATCH, incremented as follows:

  • MAJOR version: when you make incompatible API changes

  • MINOR version: when you add functionality in a backward-compatible manner

  • PATCH version: when you make backward-compatible bug fixes

Pre-release identifiers and build metadata may be appended to "MAJOR.MINOR.PATCH" as extensions. For example, while a component image is still being debugged, its pre-release version can be written as x.y.z-beta.1.
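As a concrete illustration, the increment rules can be sketched as small shell helpers (the `bump_*` function names are our own, and the sketch assumes a plain `x.y.z` input without validation):

```shell
# Hypothetical helpers illustrating the increment rules; each splits a
# plain "MAJOR.MINOR.PATCH" string on dots inside a subshell so the
# modified IFS does not leak out.
bump_major() (
  IFS=.
  set -- $1
  echo "$(($1 + 1)).0.0"      # incompatible API change
)
bump_minor() (
  IFS=.
  set -- $1
  echo "$1.$(($2 + 1)).0"     # backward-compatible feature
)
bump_patch() (
  IFS=.
  set -- $1
  echo "$1.$2.$(($3 + 1))"    # backward-compatible bug fix
)

bump_major 1.4.2   # -> 2.0.0
bump_minor 1.4.2   # -> 1.5.0
bump_patch 1.4.2   # -> 1.4.3
```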

Version number mapping

An ADP component carries two version numbers: the component (chart) version and the application version.

The mapping is shown in the following example:

# Helm chart Chart.yaml
apiVersion: v1
version: 0.1.0    # component version
description: cn app cn-app-events
name: cn-app-events
appVersion: 1.0.0 # application version

Best practices for charts on ADP

Basic chart requirements

  1. A chart in the component center must be installable with Helm v3.

  2. When installed with Helm v3, the chart must support multiple installations under different release names without resource conflicts.

  3. When installed with Helm v3, the chart must be installable into any specified namespace, and every resource the chart creates must be placed in that namespace.
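Requirements 2 and 3 are easy to smoke-test by hand (a sketch; the chart path `./myapp`, the release names, and the namespaces below are placeholders):

```shell
# Install the same chart twice under different release names and
# namespaces; both installs must succeed with no resource conflicts.
helm install demo-a ./myapp -n team-a --create-namespace
helm install demo-b ./myapp -n team-b --create-namespace

# Render the chart and list any namespace fields it emits; none of them
# should be hard-coded to a fixed value.
helm template demo-a ./myapp -n team-a | grep -n 'namespace:'
```

These commands require a reachable cluster; they are a manual check rather than an automated gate.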

Configuration example

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ include "minio.fullname" . }} 
  labels:
    app: {{ include "minio.fullname" . }}
    component: minio
    {{- include "minio.labels" . | nindent 4 }}
# 1. Every resource name is templated, normally equal to the release name or extended with a prefix/suffix: xxxx-{{ include "minio.fullname" . }}
# 2. Leave the namespace field of every resource empty.
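Point 2 can also be checked mechanically before submitting a chart. A minimal sketch (the function name is ours) that fails when any template hard-codes a namespace:

```shell
# Succeeds only when no file under <chart>/templates contains a
# hard-coded metadata.namespace field; chart resources should leave the
# field empty so `helm install -n <ns>` decides placement.
check_no_hardcoded_namespace() {
  ! grep -RqE '^[[:space:]]+namespace:' "$1/templates" 2>/dev/null
}
```

Note that templates deliberately setting `namespace: {{ .Release.Namespace }}` would also be flagged; adapt the pattern if your chart uses that form.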

Chart file structure

Basic layout

myapp
├── Chart.yaml
├── templates
└── values.yaml

If a component consists of multiple modules or applications, two organizational approaches can be considered depending on the scenario.

Option 1: one chart per application (module) (suitable for components that separate the control plane from the data plane)

charts
├── mysql
│   ├── Chart.yaml
│   ├── templates
│   └── values.yaml
└── mysql-operator
    ├── Chart.yaml
    ├── templates
    └── values.yaml

Option 2: a single chart composed of multiple modules

kube-prometheus-stack
├── templates
│   ├── prometheus
│   ├── alertmanager
│   ├── grafana
│   ├── prometheus-operator
│   └── exporter
├── crds
│   ├── crd0.yaml
│   └── crd1.yaml
├── Chart.yaml
└── values.yaml

Storage configuration conventions

  1. If a component chart needs a PVC, leave the PVC's .spec.storageClassName field empty; the target environment then automatically uses its default StorageClass (standard Kubernetes behavior).

    1. The default StorageClass on the ADP base platform is "yoda-lvm-default".

  2. For middleware with special disk requirements, the following is recommended:

    1. Default the storageClass used in the submitted chart to "" so that the field is omitted and the chart adapts to any base platform that has a default StorageClass configured.

    2. Middleware that requires a dedicated block device or filesystem must document the specific StorageClass type it depends on and describe how to configure that StorageClass through values.yaml.

    Example

## statefulset.yaml

{{- $storageClass := .Values.persistence.storageClass -}}
volumeClaimTemplates:
    - metadata:
        name: export-1
      spec:
        accessModes: [ {{ .Values.persistence.accessMode }} ]
        {{- if $storageClass }}
        storageClassName: {{ $storageClass }}
        {{- end }}
        resources:
          requests:
            storage: {{ .Values.persistence.size }}
 
# values.yaml
persistence:
  enabled: true
  #storageClass: default
  VolumeName: cluster-minio
  accessMode: ReadWriteOnce
  size: 5Gi
  subPath: ""
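With the template above, the StorageClass choice is made at install time (a sketch; the chart path `./minio` and the release name are placeholders, and the value key follows the values.yaml example):

```shell
# Default: persistence.storageClass stays empty, so storageClassName is
# omitted and the platform's default StorageClass (yoda-lvm-default on
# an ADP base) is used.
helm install my-minio ./minio

# Middleware with special disk needs: pin an explicit StorageClass.
helm install my-minio ./minio --set persistence.storageClass=yoda-lvm-default
```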

Liveness probe

Configure a liveness probe for every container (spec.containers[].livenessProbe):

# Source: mysql-operator/templates/deployment.yaml
apiVersion: apps/v1
spec:
  replicas: 1
  template:
    spec:
      containers:
        - name: mysql-operator
          securityContext:
            {}
          image: "harbor.middleware.com/middleware/mysql-operator:v1.0.18"
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10880
              scheme: HTTP
            initialDelaySeconds: 5
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 5

Readiness probe

Configure a readiness probe for every container (spec.containers[].readinessProbe):

# Source: mysql-operator/templates/deployment.yaml
apiVersion: apps/v1
spec:
  replicas: 1
  template:
    spec:
      containers:
        - name: mysql-operator
          securityContext:
            {}
          image: "harbor.middleware.com/middleware/mysql-operator:v1.0.18"
          command:
          - ./mysql-operator
          - --v=5
          - --leader-elect=true
          - --port=10880
          - --operatorname=mysql-operator
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10880
              scheme: HTTP
            initialDelaySeconds: 5
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 5

Pod Request/Limit

Configure resource requests and limits for every container (spec.containers[].resources):

# Source: mysql-operator/templates/deployment.yaml
apiVersion: apps/v1
spec:
  replicas: 1
  template:
    spec:
      containers:
        - name: mysql-operator
          securityContext:
            {}
          image: "harbor.middleware.com/middleware/mysql-operator:v1.0.18"
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
          resources:
            limits:
              cpu: 200m
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 256Mi

For metrics integration, alert integration, and monitoring-dashboard integration, see Monitoring and Alerting.

For log integration and log-based alert integration, see Log Service.

CRD upgrades

Helm natively installs CRDs only on first installation. [Helm's CRD best practices](https://helm.sh/docs/chart_best_practices/custom_resource_definitions/) propose a separate CRD chart (Method 2 there), but that approach carries operational overhead. The following reference Helm templates upgrade CRDs with a batch/v1 Job instead:

Method 1: put the crds/*.yaml files into a ConfigMap and apply them with a Job:

{{- if .Values.upgradeCRDs.enabled }}
# The chart's CRD YAML files live in <chart>/crds/; the following writes them all into a ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "myapp.fullname" . }}-crds
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
data:
{{ (.Files.Glob "crds/*.yaml").AsConfig | indent 2 }}

---
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ template "myapp.fullname" . }}-upgrade-crds
  annotations:
    "helm.sh/hook": post-install,post-upgrade,post-rollback
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
spec:
  backoffLimit: 3
  template:
    metadata:
      name: {{ template "myapp.fullname" . }}-upgrade-crds
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: kubectl 
          image: {{ .Values.upgradeCRDs.image.repoTag }}
          imagePullPolicy: {{ .Values.upgradeCRDs.image.pullPolicy }}
          command:
            - /bin/sh
            - -c
            - for f in /crds/*.yaml ; do kubectl replace -f $f ; done
          resources:
{{ toYaml .Values.upgradeCRDs.image.resources | nindent 12 }}
          volumeMounts:
            # - name: kubeconfig
            #   mountPath: /root/.kube/config
            #   readOnly: true
            - name: crds
              mountPath: /crds
              readOnly: true
      volumes:
        # Mounting the master node's `/root/.kube/config` is a bit of a hack;
        #   to avoid that dependency, add ServiceAccount-based access instead
        # - name: kubeconfig
        #   hostPath:
        #     path: /root/.kube/config
        - name: crds
          configMap:
            name: {{ include "myapp.fullname" . }}-crds
      serviceAccountName: {{ template "myapp.fullname" . }}-upgrade-crds
      restartPolicy: OnFailure
      {{- with .Values.upgradeCRDs.nodeSelector }}
      nodeSelector:
{{ toYaml . | nindent 8 }}
      {{- else }}
      nodeSelector:
        "node-role.kubernetes.io/master": ""
      {{- end }}
      {{- with .Values.upgradeCRDs.affinity }}
      affinity:
{{ toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.upgradeCRDs.tolerations }}
      tolerations:
{{ toYaml . | nindent 8 }}
      {{- end }}
      securityContext:
        # runAsGroup: 2000
        runAsNonRoot: false
        # runAsUser: 2000
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ template "myapp.fullname" . }}-upgrade-crds
  annotations:
    "helm.sh/hook": post-install,post-upgrade,post-rollback
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ template "myapp.fullname" . }}-upgrade-crds
  annotations:
    "helm.sh/hook": post-install,post-upgrade,post-rollback
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
subjects:
  - kind: ServiceAccount
    namespace: {{ .Release.Namespace }}
    name: {{ template "myapp.fullname" . }}-upgrade-crds
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
{{- end }}
### values.yaml
upgradeCRDs:
  enabled: true
  image:
    # kubectl v1.22 image; you can swap in any kubectl image you need, but
    #  make sure `/bin/sh` is present in it as well
    repoTag: registry.cn-shanghai.aliyuncs.com/cnx-platform/cn-opcc:kubectl-1.22.2-r0
    pullPolicy: IfNotPresent
    resources: {}
  nodeSelector: {}
    # "node-role.kubernetes.io/master": ""
  affinity: {}
  tolerations: []

Advantages:

  • Simple

Drawbacks:

  • If the CRD files are large or numerous, their combined size can exceed the maximum payload a single ConfigMap object can hold (1 MiB); in that case use Method 2.
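Whether Method 1 is viable can be estimated up front by sizing the chart's CRD payload against the ConfigMap limit (a sketch; the function name is ours):

```shell
# Succeeds when the chart's crds/*.yaml files together fit within a
# single ConfigMap's ~1 MiB data limit, fails otherwise.
crds_fit_in_configmap() {
  total=$(cat "$1"/crds/*.yaml 2>/dev/null | wc -c)
  [ "${total:-0}" -le 1048576 ]
}
```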

Method 2: ship the CRD YAML files in an init container and apply them with a Job:

{{- if not (empty .Values.upgradeCRDs) }}
{{- if .Values.upgradeCRDs.enabled }}
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ template "kube-prometheus-stack.fullname" . }}-upgrade-crds
  annotations:
    "helm.sh/hook": post-install,post-upgrade,post-rollback
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
  labels:
    {{- include "kube-prometheus-stack.labels" . | nindent 4 }}
spec:
  backoffLimit: 3
  template:
    metadata:
      name: {{ template "kube-prometheus-stack.fullname" . }}-upgrade-crds
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      initContainers:
        - name: crds
          image: {{ .Values.upgradeCRDs.crdResourcesImage.repoTag }}
          imagePullPolicy: {{ .Values.upgradeCRDs.crdResourcesImage.pullPolicy }}
          command:
            - /bin/sh
            - -c
            - cp /crds/* /tmp/crds/
          volumeMounts:
            - mountPath: /tmp/crds
              name: crds
      containers:
        - name: kubectl
          image: {{ .Values.upgradeCRDs.image.repoTag }}
          imagePullPolicy: {{ .Values.upgradeCRDs.image.pullPolicy }}
          command:
            - /bin/sh
            - -c
            - for f in /crds/*.yaml ; do kubectl replace -f $f ; done
          resources:
{{ toYaml .Values.upgradeCRDs.image.resources | nindent 12 }}
          volumeMounts:
            # - name: kubeconfig
            #   mountPath: /root/.kube/config
            #   readOnly: true
            - mountPath: /crds
              name: crds
              readOnly: true
      volumes:
        # - name: kubeconfig
        #   hostPath:
        #     path: /root/.kube/config
        - name: crds
          emptyDir: {}
      serviceAccountName: {{ template "kube-prometheus-stack.fullname" . }}-upgrade-crds
      restartPolicy: OnFailure
      {{- with .Values.upgradeCRDs.nodeSelector }}
      nodeSelector:
{{ toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.upgradeCRDs.affinity }}
      affinity:
{{ toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.upgradeCRDs.tolerations }}
      tolerations:
{{ toYaml . | nindent 8 }}
      {{- end }}
      securityContext:
        # runAsGroup: 2000
        runAsNonRoot: false
        # runAsUser: 2000
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ template "kube-prometheus-stack.fullname" . }}-upgrade-crds
  annotations:
    "helm.sh/hook": post-install,post-upgrade,post-rollback
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ template "kube-prometheus-stack.fullname" . }}-upgrade-crds
  annotations:
    "helm.sh/hook": post-install,post-upgrade,post-rollback
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
subjects:
  - kind: ServiceAccount
    namespace: {{ .Release.Namespace }}
    name: {{ template "kube-prometheus-stack.fullname" . }}-upgrade-crds
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
{{- end }}
{{- end }}

### values.yaml
upgradeCRDs:
  enabled: true
  image:
    # kubectl v1.22 image; you can swap in any kubectl image you need, but
    #  make sure `/bin/sh` is present in it as well
    repoTag: registry.cn-shanghai.aliyuncs.com/cnx-platform/cn-opcc:kubectl-1.22.2-r0
    pullPolicy: IfNotPresent
    resources: {}
  crdResourcesImage:
    # Image that ships the CRD resource YAML files, placed under the /crds/ directory
    repoTag: <CRD image>
    pullPolicy: IfNotPresent
    resources: {}
  nodeSelector: {}
    # "node-role.kubernetes.io/master": ""
  affinity: {}
  tolerations: []