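Cluster Scheduling
The ADP base platform provides fine-grained, cluster-wide scheduling. In a single configuration file, and from a global view, you can select Workloads by business scenario (for example middleware, core business applications, and non-core business applications) and then apply unified scheduling policies, resource isolation policies, and so on.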
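Spec Definition
Workload rollout distribution scheduling policy - WorkloadSpread
WorkloadSpread patches .spec.affinity.*, .spec.topologySpreadConstraints[], and .spec.tolerations[] on the Pods of matching workloads. Its main purpose is to provide cluster-wide, selector-based scheduling configuration, so that each application does not need its own individual settings. With Helm Charts, templates that do not expose these fields would otherwise require some rework; configuring a WorkloadSpread API object supplies the scheduling conditions instead.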
# API Version Info (namespaced=false)
apiVersion: opcc.cnx.aliyun-inc.com/v1alpha1
kind: WorkloadSpread
metadata:
  name: <workloadSpreadName>
spec:
  spreadGroups:
  - name: group-a
    targets:
    # targetRef - the workload type. Valid GVKs are apps/v1 [StatefulSet, Deployment, ReplicaSet]
    # or a CustomResourceDefinition that has associated workloads. An apps/v1
    # [StatefulSet, Deployment, ReplicaSet] that has ownerReferences is resolved to its owner
    # resource and is not itself matched against the label selector.
    - targetRef:
      - apiGroup: apps
        apiVersion: v1
        kind: StatefulSet
      - apiGroup: apps
        apiVersion: v1
        kind: Deployment
      # labelSelector - label selection (required)
      # ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#resources-that-support-set-based-requirements
      labelSelector:
        matchLabels:
          some-res-label: some-res-label-value
        matchExpressions:
        - key: another-node-label-key
          # operator - Represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists, and DoesNotExist.
          operator: In
          values:
          - another-node-label-value
      # namespaceSelector - additional filtering by Namespace
      namespaceSelector:
        matchNames:
        - default
    affinity:
      # nodeAffinity - node affinity matching
      # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#node-affinity
      nodeAffinity:
        # requiredDuringSchedulingIgnoredDuringExecution - hard (mandatory) scheduling rules
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              # operator - Represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
              operator: In
              values:
              - zone-a
        # preferredDuringSchedulingIgnoredDuringExecution - soft (best-effort) scheduling rules
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
            - key: another-node-label-key
              # operator - Represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
              operator: In
              values:
              - another-node-label-value
      # podAffinity - inter-Pod affinity
      # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#inter-pod-affinity-and-anti-affinity
      podAffinity:
      # podAntiAffinity - inter-Pod anti-affinity
      # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#inter-pod-affinity-and-anti-affinity
      podAntiAffinity:
    # topologySpreadConstraints - Spread Constraints for Pods
    # ref: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/#spread-constraints-for-pods
    topologySpreadConstraints:
    - maxSkew: <integer>
      topologyKey: <string>
      whenUnsatisfiable: <string>
      labelSelector: <object>
    # tolerations - taint tolerations
    # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/#concepts
    tolerations:
    - key: "example-key"
      operator: "Exists"
      effect: "NoSchedule"
  - name: group-b
    targets:
    # ...
# status - requires the operator controller
status:
  observedGeneration:
  pendingResources:
  - <ns>/statefulset/zzz
  upToDateResources:
  - <ns>/statefulset/xxx
  - <ns>/deployment/yyy
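The spec above leaves the topologySpreadConstraints fields as placeholders. As a minimal sketch, assuming the schema shown above, the following object would ask the scheduler to spread the Pods of matching StatefulSets evenly across zones. The object name, the tier and app labels, and the choice of zone as topology key are illustrative assumptions, not values from the platform.

```yaml
apiVersion: opcc.cnx.aliyun-inc.com/v1alpha1
kind: WorkloadSpread
metadata:
  name: middleware-spread              # hypothetical name
spec:
  spreadGroups:
  - name: middleware
    targets:
    - targetRef:
      - apiGroup: apps
        apiVersion: v1
        kind: StatefulSet
      labelSelector:
        matchLabels:
          tier: middleware             # hypothetical workload label
      namespaceSelector:
        matchNames:
        - default
    topologySpreadConstraints:
    - maxSkew: 1                       # at most 1 Pod difference between zones
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule # Kubernetes also accepts ScheduleAnyway
      labelSelector:
        matchLabels:
          app: example-middleware      # hypothetical Pod label used to count Pods
```

DoNotSchedule makes the constraint mandatory; ScheduleAnyway would turn it into a preference.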
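Usage Notes
Pod spec rewrite behavior
For Deployment and ReplicaSet workloads, the Pod spec is guaranteed to be patched with the WorkloadSpread spec. For StatefulSet workloads, if a Pod has already been scheduled, the WorkloadSpread spec makes no change to the Pod spec. In other words, the configuration only takes effect at first scheduling: a WorkloadSpread that is configured after the StatefulSet's Pods have been scheduled has no effect on them.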
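To make the patch behavior concrete, the fragment below sketches what a Pod created by a matching Deployment might look like once the group-a policy from the spec definition has been applied. It assumes the Pod template itself defined no scheduling fields; how the patch merges with fields that already exist is not described here, so that case is not shown. The Pod name and image are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app-5d9c7b6f4-abcde    # hypothetical Pod name
  namespace: default
spec:
  # Fields below are assumed to come from the WorkloadSpread group, not from the chart.
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - zone-a
  tolerations:
  - key: "example-key"
    operator: "Exists"
    effect: "NoSchedule"
  # Fields below come from the workload's own Pod template.
  containers:
  - name: app
    image: example.com/app:1.0         # hypothetical image
```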
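Conflicting or duplicate target selectors across multiple WorkloadSpread CRs or .spec.spreadGroups[]
Multiple WorkloadSpread CRs may select the same target with conflicting conditions. In that case, the CR with the newer .metadata.creationTimestamp is used. Entries in .spec.spreadGroups[] may also select the same target with conflicting conditions. In that case, the first matching group is used (first find). It is recommended to configure only one WorkloadSpread, which makes the target selection criteria easier to review.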
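The first-find rule within a single object can be illustrated with a hypothetical fragment in which two groups overlap. Group names and the tier label are assumptions made for the example.

```yaml
spec:
  spreadGroups:
  - name: middleware-tolerations       # listed first
    targets:
    - targetRef:
      - apiGroup: apps
        apiVersion: v1
        kind: Deployment
      labelSelector:
        matchLabels:
          tier: middleware             # hypothetical label
    tolerations:
    - key: "example-key"
      operator: "Exists"
      effect: "NoSchedule"
  - name: all-tiered-deployments       # listed second
    targets:
    - targetRef:
      - apiGroup: apps
        apiVersion: v1
        kind: Deployment
      labelSelector:
        matchExpressions:
        - key: tier
          operator: Exists
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - zone-a
# A Deployment labeled tier=middleware matches both groups.
# Under first find, only the policy of middleware-tolerations applies to it.
```

Ordering groups from most specific to most general keeps the outcome predictable under this rule.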
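FAQ
Q: How do I make sure that the WorkloadSpread API object is configured before StatefulSet workloads are created in the cluster?
A: During ADP application orchestration, configure the spreadGroups array in the Helm values of the base platform's OPCC component. The array content is rendered in full under .spec.spreadGroups of the system default WorkloadSpread object.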
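A sketch of such a Helm values fragment follows. Only the spreadGroups key is named in this document; its placement at the top level of the OPCC component's values is an assumption, as are the label values, so check the component's chart for the actual path.

```yaml
# Helm values for the OPCC component (key placement is assumed, see above)
spreadGroups:
- name: group-a
  targets:
  - targetRef:
    - apiGroup: apps
      apiVersion: v1
      kind: StatefulSet
    labelSelector:
      matchLabels:
        some-res-label: some-res-label-value
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - zone-a
```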
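Q: The WorkloadSpread API object was created after the StatefulSet workloads were created in the cluster. How do I migrate the Pods?
A: Pods can be migrated only if the bound PVC can move between nodes. If the default Yoda-LVM/OpenLocal-LVM Storage Class is used, you have to assess whether the content of the PV file system can be discarded. If it can, delete the corresponding PVC and Pod.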