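Data cache affinity scheduling optimization

With the data cache affinity scheduling optimization provided by Fluid, you can configure affinity on application Pods so that they preferentially access cached data on the same node, in the same zone, or in the same region. This improves the efficiency of data access from application Pods.

Usage limits

Prerequisites

Feature overview

Fluid uses a Mutating Webhook to inject affinity information about the required data cache into application Pods. When the Kubernetes scheduler schedules an application Pod, it first prefers nodes that hold the cached data, and then other nodes in the same zone or region as the cached data. Tiered affinity scheduling between application Pods and data caches is therefore applied automatically.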
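The tiered injection described above can be pictured with a small model. The following Python sketch is only an illustration under stated assumptions, not Fluid's implementation: the function name `build_preferred_terms` and the `cache_topology` argument are hypothetical, the tier list mirrors the default `preferred` entries shown later in this document, and the per-dataset node label `fluid.io/s-<namespace>-<dataset>` is taken from the expected Pod output in the examples.

```python
from typing import Dict, List

# Tier list mirroring the default "preferred" entries of NodeAffinityWithCache.
DEFAULT_TIERS = [
    {"name": "fluid.io/node", "weight": 100},
    {"name": "topology.kubernetes.io/zone", "weight": 50},
    {"name": "topology.kubernetes.io/region", "weight": 20},
]


def build_preferred_terms(namespace: str, dataset: str,
                          cache_topology: Dict[str, List[str]],
                          tiers: List[dict] = DEFAULT_TIERS) -> List[dict]:
    """Build preferredDuringSchedulingIgnoredDuringExecution terms (illustrative)."""
    terms = []
    for tier in tiers:
        if tier["name"] == "fluid.io/node":
            # Nodes that hold the cache carry a per-dataset label.
            key, values = f"fluid.io/s-{namespace}-{dataset}", ["true"]
        elif tier["name"] in cache_topology:
            # Zone or region declared for the dataset's cache.
            key, values = tier["name"], cache_topology[tier["name"]]
        else:
            # Assumption: a tier with no declared topology yields no term.
            continue
        terms.append({
            "weight": tier["weight"],
            "preference": {"matchExpressions": [
                {"key": key, "operator": "In", "values": values},
            ]},
        })
    return terms
```

Calling `build_preferred_terms("default", "demo-dataset", {"topology.kubernetes.io/zone": ["cn-beijing-i"]})` yields a node-level term with weight 100 and a zone-level term with weight 50, which has the same shape as the expected output of Example 2.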

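Important

If an application Pod defines its own affinity for the tiered topology information in spec.affinity or spec.nodeSelector, the Pod's own configuration takes precedence and Fluid does not inject the related affinity scheduling configuration.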
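To make the precedence rule concrete, the following Python sketch checks whether a Pod spec already constrains a topology key in `spec.nodeSelector` or `spec.affinity.nodeAffinity`. It is a hedged illustration only: the function name and the default key set (zone and region) are assumptions, and the exact keys and matching logic that Fluid uses to decide whether to skip injection may differ.

```python
# Assumed set of topology keys; adjust to the keys configured in your cluster.
TOPOLOGY_KEYS = ("topology.kubernetes.io/zone", "topology.kubernetes.io/region")


def declares_topology_affinity(pod_spec: dict, topology_keys=TOPOLOGY_KEYS) -> bool:
    """Return True if the Pod spec already constrains any of the topology keys."""
    keys = set(topology_keys)

    # spec.nodeSelector is a flat map of label key to value.
    if keys & set(pod_spec.get("nodeSelector") or {}):
        return True

    node_affinity = (pod_spec.get("affinity") or {}).get("nodeAffinity") or {}
    required = node_affinity.get("requiredDuringSchedulingIgnoredDuringExecution") or {}
    selector_terms = list(required.get("nodeSelectorTerms") or [])
    # Preferred terms wrap their selector in a "preference" field.
    for preferred in node_affinity.get("preferredDuringSchedulingIgnoredDuringExecution") or []:
        selector_terms.append(preferred.get("preference") or {})

    for term in selector_terms:
        for expr in term.get("matchExpressions") or []:
            if expr.get("key") in keys:
                return True
    return False
```

Under this model, a Pod with `nodeSelector: {topology.kubernetes.io/zone: cn-beijing-i}` returns `True`, so its own zone constraint would be kept and no zone affinity would be injected.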

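Scheduling policy configuration

Default configuration

By default, Fluid provides scheduling policy identifiers for three tiers: node, zone, and region. Run the following command to view the scheduling policy configuration of the current cluster.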

kubectl get cm -n fluid-system webhook-plugins -oyaml

Expected output:

apiVersion: v1
data:
  pluginsProfile: |
    pluginConfig:
    - args: |
        preferred:
          # fluid existed node affinity, the name can not be modified.
          - name: fluid.io/node
            weight: 100
          # runtime worker's zone label name, can be changed according to k8s environment.
          - name: topology.kubernetes.io/zone
            weight: 50
          # runtime worker's region label name, can be changed according to k8s environment.
          - name: topology.kubernetes.io/region
            weight: 20
        # used when app pod with label fluid.io/dataset.{dataset name}.sched set true
        required:
          - fluid.io/node
      name: NodeAffinityWithCache
    plugins:
      serverful:
        withDataset:
        - RequireNodeWithFuse
        - NodeAffinityWithCache
        - MountPropagationInjector
        withoutDataset:
        - PreferNodesWithoutCache
      serverless:
        withDataset:
        - FuseSidecar
        withoutDataset: []

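The parameters in the pluginsProfile field of the ConfigMap above are described below:

| Parameter | Description |
| --- | --- |
| fluid.io/node | An identifier reserved by Fluid. When it is enabled, Fluid automatically injects into application Pods a preferred affinity for the nodes that hold the cached data, with a weight of 100. |
| topology.kubernetes.io/zone | The common Kubernetes label that identifies the zone of a node. When it is enabled, Fluid automatically injects into application Pods a preferred affinity for the zone that holds the cached data, with a weight of 50. |
| topology.kubernetes.io/region | The common Kubernetes label that identifies the region of a node. When it is enabled, Fluid automatically injects into application Pods a preferred affinity for the region that holds the cached data, with a weight of 20. |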
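A short worked example shows how these weights rank candidate nodes. For preferred node affinity, the Kubernetes scheduler adds the weight of every preferred term that a node matches, and combines the sum with its other scoring functions. The Python sketch below only reproduces that sum for the default weights; the node names are hypothetical, and it assumes that zone and region terms were injected (which requires the Dataset to declare them, as in Example 2).

```python
# Default weights from the NodeAffinityWithCache plugin configuration.
DEFAULT_WEIGHTS = {"node": 100, "zone": 50, "region": 20}


def affinity_score(matched_tiers, weights=DEFAULT_WEIGHTS):
    """Sum the weights of all preferred terms that a node matches."""
    return sum(weights[tier] for tier in matched_tiers)


# Hypothetical candidate nodes and the tiers each one matches.
candidates = {
    "node-with-cache": ["node", "zone", "region"],
    "same-zone-node": ["zone", "region"],
    "same-region-node": ["region"],
    "other-region-node": [],
}

scores = {name: affinity_score(tiers) for name, tiers in candidates.items()}
# scores == {"node-with-cache": 170, "same-zone-node": 70,
#            "same-region-node": 20, "other-region-node": 0}
```

Because the sums are strictly ordered (170 > 70 > 20 > 0), the affinity contribution always favors the closest tier, although the final placement also depends on the scheduler's other scoring plugins.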

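Custom configuration

If the nodes in your Kubernetes cluster carry other labels that describe node topology, and you want Fluid to inject affinity information based on those labels, perform the following steps.

  1. Run the following command to edit the ConfigMap.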

    kubectl edit -n fluid-system cm webhook-plugins
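  2. Modify the ConfigMap as follows, replacing <topology_key> with the node label key and <topology_weight> with its weight.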

    apiVersion: v1
    data:
      pluginsProfile: |
        pluginConfig:
        - args: |
            preferred:
              # fluid existed node affinity, the name can not be modified.
              - name: fluid.io/node
                weight: 100
              # runtime worker's zone label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/zone
                weight: 50
              # runtime worker's region label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/region
                weight: 20
              - name: <topology_key>
                weight: <topology_weight>
            # used when app pod with label fluid.io/dataset.{dataset name}.sched set true
            required:
              - fluid.io/node
          name: NodeAffinityWithCache
        plugins:
          serverful:
            withDataset:
            - RequireNodeWithFuse
            - NodeAffinityWithCache
            - MountPropagationInjector
            withoutDataset:
            - PreferNodesWithoutCache
          serverless:
            withDataset:
            - FuseSidecar
            withoutDataset: []

    Reference configuration 1: ignore node-level affinity

    apiVersion: v1
    data:
      pluginsProfile: |
        pluginConfig:
        - args: |
            preferred:
              # fluid existed node affinity, the name can not be modified.
              #- name: fluid.io/node
              #  weight: 100
              # runtime worker's zone label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/zone
                weight: 50
              # runtime worker's region label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/region
                weight: 20
            # used when app pod with label fluid.io/dataset.{dataset name}.sched set true
            required:
              - fluid.io/node
          name: NodeAffinityWithCache
        plugins:
          serverful:
            withDataset:
            - RequireNodeWithFuse
            - NodeAffinityWithCache
            - MountPropagationInjector
            withoutDataset:
            - PreferNodesWithoutCache
          serverless:
            withDataset:
            - FuseSidecar
            withoutDataset: []

    Reference configuration 2: add node-pool-level affinity

    apiVersion: v1
    data:
      pluginsProfile: |
        pluginConfig:
        - args: |
            preferred:
              # fluid existed node affinity, the name can not be modified.
              - name: fluid.io/node
                weight: 100
              - name: alibabacloud.com/nodepool-id
                weight: 80
              # runtime worker's zone label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/zone
                weight: 50
              # runtime worker's region label name, can be changed according to k8s environment.
              - name: topology.kubernetes.io/region
                weight: 20
            # used when app pod with label fluid.io/dataset.{dataset name}.sched set true
            required:
              - fluid.io/node
          name: NodeAffinityWithCache
        plugins:
          serverful:
            withDataset:
            - RequireNodeWithFuse
            - NodeAffinityWithCache
            - MountPropagationInjector
            withoutDataset:
            - PreferNodesWithoutCache
          serverless:
            withDataset:
            - FuseSidecar
            withoutDataset: []
  3. Run the following command to restart the Fluid Webhook component so that the configuration takes effect.

    kubectl rollout restart deployment -n fluid-system fluid-webhook
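Before applying a custom `preferred` list such as the ones above, it can help to sanity-check the entries. The Python sketch below is a hypothetical helper, not part of Fluid: it flags duplicate topology keys and weights outside 1 to 100, which is the range Kubernetes accepts for the weight of a preferred scheduling term. Whether Fluid performs any validation of its own is not covered by this document.

```python
def validate_preferred(entries):
    """Return a list of problems found in a 'preferred' list; empty if none."""
    problems = []
    seen = set()
    for entry in entries:
        name, weight = entry.get("name"), entry.get("weight")
        if not name:
            problems.append("entry without a name")
            continue
        if name in seen:
            problems.append(f"duplicate topology key: {name}")
        seen.add(name)
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(weight, int) and not isinstance(weight, bool)
        if not is_int or not 1 <= weight <= 100:
            problems.append(f"{name}: weight must be an integer in 1..100, got {weight!r}")
    return problems


# The list from reference configuration 2 passes the check.
nodepool_config = [
    {"name": "fluid.io/node", "weight": 100},
    {"name": "alibabacloud.com/nodepool-id", "weight": 80},
    {"name": "topology.kubernetes.io/zone", "weight": 50},
    {"name": "topology.kubernetes.io/region", "weight": 20},
]
assert validate_preferred(nodepool_config) == []
```

Keeping the weights strictly decreasing from the narrowest tier to the widest one, as in the default configuration, preserves the intended tier order.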

Examples

Example 1: node-level affinity scheduling for the data cache

  1. Create a Secret.

    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
    stringData:
      fs.oss.accessKeyId: <ACCESS_KEY_ID>
      fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
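  2. Create the Dataset and Runtime resources.

    Important

    This section uses JindoRuntime as an example. To use another Runtime as the cache runtime, see the topic on using EFC to accelerate access to NAS or CPFS files. To learn how JindoFS accelerates access to OSS files, see the topic on using JindoFS to accelerate access to OSS files.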

    apiVersion: data.fluid.io/v1alpha1
    kind: Dataset
    metadata:
      name: demo-dataset
    spec:
      mounts:
        - mountPoint: oss://<oss_bucket>/<bucket_dir>
          options:
            fs.oss.endpoint: <oss_endpoint>
          name: hadoop
          path: "/"
          encryptOptions:
            - name: fs.oss.accessKeyId
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeyId
            - name: fs.oss.accessKeySecret
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeySecret
    ---
    apiVersion: data.fluid.io/v1alpha1
    kind: JindoRuntime
    metadata:
      name: demo-dataset
    spec:
      replicas: 2
      tieredstore:
        levels:
          - mediumtype: MEM
            path: /dev/shm
            quota: 10G
            high: "0.99"
            low: "0.8"
  3. Create the application Pod.

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        fuse.serverful.fluid.io/inject: "true"
    spec:
      containers:
        - name: nginx
          image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
          volumeMounts:
            - mountPath: /data
              name: data-vol
      volumes:
        - name: data-vol
          persistentVolumeClaim:
            claimName: demo-dataset

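    The key parameters related to this feature are described below:

    | Parameter | Description |
    | --- | --- |
    | fuse.serverful.fluid.io/inject: "true" | Marks the Pod as one into which Fluid should automatically inject the data cache affinity configuration. |
    | claimName | The PVC that the Pod mounts. Fluid creates this PVC automatically, with the same name as the Dataset. |

  4. View the affinity scheduling information injected into the application Pod.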

    kubectl get pod nginx -oyaml

    Expected output:

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        fuse.serverful.fluid.io/inject: "true"
      name: nginx
      namespace: default
      ...
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: fluid.io/s-default-demo-dataset
                operator: In
                values:
                - "true"
            weight: 100
    

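    The application Pod spec now contains an automatically injected affinity for the nodes that hold the data cache (fluid.io/s-default-demo-dataset). Its weight is determined by the weights assigned to the node topology fields in the scheduling policy configuration.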
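When the Pod manifest is long, the injected terms can be easier to read as a compact list. The Python sketch below is an optional helper under stated assumptions: the function name is hypothetical, and it expects the Pod as a JSON string, which can be obtained with `kubectl get pod nginx -o json`.

```python
import json


def summarize_node_affinity(pod_json: str):
    """Return (weight, key, values) tuples for the preferred node affinity terms."""
    pod = json.loads(pod_json)
    node_affinity = pod.get("spec", {}).get("affinity", {}).get("nodeAffinity", {})
    summary = []
    for term in node_affinity.get("preferredDuringSchedulingIgnoredDuringExecution", []):
        for expr in term["preference"].get("matchExpressions", []):
            summary.append((term["weight"], expr["key"], tuple(expr.get("values", []))))
    # Highest weight first, so the closest tier is listed first.
    return sorted(summary, reverse=True)


# Minimal sample shaped like the expected output above.
sample = json.dumps({"spec": {"affinity": {"nodeAffinity": {
    "preferredDuringSchedulingIgnoredDuringExecution": [{
        "weight": 100,
        "preference": {"matchExpressions": [{
            "key": "fluid.io/s-default-demo-dataset",
            "operator": "In",
            "values": ["true"],
        }]},
    }],
}}}})
summary = summarize_node_affinity(sample)
```

For the sample, `summary` holds a single entry: weight 100 for the key `fluid.io/s-default-demo-dataset` with the value `"true"`.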

Example 2: same-zone affinity scheduling for the data cache

  1. Create a Secret.

    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
    stringData:
      fs.oss.accessKeyId: <ACCESS_KEY_ID>
      fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
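  2. Create the Dataset and Runtime resources.

    Important

    This section uses JindoRuntime as an example. To use another Runtime as the cache runtime, see the topic on using EFC to accelerate access to NAS or CPFS files. To learn how JindoFS accelerates access to OSS files, see the topic on using JindoFS to accelerate access to OSS files.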

    apiVersion: data.fluid.io/v1alpha1
    kind: Dataset
    metadata:
      name: demo-dataset
    spec:
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: topology.kubernetes.io/zone
                  operator: In
                  values:
                  - "<ZONE_ID>" # e.g. cn-beijing-i
      mounts:
        - mountPoint: oss://<oss_bucket>/<bucket_dir>
          options:
            fs.oss.endpoint: <oss_endpoint>
          name: hadoop
          path: "/"
          encryptOptions:
            - name: fs.oss.accessKeyId
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeyId
            - name: fs.oss.accessKeySecret
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeySecret
    ---
    apiVersion: data.fluid.io/v1alpha1
    kind: JindoRuntime
    metadata:
      name: demo-dataset
    spec:
      replicas: 2
      master:
        nodeSelector:
          topology.kubernetes.io/zone: <ZONE_ID> # e.g. cn-beijing-i
      tieredstore:
        levels:
          - mediumtype: MEM
            path: /dev/shm
            quota: 10G
            high: "0.99"
            low: "0.8"

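    To enable same-zone affinity scheduling, you must explicitly declare in the Dataset the zone in which its data cache resides. In this example, nodeAffinity.required.nodeSelectorTerms sets the zone label topology.kubernetes.io/zone of the dataset to <ZONE_ID> (for example, cn-beijing-i).

  3. Create the application Pod.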

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        fuse.serverful.fluid.io/inject: "true"
    spec:
      containers:
        - name: nginx
          image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
          volumeMounts:
            - mountPath: /data
              name: data-vol
      volumes:
        - name: data-vol
          persistentVolumeClaim:
            claimName: demo-dataset

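    The key parameters related to this feature are described below:

    | Parameter | Description |
    | --- | --- |
    | fuse.serverful.fluid.io/inject: "true" | Marks the Pod as one into which Fluid should automatically inject the data cache affinity configuration. |
    | claimName | The PVC that the Pod mounts. Fluid creates this PVC automatically, with the same name as the Dataset. |

  4. View the affinity scheduling information injected into the application Pod.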

    kubectl get pod nginx -oyaml

    Expected output:

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        fuse.serverful.fluid.io/inject: "true"
      name: nginx
      namespace: default
      ...
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: fluid.io/s-default-demo-dataset
                operator: In
                values:
                - "true"
            weight: 100
          - preference:
              matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - <ZONE_ID> # e.g. cn-beijing-i
            weight: 50
    ...

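    The application Pod spec now contains automatically injected affinities for the nodes that hold the data cache (fluid.io/s-default-demo-dataset) and for the zone (topology.kubernetes.io/zone). Their weights are determined by the weights assigned to the node topology fields in the scheduling policy configuration.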
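The zone term in this example exists only because the Dataset declares its zone explicitly. The Python sketch below illustrates that dependency by collecting the topology values from a Dataset spec's `nodeAffinity.required.nodeSelectorTerms`. It is a simplified model: the function name is hypothetical, only the `In` operator is handled, and Fluid's own extraction logic may differ.

```python
def dataset_topology(dataset_spec: dict,
                     topology_keys=("topology.kubernetes.io/zone",
                                    "topology.kubernetes.io/region")) -> dict:
    """Collect the topology label values declared in a Dataset spec."""
    found = {}
    required = (dataset_spec.get("nodeAffinity") or {}).get("required") or {}
    for term in required.get("nodeSelectorTerms") or []:
        for expr in term.get("matchExpressions") or []:
            if expr.get("operator") == "In" and expr.get("key") in topology_keys:
                found.setdefault(expr["key"], []).extend(expr.get("values") or [])
    return found


# Dataset spec fragment shaped like the one in step 2 of this example.
spec_with_zone = {"nodeAffinity": {"required": {"nodeSelectorTerms": [
    {"matchExpressions": [{
        "key": "topology.kubernetes.io/zone",
        "operator": "In",
        "values": ["cn-beijing-i"],
    }]},
]}}}
zones = dataset_topology(spec_with_zone)
```

Here `zones` maps `topology.kubernetes.io/zone` to `["cn-beijing-i"]`. For the Dataset in Example 1, which declares no nodeAffinity, the result is an empty dictionary, which is consistent with only the node-level term appearing in that example's output.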

Example 3: required node affinity scheduling for the data cache

  1. Create a Secret.

    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
    stringData:
      fs.oss.accessKeyId: <ACCESS_KEY_ID>
      fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
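  2. Create the Dataset and Runtime resources.

    Important

    This section uses JindoRuntime as an example. To use another Runtime as the cache runtime, see the topic on using EFC to accelerate access to NAS or CPFS files. To learn how JindoFS accelerates access to OSS files, see the topic on using JindoFS to accelerate access to OSS files.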

    apiVersion: data.fluid.io/v1alpha1
    kind: Dataset
    metadata:
      name: demo-dataset
    spec:
      mounts:
        - mountPoint: oss://<oss_bucket>/<bucket_dir>
          options:
            fs.oss.endpoint: <oss_endpoint>
          name: hadoop
          path: "/"
          encryptOptions:
            - name: fs.oss.accessKeyId
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeyId
            - name: fs.oss.accessKeySecret
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeySecret
    ---
    apiVersion: data.fluid.io/v1alpha1
    kind: JindoRuntime
    metadata:
      name: demo-dataset
    spec:
      replicas: 2
      tieredstore:
        levels:
          - mediumtype: MEM
            path: /dev/shm
            quota: 10G
            high: "0.99"
            low: "0.8"
  3. Create the application Pod.

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        fuse.serverful.fluid.io/inject: "true"
        fluid.io/dataset.demo-dataset.sched: required
    spec:
      containers:
        - name: nginx
          image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
          volumeMounts:
            - mountPath: /data
              name: data-vol
      volumes:
        - name: data-vol
          persistentVolumeClaim:
            claimName: demo-dataset

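    The key parameters related to this feature are described below:

    | Parameter | Description |
    | --- | --- |
    | fuse.serverful.fluid.io/inject: "true" | Marks the Pod as one into which Fluid should automatically inject the data cache affinity configuration. |
    | fluid.io/dataset.<dataset_name>.sched: required | Marks the Pod as one into which Fluid should inject a required affinity for the data cache named <dataset_name>. |
    | claimName | The PVC that the Pod mounts. Fluid creates this PVC automatically, with the same name as the Dataset. |

  4. View the affinity scheduling information injected into the application Pod.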

    kubectl get pod nginx -oyaml

    Expected output:

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        fluid.io/dataset.demo-dataset.sched: required
        fuse.serverful.fluid.io/inject: "true"
      name: nginx
      namespace: default
      ...
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: fluid.io/s-default-demo-dataset
                operator: In
                values:
                - "true"

    The application Pod spec now contains an automatically injected required affinity for the nodes that hold the data cache (fluid.io/s-default-demo-dataset).
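Unlike a preferred term, a required term filters nodes out: a Pod can only be scheduled onto nodes that satisfy it. The Python sketch below models that check for the `In` operator only, following the Kubernetes rule that nodeSelectorTerms are ORed while the matchExpressions inside one term are ANDed. The helper names and node labels are hypothetical.

```python
def sched_label(dataset_name: str) -> str:
    """Build the Pod label key that requests required affinity for a dataset."""
    return f"fluid.io/dataset.{dataset_name}.sched"


def node_satisfies_required(node_labels: dict, node_selector_terms: list) -> bool:
    """Check node labels against required nodeSelectorTerms (In operator only)."""
    for term in node_selector_terms:
        exprs = term.get("matchExpressions", [])
        # An empty term matches no node; all expressions in a term must match.
        if exprs and all(
            expr["operator"] == "In" and node_labels.get(expr["key"]) in expr["values"]
            for expr in exprs
        ):
            return True
    return False


# Required term shaped like the expected output above.
required_terms = [{"matchExpressions": [{
    "key": "fluid.io/s-default-demo-dataset",
    "operator": "In",
    "values": ["true"],
}]}]

cache_node = {"fluid.io/s-default-demo-dataset": "true"}
plain_node = {"kubernetes.io/hostname": "node-b"}
```

With these inputs, `node_satisfies_required(cache_node, required_terms)` is `True` and `node_satisfies_required(plain_node, required_terms)` is `False`. If no schedulable node holds the cache, a Pod with this required affinity stays Pending.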