Use EnvoyProxy to customize the replica count and resource usage of a Gateway

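Gateway with Inference Extension is built on Envoy Gateway. You can change the parameters of the running Gateway, such as the Service type, the number of Deployment replicas, and the container resources, by adjusting an EnvoyProxy resource. This topic describes how to configure the replica count and resource usage for Gateways at different scopes.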

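Configuration overview

When you use the Gateway with Inference Extension component to manage generative AI inference services, you need to create GatewayClass and Gateway resources.

The runtime parameters of the actual Gateway (such as the replica count and resource usage) are defined by associating an EnvoyProxy resource. There are two ways to associate it: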

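Fine-grained configuration (for a single Gateway): Associate the EnvoyProxy resource directly with a specific Gateway resource to configure that Gateway independently.
Unified configuration (for an entire GatewayClass): Associate the EnvoyProxy resource with a GatewayClass. Every Gateway in that GatewayClass that has no independent configuration then inherits this shared set of parameters.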

Important

If both configurations exist, the parameters configured independently on the Gateway take precedence.

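Configure a specific Gateway

To configure the replica count and resource usage of a Gateway, reference an EnvoyProxy resource in the infrastructure field of the Gateway resource. Example: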

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: eg
spec:
  gatewayClassName: eg
  infrastructure:
    parametersRef:
      group: gateway.envoyproxy.io
      kind: EnvoyProxy
      name: custom-proxy-config
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: custom-proxy-config
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: 2
        container:
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 1
              memory: 2Gi

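In the preceding example, the parametersRef field under the infrastructure field of the Gateway resource references the EnvoyProxy resource named custom-proxy-config, which sets the replica count and resource usage.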

Important

A Gateway resource can reference only an EnvoyProxy resource in the same namespace.
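The same-namespace rule can be made explicit by setting metadata.namespace on both resources. The following is a minimal sketch, not a definitive manifest: the namespace name inference is a hypothetical value chosen for illustration, and the remaining fields are taken from the example above.

```yaml
# Sketch only: "inference" is a hypothetical namespace.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: eg
  namespace: inference
spec:
  gatewayClassName: eg
  infrastructure:
    parametersRef:            # local reference, so it has no namespace field
      group: gateway.envoyproxy.io
      kind: EnvoyProxy
      name: custom-proxy-config
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: custom-proxy-config
  namespace: inference        # must be the same namespace as the Gateway
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: 2
```

Because the reference under infrastructure carries no namespace, the EnvoyProxy is looked up in the namespace of the Gateway itself.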

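Configure a GatewayClass

You can also reference an EnvoyProxy resource from a GatewayClass to configure the replica count and resource usage for all Gateways that belong to that GatewayClass. Example: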

apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: custom-proxy-config
    namespace: default
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: custom-proxy-config
  namespace: default
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: 2
        container:
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 1
              memory: 2Gi

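In the preceding example, the parametersRef field of the GatewayClass resource references the EnvoyProxy resource named custom-proxy-config, which sets the replica count and resource usage. Unlike the reference on a Gateway, this reference specifies the namespace of the EnvoyProxy resource.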
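To illustrate how the two scopes interact, the sketch below defines two Gateways in the GatewayClass eg from the example above. The names shared-gateway, dedicated-gateway, and dedicated-proxy-config are hypothetical and used only for illustration. Based on the precedence rule described in this topic, the first Gateway should pick up the class-level custom-proxy-config, while the second should use its own EnvoyProxy.

```yaml
# Sketch only: all resource names below except "eg" are hypothetical.
# No infrastructure.parametersRef, so this Gateway inherits the
# EnvoyProxy referenced by GatewayClass "eg".
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway
  namespace: default
spec:
  gatewayClassName: eg
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
# Has its own parametersRef, which takes precedence over the
# class-level configuration.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: dedicated-gateway
  namespace: default
spec:
  gatewayClassName: eg
  infrastructure:
    parametersRef:
      group: gateway.envoyproxy.io
      kind: EnvoyProxy
      name: dedicated-proxy-config
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: dedicated-proxy-config
  namespace: default          # same namespace as dedicated-gateway
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: 3
```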

Complete EnvoyProxy configuration and common fields

The following is a complete EnvoyProxy configuration example:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: custom-proxy-config
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: 2  # Not needed if envoyHpa is also configured
        strategy:
          rollingUpdate:
            maxSurge: 2
            maxUnavailable: 1
        pod:
          affinity: ...
          tolerations: ...
          nodeSelector: ...
        container:
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 1
              memory: 2Gi
      envoyService:
        annotations:
          key: value
        labels:
          key: value
        type: LoadBalancer
        loadBalancerClass: ...
        externalTrafficPolicy: Cluster # or Local
      envoyHpa:
        minReplicas: 1
        maxReplicas: 10
        metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 80
        - type: Resource
          resource:
            name: memory
            target:
              type: Utilization
              averageUtilization: 80
      envoyPDB:
        minAvailable: 1

Important

If you configure both envoyDeployment and envoyHpa, you do not need to set replicas under envoyDeployment.
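As a sketch of that combination, the following EnvoyProxy leaves replicas unset and lets the HPA control the replica count. The resource name autoscaled-proxy-config and the specific thresholds are assumptions for illustration; the field layout follows the complete example above.

```yaml
# Sketch only: name and threshold values are hypothetical.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: autoscaled-proxy-config
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        # replicas is intentionally omitted; the HPA manages it.
        container:
          resources:
            requests:
              cpu: 500m       # Utilization targets are computed against requests
              memory: 1Gi
      envoyHpa:
        minReplicas: 2
        maxReplicas: 10
        metrics:
          - type: Resource
            resource:
              name: cpu
              target:
                type: Utilization
                averageUtilization: 80
```

A Utilization target is a percentage of the container's resource requests, so the CPU request is kept in this sketch even though replicas is dropped.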

Only some common fields are listed here. For the complete EnvoyProxy resource definition, see EnvoyProxy.

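Field	Type	Required	Description
envoyDeployment	KubernetesDeploymentSpec	No	Customizes the workload (Deployment) configuration of the Gateway.
envoyService	KubernetesServiceSpec	No	Customizes the Service configuration of the Gateway.
envoyHpa	KubernetesHorizontalPodAutoscalerSpec	No	Customizes the HPA configuration of the Gateway.
envoyPDB	KubernetesPodDisruptionBudgetSpec	No	Customizes the PodDisruptionBudget configuration of the Gateway.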
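Because every field in the table is optional, an EnvoyProxy can set only the parts you want to change. The following minimal sketch customizes just the Service type and the PodDisruptionBudget. The name internal-proxy-config is hypothetical, and the choice of ClusterIP is an assumption for illustration of a Gateway that should not be exposed through a load balancer.

```yaml
# Sketch only: name and values are hypothetical.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: internal-proxy-config
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyService:
        type: ClusterIP       # expose the Gateway inside the cluster only
      envoyPDB:
        minAvailable: 1       # keep at least one Pod during voluntary disruptions
```

Settings that are left out, such as envoyDeployment and envoyHpa, are expected to fall back to the component defaults.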