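Implement global rate limiting with Gateway with Inference Extension

The Gateway with Inference Extension component supports global rate limiting for a cluster, which helps keep the system stable under high concurrency or abnormal traffic. This topic describes how to configure global rate limiting with the Gateway with Inference Extension component and which rate limiting scenarios it supports.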

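Feature description

Rate limiting is a mechanism that restricts the number of requests sent to a server. It specifies the maximum number of requests that a client can send to the server in a given period, usually expressed as requests per unit of time, such as 300 requests per minute or 10 requests per second.

After global rate limiting is enabled, the Gateway with Inference Extension component automatically deploys a global rate limit service. This service centrally manages and dynamically serves the global rate limiting policies and real-time traffic data. Gateway with Inference Extension interacts with the global rate limit service through a built-in rate limit filter (such as the Rate Limit Filter), obtains the preset thresholds in real time (for example, requests per second or concurrent connections), and limits the rate of incoming requests based on these policies.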
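The counting model described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: it assumes a fixed-window counter, the function name allow_request is hypothetical, and a temporary directory stands in for the shared store that the global rate limit service keeps in Redis. The actual service may count differently.

```shell
#!/bin/sh
# Illustrative only: a fixed-window request counter kept in files.
# COUNTER_DIR is a stand-in for the shared store (Redis in the real deployment);
# remove the directory when you are done experimenting.
COUNTER_DIR=$(mktemp -d)

# allow_request KEY LIMIT WINDOW_SECONDS NOW_EPOCH
# Returns 0 (allowed) while the counter for the current window is <= LIMIT,
# and 1 (limited, which a gateway would answer with HTTP 429) afterwards.
allow_request() {
  key=$1; limit=$2; window=$3; now=$4
  # All requests that fall into the same window share one counter file.
  window_start=$(( now - now % window ))
  file="$COUNTER_DIR/${key}_${window_start}"
  count=0
  [ -f "$file" ] && count=$(cat "$file")
  count=$(( count + 1 ))
  echo "$count" > "$file"
  [ "$count" -le "$limit" ]
}

# "3 requests per hour" for one client: prints 200 three times, then 429.
for i in 1 2 3 4; do
  if allow_request user-one 3 3600 1000; then echo 200; else echo 429; fi
done
```

Because the counter lives in a shared store rather than in each gateway instance, every instance sees the same count, which is what makes the limit global.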

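Prerequisites

The steps in this topic assume that the cluster already has a Gateway named eg, a Service named backend that listens on port 3000, and a Deployment named sleep that is used as the test client.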

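Procedure

Step 1: Enable global rate limiting

The rate limit service that is deployed automatically for global rate limiting depends on a Redis service as its global store. This topic uses a self-managed Redis service. Alternatively, you can use Tair (Redis OSS-compatible) to quickly create a Redis instance and then update its connection settings in the ack-gateway-config ConfigMap in the envoy-gateway-system namespace. For details about the configuration, see Envoy Gateway.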

  1. Create redis-service.yaml.

    kind: Namespace
    apiVersion: v1
    metadata:
      name: redis-system
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: redis
      namespace: redis-system
      labels:
        app: redis
    spec:
      serviceName: "redis"
      replicas: 1
      selector:
        matchLabels:
          app: redis
      template:
        metadata:
          labels:
            app: redis
        spec:
          containers:
            - image: registry-cn-hangzhou.ack.aliyuncs.com/dev/redis:6.0.6-for-ack-gateway
              name: redis
              ports:
                - containerPort: 6379
              resources:
                limits:
                  cpu: 1500m
                  memory: 512Mi
                requests:
                  cpu: 200m
                  memory: 256Mi
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis
      namespace: redis-system
      labels:
        app: redis
    spec:
      ports:
        - name: redis
          port: 6379
          protocol: TCP
          targetPort: 6379
      selector:
        app: redis
    
  2. Create enable-global-rate-limit.yaml.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-gateway-config
      namespace: envoy-gateway-system
    data:
      ack-gateway.yaml: |
        apiVersion: gateway.envoyproxy.io/v1alpha1
        kind: EnvoyGateway
        rateLimit:
          backend:
            type: Redis
            redis:
              url: redis.redis-system.svc.cluster.local:6379
  3. Deploy the Redis service and enable global rate limiting.

    kubectl apply -f redis-service.yaml
    kubectl apply -f enable-global-rate-limit.yaml
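After the two manifests are applied, it can help to confirm that Redis is ready and that the ConfigMap carries the rateLimit section before moving on. The following is a minimal sketch that assumes the resource names from the manifests above; the helper names wait_for_redis and show_rate_limit_config are hypothetical.

```shell
# Block until the Redis StatefulSet from redis-service.yaml is fully rolled out.
# The optional first argument is the timeout (defaults to 120s).
wait_for_redis() {
  kubectl -n redis-system rollout status statefulset/redis --timeout="${1:-120s}"
}

# Print the EnvoyGateway configuration stored in the ack-gateway-config ConfigMap.
# The dot in the key name ack-gateway.yaml must be escaped in the jsonpath expression.
show_rate_limit_config() {
  kubectl -n envoy-gateway-system get configmap ack-gateway-config \
    -o jsonpath='{.data.ack-gateway\.yaml}'
}
```

For example, run wait_for_redis 60s and then show_rate_limit_config, and check that the printed configuration contains the Redis URL.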

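Step 2: Deploy the HTTPRoute resource

Create an HTTPRoute resource for the tests that follow. The rate limiting rules in the later steps are applied to this resource.

  1. Create httproute.yaml.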

    ---
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: http-ratelimit
    spec:
      parentRefs:
      - name: eg
      hostnames:
      - ratelimit.example 
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
    
  2. Deploy the HTTPRoute resource.

    kubectl apply -f httproute.yaml
  3. Obtain the public IP address of the Gateway.

    export GATEWAY_HOST=$(kubectl get gateway/eg -o jsonpath='{.status.addresses[0].value}')
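If the Gateway has not been assigned an address yet, the jsonpath query above returns an empty string and the later curl commands fail in a confusing way. A small guard makes that case visible. This is a sketch, and the helper name require_gateway_host is hypothetical.

```shell
# Fail early when GATEWAY_HOST was not populated by the kubectl query.
require_gateway_host() {
  if [ -z "${GATEWAY_HOST:-}" ]; then
    echo "GATEWAY_HOST is empty: the Gateway has no address yet" >&2
    return 1
  fi
  echo "Gateway address: $GATEWAY_HOST"
}
```

Run require_gateway_host once before the test loops in the next step.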

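Step 3: Scenario demonstrations

Rate limit a specific user

Configure a global rate limiting rule that allows requests whose x-user-id request header has the value one only 3 times per hour.

  1. Create backendtrafficpolicy.yaml.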

    apiVersion: gateway.envoyproxy.io/v1alpha1
    kind: BackendTrafficPolicy 
    metadata:
      name: policy-httproute
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: HTTPRoute
        name: http-ratelimit
      rateLimit:
        type: Global
        global:
          rules:
          - clientSelectors:
            - headers:
              - name: x-user-id
                value: one
            limit:
              requests: 3
              unit: Hour
  2. Deploy the rate limiting rule.

    kubectl apply -f backendtrafficpolicy.yaml
  3. Test how requests that carry the x-user-id: one request header are rate limited.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:49 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 731

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:50 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 730

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:52 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 728

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 727
    date: Tue, 27 May 2025 07:47:52 GMT
    transfer-encoding: chunked

    The first 3 requests return 200 and the 4th returns 429, which shows that the rule limits requests that carry the x-user-id: one request header.

  4. Test how requests that carry the x-user-id: two request header are rate limited.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:11 GMT
    content-length: 504
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:12 GMT
    content-length: 504
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:14 GMT
    content-length: 504
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:15 GMT
    content-length: 504

    All 4 requests return 200, which shows that the rule does not limit requests that carry the x-user-id: two request header.
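The x-ratelimit-* response headers in the outputs above describe the state of the counter: x-ratelimit-limit carries the limit followed by the quota policy (3 requests in a window of w=3600 seconds), x-ratelimit-remaining is the number of requests left in the current window, and x-ratelimit-reset is the number of seconds until the window resets. The following sketch extracts these values from the output of curl -I. The function name parse_ratelimit_headers is hypothetical.

```shell
# Read HTTP response headers on stdin and print the rate limit state on one line.
parse_ratelimit_headers() {
  tr -d '\r' | awk -F': ' '
    { sub(/[ \t]+$/, "") }                      # drop trailing padding
    tolower($1) == "x-ratelimit-limit" {
      split($2, parts, ", ")                    # "3" and "3;w=3600"
      limit = parts[1]
      sub(/^.*w=/, "", parts[2])                # keep only the window length
      window = parts[2]
    }
    tolower($1) == "x-ratelimit-remaining" { remaining = $2 }
    tolower($1) == "x-ratelimit-reset"     { reset = $2 }
    END {
      printf "limit=%s window=%ss remaining=%s reset_in=%ss\n",
             limit, window, remaining, reset
    }
  '
}
```

Piping the headers of the 429 response shown earlier into this function yields limit=3 window=3600s remaining=0 reset_in=727s.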

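Rate limit every user except the administrator separately

Update the global rate limiting rule so that requests whose x-user-id request header is admin are not limited, and requests with any other value of that header are each limited to 3 per hour.

  1. Edit the rate limiting rule.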

    kubectl edit BackendTrafficPolicy policy-httproute

    Update the rule with the following content.

    ...
      rateLimit:
        type: Global
        global:
          rules:
          - clientSelectors:
            - headers:
              - type: Distinct
                name: x-user-id
              - name: x-user-id
                value: admin
                invert: true
            limit:
              requests: 3
              unit: Hour

    The rule takes effect immediately after you save and exit.

  2. Test how requests that carry the x-user-id: one request header are rate limited.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:49 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 731

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:50 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 730

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:52 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 728

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 727
    date: Tue, 27 May 2025 07:47:52 GMT
    transfer-encoding: chunked

    The first 3 requests return 200 and the 4th returns 429, which shows that the rule limits requests that carry the x-user-id: one request header.

  3. Test how requests that carry the x-user-id: two request header are rate limited.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:53:38 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 382
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:53:39 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 381
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:53:41 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 379
    
    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 378
    date: Tue, 27 May 2025 07:53:41 GMT
    transfer-encoding: chunked

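    The first 3 requests return 200 and the 4th returns 429, which shows that the rule limits requests that carry the x-user-id: two request header.

  4. Test how requests that carry the x-user-id: admin request header are rate limited.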

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: admin" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:44 GMT
    content-length: 506
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:45 GMT
    content-length: 506
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:46 GMT
    content-length: 506
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:47 GMT
    content-length: 506

    All 4 requests return 200, which shows that the rule does not limit requests that carry the x-user-id: admin request header.
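The rule in this scenario combines two header selectors, and both must match for a request to be counted. The sketch below restates that matching logic as a shell function to show why one and two get separate counters while admin gets none. It is a simplified model, not the gateway's implementation: the function name bucket_for_user and the key format are hypothetical, and requests without the x-user-id header are not modeled.

```shell
# bucket_for_user USER_ID
# Prints the counter key a request would be charged against, or nothing when
# the request is exempt from the rule.
bucket_for_user() {
  user_id=$1
  # Selector with value admin and invert: true matches every value except admin.
  if [ "$user_id" = "admin" ]; then
    return 0    # the rule does not match, so no counter is charged
  fi
  # Selector with type Distinct gives each distinct header value its own counter.
  echo "x-user-id=$user_id"
}

bucket_for_user one     # prints x-user-id=one
bucket_for_user two     # prints x-user-id=two
bucket_for_user admin   # prints nothing
```

Since one and two map to different keys, each of them gets its own budget of 3 requests per hour, which matches the test results above.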

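Rate limit all requests

Update the global rate limiting rule so that all requests together are limited to 3 per hour.

  1. Edit the rate limiting rule.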

    kubectl edit BackendTrafficPolicy policy-httproute

    Update the rule with the following content.

    ...
      rateLimit:
        type: Global
        global:
          rules:
          - limit:
              requests: 3
              unit: Hour

    The rule takes effect immediately after you save and exit.

  2. Test how ordinary requests are rate limited.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:53 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 3427

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:55 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 3425

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:56 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3424

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3423
    date: Tue, 27 May 2025 08:02:57 GMT
    transfer-encoding: chunked

    The first 3 requests return 200 and the 4th returns 429, which shows that the rule has taken effect.
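When the test loop sends more than a handful of requests, reading every response by eye becomes tedious. The following sketch tallies the status codes instead. The function name summarize_status is hypothetical, and the function only assumes that its input consists of HTTP response headers such as those printed by curl -I.

```shell
# Read one or more HTTP responses on stdin and print "<status code> <count>"
# for every status code seen, sorted by code.
# Example input: the combined output of the curl -I loop used in this topic.
summarize_status() {
  awk '/^HTTP\// { count[$2]++ }
       END { for (code in count) print code, count[code] }' | sort
}
```

For the output shown above, the summary is 200 3 followed by 429 1. When piping the loop into this function, drop the -it flags from kubectl exec, because no terminal is attached.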

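Rate limit by client IP address

Update the global rate limiting rule so that each IP address in a CIDR block is separately limited to 3 requests per hour.

Note

For ease of demonstration, this scenario limits the CIDR block 0.0.0.0/0. Adjust it to your environment.

  1. Edit the rate limiting rule.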

    kubectl edit BackendTrafficPolicy policy-httproute

    Update the rule with the following content.

    ...
      rateLimit:
        type: Global
        global:
          rules:
          - clientSelectors:
            - sourceCIDR: 
                value: 0.0.0.0/0
                type: Distinct
            limit:
              requests: 3
              unit: Hour

    The rule takes effect immediately after you save and exit.

  2. Test how ordinary requests are rate limited.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:53 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 3427

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:55 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 3425

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:56 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3424

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3423
    date: Tue, 27 May 2025 08:02:57 GMT
    transfer-encoding: chunked

    The first 3 requests return 200 and the 4th returns 429, which shows that rate limiting for 0.0.0.0/0 has taken effect.
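When you replace 0.0.0.0/0 with a narrower CIDR block, it is useful to check in advance which client addresses the block covers. The following sketch does that for IPv4 with plain shell arithmetic. The function names ip_to_int and ip_in_cidr are hypothetical, and the code only illustrates what "an address falls inside a CIDR block" means; it is not how the gateway evaluates sourceCIDR.

```shell
# ip_to_int A.B.C.D
# Prints the IPv4 address as a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_cidr ADDRESS NETWORK/PREFIX
# Returns 0 when ADDRESS lies inside the CIDR block, 1 otherwise.
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Build the netmask for the prefix length, for example /24 -> 0xFFFFFF00.
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

ip_in_cidr 10.1.2.3 0.0.0.0/0 && echo "10.1.2.3 is covered by 0.0.0.0/0"
ip_in_cidr 192.168.2.7 192.168.1.0/24 || echo "192.168.2.7 is outside 192.168.1.0/24"
```

With type: Distinct, every address that falls inside the block is counted separately, so two clients in the same block do not share the budget of 3 requests per hour.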

(Optional) Step 4: Clean up test resources

  1. Delete the rate limiting rule.

    kubectl delete BackendTrafficPolicy policy-httproute
  2. Delete the other resources.

    kubectl delete -f httproute.yaml
    kubectl delete -f redis-service.yaml
    kubectl delete -f enable-global-rate-limit.yaml