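The Gateway with Inference Extension component supports global rate limiting for a cluster, which helps keep the system stable under high concurrency or abnormal traffic. This topic describes how to configure global rate limiting with the Gateway with Inference Extension component and which rate limiting scenarios are supported.
Feature description
Rate limiting is a mechanism that restricts the number of requests sent to a server. It specifies the maximum number of requests that a client can send to the server within a given period, usually expressed as a number of requests per unit of time, for example 300 requests per minute or 10 requests per second. After global rate limiting is enabled, the Gateway with Inference Extension component automatically deploys a global rate limit service. This service centrally manages and dynamically serves the global rate limiting policies and real-time traffic data. Gateway with Inference Extension interacts with the global rate limit service through its built-in rate limit filter (such as the Rate Limit Filter), obtains the preset thresholds in real time (for example, requests per second or concurrent connections), and rate limits incoming requests based on these policies.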
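The notion of "at most N requests per time period" can be illustrated with a minimal fixed-window counter. The sketch below is only a conceptual model that keeps its state in shell variables for a single client. It is not the implementation of the global rate limit service, and the names allow_request, LIMIT, and WINDOW are hypothetical.

```shell
#!/bin/sh
# Conceptual fixed-window limiter: at most LIMIT requests per WINDOW seconds.
# Assumption: one client, one process, state held in shell variables.
LIMIT=3        # maximum requests per window (illustrative value)
WINDOW=3600    # window length in seconds (illustrative value)
window_id=-1   # index of the window the counter currently belongs to
count=0        # requests counted in the current window
status=""      # result of the last call: 200 (allowed) or 429 (rejected)

# allow_request NOW_SECONDS
# Sets $status to 200 if the request fits in the current window, else 429.
allow_request() {
  now=$1
  current=$((now / WINDOW))
  if [ "$current" -ne "$window_id" ]; then
    # A new window has started, so the counter is reset.
    window_id=$current
    count=0
  fi
  if [ "$count" -lt "$LIMIT" ]; then
    count=$((count + 1))
    status=200
  else
    status=429
  fi
}

# Example (not executed here): four requests inside the same window.
# for t in 10 11 12 13; do allow_request "$t"; echo "$status"; done
```

With the values above, the first three calls in one window set status to 200 and the fourth sets it to 429. A call in the next window is allowed again.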
Prerequisites
Gateway with Inference Extension is installed, and its version is 1.4.0 or later.
The steps in Preparations are completed.
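Procedure
Step 1: Enable global rate limiting
The rate limit service that is deployed automatically for global rate limiting depends on a Redis service as its global store. This topic uses a self-managed Redis service. You can also use Tair (Redis OSS-compatible) to quickly create a Redis instance and then update the related settings in the ack-gateway-config ConfigMap in the envoy-gateway-system namespace. For details about the configuration, see Envoy Gateway.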
Create redis-service.yaml.
kind: Namespace
apiVersion: v1
metadata:
  name: redis-system
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
  namespace: redis-system
  labels:
    app: redis
spec:
  serviceName: "redis"
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - image: registry-cn-hangzhou.ack.aliyuncs.com/dev/redis:6.0.6-for-ack-gateway
        name: redis
        ports:
        - containerPort: 6379
        resources:
          limits:
            cpu: 1500m
            memory: 512Mi
          requests:
            cpu: 200m
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: redis-system
  labels:
    app: redis
spec:
  ports:
  - name: redis
    port: 6379
    protocol: TCP
    targetPort: 6379
  selector:
    app: redis
Create enable-global-rate-limit.yaml.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-gateway-config
  namespace: envoy-gateway-system
data:
  ack-gateway.yaml: |
    apiVersion: gateway.envoyproxy.io/v1alpha1
    kind: EnvoyGateway
    rateLimit:
      backend:
        type: Redis
        redis:
          url: redis.redis-system.svc.cluster.local:6379
Deploy the Redis service and enable global rate limiting.
kubectl apply -f redis-service.yaml
kubectl apply -f enable-global-rate-limit.yaml
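Before moving on, it can help to confirm that the Redis StatefulSet is ready and that the ConfigMap holds the expected Redis URL. The following is a minimal sketch. The function name verify_rate_limit_backend and the 120-second timeout are assumptions made for illustration, while the resource names come from the two files above.

```shell
#!/bin/sh
# verify_rate_limit_backend: wait for Redis, then print the stored gateway config.
# Assumption: kubectl is configured for the cluster that runs the gateway.
verify_rate_limit_backend() {
  # Block until the single-replica StatefulSet from redis-service.yaml is ready.
  kubectl rollout status statefulset/redis -n redis-system --timeout=120s || return 1
  # Print the ack-gateway.yaml entry of the ConfigMap (the dot in the key is escaped).
  kubectl get configmap ack-gateway-config -n envoy-gateway-system \
    -o jsonpath='{.data.ack-gateway\.yaml}'
}

# Example (not executed here):
# verify_rate_limit_backend
```

If the rollout does not complete within the timeout, the function returns a non-zero status and the ConfigMap is not printed.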
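Step 2: Deploy an HTTPRoute resource
Create an HTTPRoute resource for the tests that follow. The rate limit rules in the later steps are applied to this resource.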
Create httproute.yaml.
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-ratelimit
spec:
  parentRefs:
  - name: eg
  hostnames:
  - ratelimit.example
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000
Deploy the HTTPRoute resource.
kubectl apply -f httproute.yaml
Obtain the public IP address of the Gateway.
export GATEWAY_HOST=$(kubectl get gateway/eg -o jsonpath='{.status.addresses[0].value}')
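Step 3: Scenario demonstrations
Rate limit a specific user
Configure a global rate limit rule that allows requests whose x-user-id header is one only 3 requests per hour.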
Create backendtrafficpolicy.yaml.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: policy-httproute
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: http-ratelimit
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          - name: x-user-id
            value: one
        limit:
          requests: 3
          unit: Hour
Deploy the rate limit rule.
kubectl apply -f backendtrafficpolicy.yaml
Test how requests that carry the x-user-id: one header are rate limited.
for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get ; sleep 1; done
Expected output:
HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:49 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 731

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:50 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 1
x-ratelimit-reset: 730

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:52 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 728

HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 727
date: Tue, 27 May 2025 07:47:52 GMT
transfer-encoding: chunked
The first three requests return 200 and the fourth returns 429, which shows that the rate limit rule restricts requests that carry the x-user-id: one header.
Test how requests that carry the x-user-id: two header are rate limited.
for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://$GATEWAY_HOST/get ; sleep 1; done
Expected output:
HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:50:11 GMT
content-length: 504

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:50:12 GMT
content-length: 504

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:50:14 GMT
content-length: 504

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:50:15 GMT
content-length: 504
All four requests return 200, which shows that the rate limit rule does not restrict requests that carry the x-user-id: two header.
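When the test loop is run with more than a handful of requests, reading the raw headers becomes tedious. The helper below is a small sketch that tallies the status codes found in curl -I output. The function name count_statuses is hypothetical and not part of any tool used in this topic.

```shell
#!/bin/sh
# count_statuses: read HTTP response headers on stdin and print one
# "CODE COUNT" pair per line, sorted by status code.
count_statuses() {
  # Strip carriage returns, count every status line, then sort the tally.
  tr -d '\r' |
    awk '/^HTTP\// { n[$2]++ } END { for (c in n) print c, n[c] }' |
    sort
}

# Example (not executed here; -it is dropped because the output is piped):
# for i in 1 2 3 4; do
#   kubectl exec deployment/sleep -- curl -sI --header "Host: ratelimit.example" \
#     --header "x-user-id: one" "http://$GATEWAY_HOST/get"
# done | count_statuses
```

For the x-user-id: one test in this scenario, the tally would be three 200 responses and one 429 response.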
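Rate limit each user separately, except the administrator
Update the global rate limit rule so that requests whose x-user-id header is admin are not rate limited, while requests with any other x-user-id value are limited to 3 requests per hour for each value.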
Edit the rate limit rule.
kubectl edit BackendTrafficPolicy policy-httproute
Update the rate limit rule with the following content.
...
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          - type: Distinct
            name: x-user-id
          - name: x-user-id
            value: admin
            invert: true
        limit:
          requests: 3
          unit: Hour
The rule takes effect immediately after you save and exit.
Test how requests that carry the x-user-id: one header are rate limited.
for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get ; sleep 1; done
Expected output:
HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:49 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 731

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:50 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 1
x-ratelimit-reset: 730

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:52 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 728

HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 727
date: Tue, 27 May 2025 07:47:52 GMT
transfer-encoding: chunked
The first three requests return 200 and the fourth returns 429, which shows that the rate limit rule restricts requests that carry the x-user-id: one header.
Test how requests that carry the x-user-id: two header are rate limited.
for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://$GATEWAY_HOST/get ; sleep 1; done
Expected output:
HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:53:38 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 382

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:53:39 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 1
x-ratelimit-reset: 381

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:53:41 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 379

HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 378
date: Tue, 27 May 2025 07:53:41 GMT
transfer-encoding: chunked
The first three requests return 200 and the fourth returns 429, which shows that the rate limit rule restricts requests that carry the x-user-id: two header.
Test how requests that carry the x-user-id: admin header are rate limited.
for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: admin" http://$GATEWAY_HOST/get ; sleep 1; done
Expected output:
HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:57:44 GMT
content-length: 506

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:57:45 GMT
content-length: 506

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:57:46 GMT
content-length: 506

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:57:47 GMT
content-length: 506
All four requests return 200, which shows that the rate limit rule does not restrict requests that carry the x-user-id: admin header.
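The results of this scenario are consistent with the two header conditions being evaluated together: the Distinct matcher gives every x-user-id value its own counter, and the inverted matcher excludes the value admin. The sketch below models that decision for illustration only. The function name bucket_for_user is hypothetical, and treating a missing header as unlimited is an assumption that this topic does not test.

```shell
#!/bin/sh
# bucket_for_user USER_ID
# Prints the counter a request would be charged to, or "unlimited" when the
# rule does not apply. Conceptual model, not the gateway's implementation.
bucket_for_user() {
  user_id=$1
  if [ -z "$user_id" ]; then
    # Assumption: without an x-user-id header the Distinct matcher has nothing to match.
    echo unlimited
  elif [ "$user_id" = admin ]; then
    # Inverted exact match: the rule applies only when the value is NOT admin.
    echo unlimited
  else
    # Each distinct header value gets its own 3-requests-per-hour counter.
    echo "x-user-id=$user_id"
  fi
}

# Example (not executed here):
# bucket_for_user one
# bucket_for_user admin
```

Because one and two map to different counters, exhausting the quota of one user does not affect the other, which matches the separate countdowns in the expected outputs above.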
Rate limit all requests
Update the global rate limit rule so that all requests together are limited to 3 requests per hour.
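Edit the rate limit rule.
kubectl edit BackendTrafficPolicy policy-httproute
Update the rate limit rule with the following content.
...
  rateLimit:
    type: Global
    global:
      rules:
      - limit:
          requests: 3
          unit: Hour
The rule takes effect immediately after you save and exit.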
Test how ordinary requests are rate limited.
for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" http://$GATEWAY_HOST/get ; sleep 1; done
Expected output:
HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:53 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 3427

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:55 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 1
x-ratelimit-reset: 3425

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:56 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 3424

HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 3423
date: Tue, 27 May 2025 08:02:57 GMT
transfer-encoding: chunked
The first three requests return 200 and the fourth returns 429, which shows that the rate limit rule has taken effect.
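The x-ratelimit-* headers in the expected output carry the quota state of each response. The helper below is a small sketch that extracts them from the headers of a single response. The function name ratelimit_summary is hypothetical, and reading the first number of x-ratelimit-limit as the active limit and x-ratelimit-reset as seconds until the quota resets is an interpretation of the output shown above, not a statement from this topic.

```shell
#!/bin/sh
# ratelimit_summary: read the headers of one response on stdin and print
# "limit=<n> remaining=<n> reset=<seconds>".
ratelimit_summary() {
  tr -d '\r' | awk -F': *' '
    # "3, 3;w=3600" -> keep the part before the first comma as the limit.
    tolower($1) == "x-ratelimit-limit"     { split($2, p, ","); limit = p[1] }
    tolower($1) == "x-ratelimit-remaining" { remaining = $2 }
    tolower($1) == "x-ratelimit-reset"     { reset = $2 }
    END { printf "limit=%s remaining=%s reset=%s\n", limit, remaining, reset }
  '
}

# Example (not executed here):
# kubectl exec deployment/sleep -- curl -sI --header "Host: ratelimit.example" \
#   "http://$GATEWAY_HOST/get" | ratelimit_summary
```

In the output above, the first response is dated 08:02:53 GMT and reports x-ratelimit-reset: 3427, that is 57 minutes and 7 seconds, which suggests that the hourly quota resets at 09:00:00 GMT.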
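Rate limit by client IP address
Update the global rate limit rule so that each IP address in a CIDR block is limited separately to 3 requests per hour.
For ease of demonstration, this scenario uses the CIDR block 0.0.0.0/0. Adjust it to match your environment.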
Edit the rate limit rule.
kubectl edit BackendTrafficPolicy policy-httproute
Update the rate limit rule with the following content.
...
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - sourceCIDR:
            value: 0.0.0.0/0
            type: Distinct
        limit:
          requests: 3
          unit: Hour
The rule takes effect immediately after you save and exit.
Test how ordinary requests are rate limited.
for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" http://$GATEWAY_HOST/get ; sleep 1; done
Expected output:
HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:53 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 3427

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:55 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 1
x-ratelimit-reset: 3425

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:56 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 3424

HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 3423
date: Tue, 27 May 2025 08:02:57 GMT
transfer-encoding: chunked
The first three requests return 200 and the fourth returns 429, which shows that the rate limit for 0.0.0.0/0 has taken effect.
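When 0.0.0.0/0 is replaced with a narrower CIDR block, it is useful to check in advance whether a given client address falls inside it. The following sketch performs that check for IPv4 with plain shell arithmetic. The function names ip_to_int and ip_in_cidr are hypothetical helpers, and the sketch does not validate its input.

```shell
#!/bin/sh
# ip_to_int IPV4: print the address as a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  # Split the dotted address into the four positional parameters.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# ip_in_cidr IPV4 CIDR: exit status 0 when the address is inside the block.
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")   # network part before the slash
  prefix=${2#*/}               # prefix length after the slash
  # Build the netmask; a prefix of 0 yields a mask of 0, which matches every address.
  mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Example (not executed here):
# ip_in_cidr 10.1.2.3 10.0.0.0/8 && echo "inside"
```

With 0.0.0.0/0 the mask is 0, so every IPv4 address matches, which is why the demonstration rule applies to any client.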
(Optional) Step 4: Clean up test resources
Delete the rate limit rule.
kubectl delete BackendTrafficPolicy policy-httproute
Delete the other resources.
kubectl delete -f httproute.yaml
kubectl delete -f redis-service.yaml
kubectl delete -f enable-global-rate-limit.yaml