This topic lists per-node capacity thresholds and queries-per-second (QPS) benchmarks for each specification of standard cloud-native gateway instances, so that you can select the right specification for your workload. Serverless instances of the cloud-native gateway do not require a detailed capacity assessment. The system scales elastically and bills based on your business traffic, and the scaling process includes a protection threshold to prevent uncontrolled costs.
Select a specification
Match two dimensions to your workload: capacity thresholds (connections, bandwidth, resource utilization) and QPS throughput (request processing rate).
1. Check capacity thresholds to find the specification that handles your connection count, bandwidth, and resource usage.
2. Check QPS benchmarks to verify throughput for your traffic pattern (connection type, HTTPS, GZIP, response size).
3. If one dimension fits but the other does not, choose the next higher specification.
For a worked example, see Sizing example.
Capacity thresholds
The following tables list per-node capacity thresholds for each gateway specification across three tiers:
| Tier | Meaning | Action |
|---|---|---|
| Security threshold | The gateway handles current traffic and can absorb a 2x traffic spike without degradation. | No action required. Normal operation. |
| Warning threshold | Latency may increase. Traffic surges risk dropped connections and timeouts. | Monitor closely. Consider adding nodes or upgrading the specification. |
| Overload threshold | The gateway rejects new connections to protect itself from failure. | Add nodes or upgrade the specification to reduce load. |
Deploy at least two nodes per gateway. A single-node deployment may not meet your service-level agreement (SLA) objectives. All thresholds below apply to each individual node.
Client connections
| Threshold | 2 cores, 4 GiB | 4 cores, 8 GiB | 8 cores, 16 GiB | 16 cores, 32 GiB |
|---|---|---|---|---|
| Security | 12,000 | 24,000 | 48,000 | 96,000 |
| Warning | 24,000 | 48,000 | 96,000 | 192,000 |
| Overload | 40,000 | 80,000 | 160,000 | 320,000 |
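
The table above can be turned into a small lookup for monitoring scripts. The following is a minimal sketch, not an official tool: the short specification names (`2c4g` and so on), the function name `connection_tier`, and the tier labels are illustrative, and treating each threshold as an inclusive upper bound is an assumption that the source does not state.

```python
# Per-node client connection thresholds, copied from the table above.
CLIENT_CONNECTION_THRESHOLDS = {
    "2c4g": {"security": 12_000, "warning": 24_000, "overload": 40_000},
    "4c8g": {"security": 24_000, "warning": 48_000, "overload": 80_000},
    "8c16g": {"security": 48_000, "warning": 96_000, "overload": 160_000},
    "16c32g": {"security": 96_000, "warning": 192_000, "overload": 320_000},
}


def connection_tier(spec: str, connections_per_node: int) -> str:
    """Classify a per-node connection count against the three thresholds.

    Assumption: a value equal to the security or warning threshold still
    counts as being within that threshold; reaching the overload threshold
    counts as overload.
    """
    limits = CLIENT_CONNECTION_THRESHOLDS[spec]
    if connections_per_node >= limits["overload"]:
        return "overload"
    if connections_per_node > limits["warning"]:
        return "above warning"
    if connections_per_node > limits["security"]:
        return "above security"
    return "within security"


if __name__ == "__main__":
    for spec in CLIENT_CONNECTION_THRESHOLDS:
        print(spec, connection_tier(spec, 30_000))
```

Because every threshold applies to an individual node, pass the connection count of one node rather than the total for the gateway.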
New HTTPS connections per second
| Threshold | 2 cores, 4 GiB | 4 cores, 8 GiB | 8 cores, 16 GiB | 16 cores, 32 GiB |
|---|---|---|---|---|
| Security | 400 | 800 | 1,600 | 3,200 |
| Warning | 800 | 1,600 | 3,200 | 6,400 |
| Overload | - | - | - | - |
Network bandwidth (Gbit/s)
| Threshold | 2 cores, 4 GiB | 4 cores, 8 GiB | 8 cores, 16 GiB | 16 cores, 32 GiB |
|---|---|---|---|---|
| Security | 1 | 2 | 4 | 8 |
| Warning | 1 | 2 | 4 | 8 |
| Overload | - | - | - | - |
CPU utilization
| Threshold | All specifications |
|---|---|
| Security | 30% |
| Warning | 60% |
| Overload | 90% |
Memory usage
| Threshold | All specifications |
|---|---|
| Security | 75% |
| Warning | 75% |
| Overload | 90% |
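
CPU and memory thresholds are the same for every specification, so a node can be assessed by taking the worse of the two readings. The sketch below illustrates that idea under stated assumptions: the function names and tier labels are hypothetical, and "worst metric wins" is one reasonable way to combine the readings, not a rule given in this topic.

```python
# Utilization thresholds in percent, copied from the two tables above.
UTILIZATION_THRESHOLDS = {
    "cpu": {"security": 30, "warning": 60, "overload": 90},
    "memory": {"security": 75, "warning": 75, "overload": 90},
}

# Tiers ordered from healthiest to most severe.
TIER_ORDER = ["within security", "above security", "above warning", "overload"]


def utilization_tier(metric: str, percent: float) -> str:
    """Classify one utilization reading (assumed inclusive upper bounds)."""
    limits = UTILIZATION_THRESHOLDS[metric]
    if percent >= limits["overload"]:
        return "overload"
    if percent > limits["warning"]:
        return "above warning"
    if percent > limits["security"]:
        return "above security"
    return "within security"


def node_tier(cpu_percent: float, memory_percent: float) -> str:
    """Return the more severe of the CPU and memory tiers for one node."""
    tiers = [
        utilization_tier("cpu", cpu_percent),
        utilization_tier("memory", memory_percent),
    ]
    return max(tiers, key=TIER_ORDER.index)


if __name__ == "__main__":
    print(node_tier(cpu_percent=45, memory_percent=50))
```

Because the memory security and warning thresholds are both 75%, a memory reading never lands in the intermediate "above security" tier in this sketch.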
QPS benchmarks
QPS throughput varies based on four factors:
| Factor | Impact |
|---|---|
| Connection type | Persistent connections deliver higher QPS than short-lived connections by skipping repeated TCP and TLS handshakes. |
| HTTPS | TLS handshakes on new connections are CPU-intensive, reducing QPS significantly for short-lived connections. For persistent connections, the handshake occurs only once, so the impact is smaller. |
| GZIP compression | Compressing responses adds CPU overhead, which reduces QPS. |
| Response size | Larger responses consume more bandwidth and processing time per request. |
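The following values are conservative, worst-case estimates measured with CPU utilization below 30%. Each value is the total QPS of a 3-node or 5-node cluster. Column headers abbreviate the specification: for example, 2c4g means 2 cores, 4 GiB.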
For workloads with a high rate of simultaneous new HTTPS connections, use the short-lived connection rows to estimate capacity.
Short-lived connections, 1 KB response
| HTTPS | GZIP | 2c4g (3 nodes) | 2c4g (5 nodes) | 4c8g (3 nodes) | 4c8g (5 nodes) | 8c16g (3 nodes) | 8c16g (5 nodes) | 16c32g (3 nodes) | 16c32g (5 nodes) |
|---|---|---|---|---|---|---|---|---|---|
| No | No | 5,200 | 8,700 | 10,500 | 17,500 | 21,000 | 35,000 | 42,000 | 70,000 |
| Yes | No | 1,600 | 2,700 | 3,200 | 5,500 | 6,500 | 11,000 | 13,000 | 22,000 |
Persistent connections, 1 KB response
| HTTPS | GZIP | 2c4g (3 nodes) | 2c4g (5 nodes) | 4c8g (3 nodes) | 4c8g (5 nodes) | 8c16g (3 nodes) | 8c16g (5 nodes) | 16c32g (3 nodes) | 16c32g (5 nodes) |
|---|---|---|---|---|---|---|---|---|---|
| No | No | 6,500 | 10,800 | 13,000 | 21,700 | 26,000 | 43,500 | 52,000 | 87,000 |
| Yes | No | 6,000 | 10,000 | 12,000 | 20,000 | 24,000 | 40,000 | 48,000 | 80,000 |
| Yes | Yes | 5,200 | 8,700 | 10,500 | 17,500 | 21,000 | 35,000 | 42,000 | 70,000 |
Persistent connections, 10 KB response
| HTTPS | GZIP | 2c4g (3 nodes) | 2c4g (5 nodes) | 4c8g (3 nodes) | 4c8g (5 nodes) | 8c16g (3 nodes) | 8c16g (5 nodes) | 16c32g (3 nodes) | 16c32g (5 nodes) |
|---|---|---|---|---|---|---|---|---|---|
| No | No | 5,600 | 9,300 | 11,200 | 18,700 | 22,500 | 37,500 | 45,000 | 75,000 |
| Yes | No | 5,300 | 9,000 | 10,700 | 18,000 | 21,500 | 36,000 | 43,000 | 72,000 |
| Yes | Yes | 3,100 | 5,200 | 6,200 | 10,500 | 12,500 | 21,000 | 25,000 | 42,000 |
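
For scripted capacity checks, the three benchmark tables can be encoded as a single lookup. This is a minimal sketch: the key layout `(connection type, response size in KB, HTTPS, GZIP)` and the function name `benchmark_qps` are illustrative choices, and only the combinations listed in the tables above are available.

```python
SPECS = ["2c4g", "4c8g", "8c16g", "16c32g"]
NODE_COUNTS = [3, 5]

# Each row follows the table column order: for every specification,
# the 3-node value and then the 5-node value.
QPS_ROWS = {
    ("short", 1, False, False): [5200, 8700, 10500, 17500, 21000, 35000, 42000, 70000],
    ("short", 1, True, False): [1600, 2700, 3200, 5500, 6500, 11000, 13000, 22000],
    ("persistent", 1, False, False): [6500, 10800, 13000, 21700, 26000, 43500, 52000, 87000],
    ("persistent", 1, True, False): [6000, 10000, 12000, 20000, 24000, 40000, 48000, 80000],
    ("persistent", 1, True, True): [5200, 8700, 10500, 17500, 21000, 35000, 42000, 70000],
    ("persistent", 10, False, False): [5600, 9300, 11200, 18700, 22500, 37500, 45000, 75000],
    ("persistent", 10, True, False): [5300, 9000, 10700, 18000, 21500, 36000, 43000, 72000],
    ("persistent", 10, True, True): [3100, 5200, 6200, 10500, 12500, 21000, 25000, 42000],
}


def benchmark_qps(connection: str, response_kb: int, https: bool,
                  gzip: bool, spec: str, nodes: int) -> int:
    """Return the benchmarked cluster QPS for one table cell.

    Raises KeyError or ValueError for combinations the tables do not list.
    """
    row = QPS_ROWS[(connection, response_kb, https, gzip)]
    column = SPECS.index(spec) * len(NODE_COUNTS) + NODE_COUNTS.index(nodes)
    return row[column]


if __name__ == "__main__":
    plain = benchmark_qps("short", 1, False, False, "2c4g", 3)
    tls = benchmark_qps("short", 1, True, False, "2c4g", 3)
    print(f"Short-lived, 2c4g, 3 nodes: {plain} QPS without HTTPS, {tls} with HTTPS")
```

Comparing rows this way makes the cost of each factor visible. For example, on a 3-node 2c4g cluster with short-lived connections, enabling HTTPS lowers the benchmark from 5,200 to 1,600 QPS.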
Sizing example
Suppose your workload has the following characteristics:
- 30,000 concurrent client connections
- Persistent HTTPS connections with GZIP compression
- Average response size: 1 KB
- Target QPS: 15,000
Step 1: Check capacity thresholds.
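Thresholds apply per node, and this example compares the 30,000 connections directly against them. A 4-core, 8 GiB node supports 24,000 client connections at the security threshold. At 30,000 connections, you exceed the security threshold but stay below the warning threshold (48,000), so the node no longer has headroom for a 2x traffic spike. For production workloads, choose the 8-core, 16 GiB specification instead, which keeps 30,000 connections well within its security threshold (48,000).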
Step 2: Check QPS benchmarks.
For persistent HTTPS connections with GZIP and 1 KB responses, the 8-core, 16 GiB specification delivers:
- 3 nodes: 21,000 QPS
- 5 nodes: 35,000 QPS
A 3-node cluster delivers 21,000 QPS, which exceeds the 15,000 QPS target.
Result: Deploy an 8-core, 16 GiB gateway with 3 nodes.
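
The two steps above can be sketched as a small selection routine. This is an illustration under stated assumptions, not an official sizing tool: it covers only the persistent HTTPS, GZIP, 1 KB traffic pattern from this example, it follows the example in comparing the connection count directly against the per-node security threshold, and the name `pick_specification` is hypothetical.

```python
# Per-node security thresholds for client connections (from the capacity tables).
SECURITY_CONNECTIONS = {
    "2c4g": 12_000,
    "4c8g": 24_000,
    "8c16g": 48_000,
    "16c32g": 96_000,
}

# Cluster QPS for persistent HTTPS connections with GZIP and 1 KB responses,
# keyed by specification and then node count (from the QPS benchmark table).
QPS_PERSISTENT_HTTPS_GZIP_1KB = {
    "2c4g": {3: 5_200, 5: 8_700},
    "4c8g": {3: 10_500, 5: 17_500},
    "8c16g": {3: 21_000, 5: 35_000},
    "16c32g": {3: 42_000, 5: 70_000},
}


def pick_specification(connections: int, target_qps: int):
    """Return the smallest (spec, nodes) pair that satisfies both dimensions.

    Step 1: skip specifications whose security threshold is below the
    connection count. Step 2: pick the smallest benchmarked node count
    whose QPS meets the target. Returns None if nothing fits.
    """
    for spec in ["2c4g", "4c8g", "8c16g", "16c32g"]:  # smallest first
        if connections > SECURITY_CONNECTIONS[spec]:
            continue
        for nodes in (3, 5):
            if QPS_PERSISTENT_HTTPS_GZIP_1KB[spec][nodes] >= target_qps:
                return spec, nodes
    return None


if __name__ == "__main__":
    print(pick_specification(connections=30_000, target_qps=15_000))
```

For the workload in this example, the routine arrives at the same answer as the manual steps: the 8-core, 16 GiB specification with 3 nodes.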