This document describes how to estimate and plan the resource capacity for a Fluss stream storage cluster based on your business requirements.
Resource assessment
Select a CU specification
Fluss on the public cloud uses standardized compute units (CUs) with a fixed vCPU-to-memory ratio of 1 vCPU to 4 GB. Choose a specification based on your business scale and the resource size of the Flink deployment it is paired with.
| CU specification | Configuration | Scenarios |
| --- | --- | --- |
| 4 CU | 4 vCPU / 16 GB | Getting Started/Development: Small-scale testing or development environments. Low-Throughput Production: Production workloads with low data traffic. Paired with Flink: Resource size < 200 CU. |
| 8 CU | 8 vCPU / 32 GB | General Purpose (Recommended): A standard specification that balances performance and flexibility. Medium-Scale Production: Capable of handling mainstream business traffic. Paired with Flink: Resource size from 200 CU to 1,000 CU. |
| 16 CU | 16 vCPU / 64 GB | High Performance/Large Storage: For use cases requiring extremely high throughput or a larger per-node storage limit. Massive-Scale Production: For mission-critical business pipelines. Paired with Flink: Resource size > 1,000 CU. |
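The Flink pairing guidance in the table can be expressed as a small lookup. The sketch below is illustrative only: the function name `recommend_cu_spec` is hypothetical, and it looks only at the Flink resource size. Throughput or per-node storage needs may justify a different choice.

```python
def recommend_cu_spec(flink_cu: int) -> int:
    """Return the per-node CU specification suggested for a Flink resource size.

    Thresholds are taken from the "Paired with Flink" entries in the table:
    < 200 CU -> 4 CU, 200 to 1,000 CU -> 8 CU, > 1,000 CU -> 16 CU.
    """
    if flink_cu < 0:
        raise ValueError("flink_cu must be non-negative")
    if flink_cu < 200:
        return 4
    if flink_cu <= 1000:
        return 8
    return 16


print(recommend_cu_spec(600))
```

For a Flink deployment of 600 CU this returns `8`, which corresponds to the General Purpose row.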
Calculate Tablet Server count
Cluster size depends on your total throughput requirements. Estimate the total required CUs from your throughput and the per-CU benchmarks below, then divide the total by your chosen CU specification to determine the number of nodes.
Key metrics
- Throughput unit: We recommend using Rows/s (rows per second) or MB/s (megabytes per second) as a consistent unit for your calculations.
- Performance benchmark per CU:
  - Write capacity: Approximately 50,000 Rows/s (~46 MB/s). Actual capacity is affected by data complexity and primary key update logic.
  - Read capacity: Approximately 50,000 Rows/s (~46 MB/s). Actual capacity is affected by column pruning and query filter conditions.
- Redundancy buffer (Recommended): Reserve a 20% to 30% resource buffer to keep the cluster stable during traffic peaks.
Calculation example:
If your calculation requires 64 CU and you choose the 8 CU specification, you need 64 / 8 = 8 Tablet Server nodes.
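A minimal sketch of the estimate, under a stated assumption: write and read load are each divided by the per-CU benchmark (about 50,000 Rows/s), the two results are summed, and the redundancy buffer is applied on top. This is one plausible reading of the key metrics above, not a confirmed formula. The function names are hypothetical, and real capacity depends on your data and queries.

```python
import math

# Approximate per-CU benchmarks quoted in the key metrics.
WRITE_ROWS_PER_CU = 50_000
READ_ROWS_PER_CU = 50_000


def estimate_total_cus(write_rows_per_s: float,
                       read_rows_per_s: float,
                       buffer: float = 0.2) -> float:
    """Estimate total CUs, assuming write and read load are additive.

    `buffer` is the redundancy buffer as a fraction (0.2 to 0.3 recommended).
    """
    if not 0 <= buffer <= 1:
        raise ValueError("buffer must be a fraction between 0 and 1")
    base = (write_rows_per_s / WRITE_ROWS_PER_CU
            + read_rows_per_s / READ_ROWS_PER_CU)
    return base * (1 + buffer)


def tablet_server_count(total_cus: float, cu_per_node: int) -> int:
    """Round the required CUs up to a whole number of nodes."""
    if cu_per_node <= 0:
        raise ValueError("cu_per_node must be positive")
    # Round first so floating-point noise does not add a spurious node.
    return math.ceil(round(total_cus / cu_per_node, 9))


cus = estimate_total_cus(250_000, 250_000, buffer=0.2)
print(round(cus, 2), tablet_server_count(cus, 4))
print(tablet_server_count(64, 8))
```

Under this reading, 250,000 Rows/s of writes plus 250,000 Rows/s of reads with a 20% buffer comes to 12 CU, or 3 nodes at the 4 CU specification. The 64 CU example above yields 8 nodes at the 8 CU specification.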
Local storage planning
Local storage usage is closely related to the table type (log table or primary key table).
- Configuration recommendation: We recommend using the default configuration initially.
- Scaling rules: You can independently scale up disk capacity as your business grows, or scale out storage capacity by adding nodes.
- Important limitation: Disks can be scaled up, but not down. Plan carefully to avoid over-provisioning.
Sample configurations
| Scenario | Write throughput | Read throughput | Table type | Columns | Total CUs | Node configuration |
| --- | --- | --- | --- | --- | --- | --- |
| Low-Throughput Stream Processing | 250,000 Rows/s | 250,000 Rows/s | log table | 20 | 12 CU | 3 × 4 CU |
| Medium-Throughput Real-Time Analytics | 500,000 Rows/s | 700,000 Rows/s | primary key table | 50 | 32 CU | 4 × 8 CU |
| High-Throughput Real-Time Data Warehouse | 2,200,000 Rows/s | 2,500,000 Rows/s | primary key table | 100 | 128 CU | 8 × 16 CU |
| Massive-Scale Stream Processing | 5,000,000 Rows/s | 5,000,000 Rows/s | log table | 30 | 256 CU | 16 × 16 CU |
| Dimension table query service | 200,000 Rows/s | 300,000 Rows/s | primary key table | 30 | 12 CU | 3 × 4 CU |
FAQ
Q: What is the difference between a cluster with ten 8 CU nodes and one with five 16 CU nodes? Both configurations total 80 CU.
A: While the total compute performance is nearly identical, the main differences are storage limits and operational flexibility:
- Storage limits: Each node has a maximum local disk capacity, for example, 2 TB. A cluster with fewer, larger nodes (16 CU) has less total disk capacity than one with more, smaller nodes (8 CU). If your business generates a large volume of data, more nodes typically mean more total storage space.
- Scaling granularity: The 8 CU specification offers finer granularity. You can adjust resources in smaller increments, which provides more flexibility and improves cost control.
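The storage difference can be made concrete with the 2 TB per-node figure from the answer above. That figure is only an example value, and the sketch assumes both specifications share the same per-node cap, which may not hold in practice. The helper name is hypothetical.

```python
def total_local_storage_tb(node_count: int, max_disk_tb_per_node: float) -> float:
    """Upper bound on cluster local storage if every node uses its maximum disk size."""
    return node_count * max_disk_tb_per_node


# Both layouts total 80 CU; 2 TB per node is the example cap from the text.
small_nodes = total_local_storage_tb(10, 2.0)  # ten 8 CU nodes
large_nodes = total_local_storage_tb(5, 2.0)   # five 16 CU nodes
print(small_nodes, large_nodes)
```

Under that assumption, the ten-node layout can hold up to 20 TB of local data, compared with 10 TB for the five-node layout. Adding or removing one node also changes capacity by 8 CU rather than 16 CU.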