Data distribution - Realtime Compute for Apache Flink - Alibaba Cloud Help Center

Bucketing and partitioning

Data bucketing — Distributes data into a fixed number of buckets based on a policy. This method improves read/write load balancing and join efficiency.
- Hash bucketing — Distributes data based on the hash of a bucketing key for load balancing. This method is suitable for point queries and joins.
- Sticky bucketing — Dynamically adjusts data distribution to mitigate hot spots.
- Round-robin (polling) bucketing — Evenly distributes write operations across buckets in rotation when no bucketing key is present. This method is suitable for log scenarios.
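The three bucketing policies above can be illustrated with a minimal Python sketch. This is not the product's implementation: the bucket count, the class and function names, the CRC32 hash, and the batch size are all hypothetical. In particular, the sticky assigner shows only one plausible reading of "dynamically adjusts data distribution" (stay on one bucket for a small batch, then move to the least-loaded bucket), and "polling" is read here as round-robin rotation.

```python
import zlib

NUM_BUCKETS = 4  # hypothetical fixed bucket count


def hash_bucket(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Hash bucketing: the same key always lands in the same bucket."""
    # CRC32 is used only because it is stable across runs; it is an
    # illustrative choice, not the hash the service uses.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


class RoundRobinAssigner:
    """Polling (round-robin) bucketing: rotate through buckets, no key needed."""

    def __init__(self, num_buckets: int = NUM_BUCKETS):
        self._num_buckets = num_buckets
        self._next = 0

    def assign(self) -> int:
        bucket = self._next
        self._next = (self._next + 1) % self._num_buckets
        return bucket


class StickyAssigner:
    """Sticky bucketing (assumed behavior): write a batch to one bucket,
    then switch to whichever bucket currently holds the fewest records."""

    def __init__(self, num_buckets: int = NUM_BUCKETS, batch_size: int = 3):
        self._load = [0] * num_buckets
        self._batch_size = batch_size
        self._current = 0
        self._in_batch = 0

    def assign(self) -> int:
        if self._in_batch == self._batch_size:
            # Batch is full: pick the least-loaded bucket to avoid hot spots.
            self._current = min(range(len(self._load)), key=self._load.__getitem__)
            self._in_batch = 0
        self._load[self._current] += 1
        self._in_batch += 1
        return self._current
```

Hash bucketing keeps all rows for a key together, which is what makes point queries and joins cheap; the other two policies trade that locality for even write distribution.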
Data partitioning — Organizes data into independent segments based on key values. This method accelerates queries and lifecycle management.
- Manual partitioning — Users explicitly create or delete partitions.
- Automatic partitioning — Automatically creates partitions and deletes expired partitions based on rules.
- Dynamic partitioning — Automatically creates partitions based on the data during write operations.
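The difference between the three partitioning modes can be sketched with a small in-memory model. Everything here is an assumption made for illustration: the `PartitionedTable` class, the daily `YYYY-MM-DD` partition names, the `dt` partition column, and the `precreate_days` and `retention_days` parameters are hypothetical and do not correspond to the service's actual API or configuration options.

```python
from datetime import date, timedelta


class PartitionedTable:
    """Toy table that maps partition names to lists of records."""

    def __init__(self):
        self.partitions = {}

    # Manual partitioning: the user creates and deletes partitions explicitly.
    def create_partition(self, name: str) -> None:
        self.partitions.setdefault(name, [])

    def drop_partition(self, name: str) -> None:
        self.partitions.pop(name, None)

    # Automatic partitioning: a rule pre-creates upcoming partitions and
    # deletes those older than the retention window.
    def maintain(self, today: date, precreate_days: int, retention_days: int) -> None:
        for offset in range(precreate_days + 1):
            self.create_partition((today + timedelta(days=offset)).isoformat())
        cutoff = (today - timedelta(days=retention_days)).isoformat()
        # ISO dates sort lexicographically, so string comparison is enough.
        for name in [p for p in self.partitions if p < cutoff]:
            self.drop_partition(name)

    # Dynamic partitioning: the partition is derived from the record itself
    # and created on first write.
    def write(self, record: dict, partition_key: str = "dt") -> None:
        self.partitions.setdefault(record[partition_key], []).append(record)
```

For example, calling `maintain(date(2024, 1, 10), precreate_days=1, retention_days=7)` on a table that holds a manually created `2024-01-01` partition would drop it and leave `2024-01-10` and `2024-01-11`, while a later `write({"dt": "2024-02-01"})` would create the `2024-02-01` partition on demand.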