Data distribution


Bucketing and partitioning

  • Data bucketing — Distributes data into a fixed number of buckets based on a policy. This method improves read/write load balancing and join efficiency.

    • Hash bucketing — Distributes data based on the hash of a bucketing key for load balancing. This method is suitable for point queries and joins.

    • Sticky bucketing — Dynamically adjusts data distribution to mitigate hot spots.

    • Polling bucketing — Distributes writes evenly across buckets when the table has no bucketing key. This method is suitable for log ingestion workloads.

  • Data partitioning — Organizes data into independent segments based on key values. This method speeds up queries and simplifies data lifecycle management.

    • Manual partitioning — Users explicitly create or delete partitions.

    • Automatic partitioning — Automatically creates partitions and deletes expired partitions based on rules.

    • Dynamic partitioning — Automatically creates partitions based on the data during write operations.
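
The differences between these strategies can be illustrated with a small, self-contained sketch. It is not the engine's implementation: the bucket count, the CRC32 hash, the modeling of polling bucketing as a simple round-robin counter, the `event_date` column, and the `pYYYYMMDD` partition naming are all assumptions chosen for illustration.

```python
import zlib
from collections import defaultdict
from datetime import date
from itertools import count

NUM_BUCKETS = 4  # hypothetical fixed bucket count


def hash_bucket(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Hash bucketing: the same key always maps to the same bucket,
    which is what makes point queries and co-located joins cheap."""
    # CRC32 is an assumed stand-in for whatever hash the engine uses.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


class PollingBucketer:
    """Polling bucketing, modeled here as round-robin (an assumption):
    no bucketing key is needed, and writes rotate through the buckets."""

    def __init__(self, num_buckets: int = NUM_BUCKETS) -> None:
        self._num_buckets = num_buckets
        self._counter = count()

    def next_bucket(self) -> int:
        return next(self._counter) % self._num_buckets


class DynamicPartitionedTable:
    """Dynamic partitioning: a partition is created the first time
    a written row needs it, instead of being declared in advance."""

    def __init__(self) -> None:
        self.partitions = defaultdict(list)

    def write(self, row: dict) -> str:
        # Hypothetical naming scheme: one partition per calendar day.
        name = row["event_date"].strftime("p%Y%m%d")
        self.partitions[name].append(row)
        return name

    def drop_partition(self, name: str) -> None:
        """Lifecycle management: expired data is removed a partition at a time."""
        self.partitions.pop(name, None)


if __name__ == "__main__":
    # The same key lands in the same bucket on every call.
    print(hash_bucket("user_42") == hash_bucket("user_42"))

    # Keyless writes rotate through the buckets.
    bucketer = PollingBucketer(num_buckets=3)
    print([bucketer.next_bucket() for _ in range(6)])

    # Partitions appear as rows for new dates arrive.
    table = DynamicPartitionedTable()
    table.write({"event_date": date(2024, 1, 15), "msg": "a"})
    table.write({"event_date": date(2024, 1, 16), "msg": "b"})
    print(sorted(table.partitions))
```

In this sketch, hash bucketing trades even distribution for key locality, while the round-robin bucketer gives an even spread but no locality, which matches the append-heavy log case described above. Dropping a whole partition rather than deleting individual rows is the reason partitioning simplifies lifecycle management.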