Terms
AnalyticDB for PostgreSQL is built on a Massively Parallel Processing (MPP) architecture. The concepts below help you size instances, configure data distribution, and understand query performance.
Massively Parallel Processing (MPP)
A distributed shared-nothing architecture where each node has its own dedicated CPU, memory, and storage. Because nodes do not share resources, they do not contend with each other for them, and the architecture scales horizontally: storage and compute capacity grow in proportion to the number of nodes. This makes it suitable for large-scale analytical workloads.
An AnalyticDB for PostgreSQL instance consists of multiple compute nodes arranged in the MPP architecture.
Instance
A provisioned AnalyticDB for PostgreSQL deployment. An instance consists of multiple compute nodes. Storage capacity and computing resources scale linearly as you add compute nodes.
Compute node
The basic unit of resource allocation in AnalyticDB for PostgreSQL. Each compute node provides a fixed allocation of CPU cores, memory, and storage, and holds one data partition.
Adding compute nodes increases storage capacity. Because each node processes only its own portion of the data in parallel, query response time stays roughly constant when data volume and node count grow together.
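The relationship between node count and response time can be illustrated with a deliberately simple model. This is a sketch only: it assumes rows are spread evenly across nodes, that a query is a full scan, and that every node scans at the same rate. The function name `scan_seconds` and the rate of 50 million rows per second are hypothetical values chosen for illustration, not measured figures for AnalyticDB for PostgreSQL.

```python
def scan_seconds(total_rows: int, nodes: int, rows_per_second_per_node: int) -> float:
    """Estimate full-scan time when rows are spread evenly over `nodes`.

    All nodes scan in parallel, so elapsed time is the time one node
    needs to scan its own share of the rows.
    """
    rows_per_node = total_rows / nodes
    return rows_per_node / rows_per_second_per_node


RATE = 50_000_000  # hypothetical per-node scan rate, in rows per second

# 4 billion rows on 8 nodes.
baseline = scan_seconds(4_000_000_000, nodes=8, rows_per_second_per_node=RATE)

# Data doubles but the node count stays the same: each node scans twice as much.
grown_same_nodes = scan_seconds(8_000_000_000, nodes=8, rows_per_second_per_node=RATE)

# Data doubles and the node count doubles: each node scans the same share as before.
grown_more_nodes = scan_seconds(8_000_000_000, nodes=16, rows_per_second_per_node=RATE)

print(baseline, grown_same_nodes, grown_more_nodes)  # 10.0 20.0 10.0
```

In this model the response time depends only on the rows held per node, which is why scaling nodes in step with data keeps it flat. Real queries also move data between nodes and pay planning and coordination costs, which the model ignores.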
Number of compute nodes
The number of compute nodes provisioned for an instance. A single instance supports up to 4,096 compute nodes. Storage capacity and computing resources increase linearly with the number of nodes, so add nodes to handle larger datasets or higher ingestion rates.
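Because capacity scales linearly, a first estimate of the node count is a ceiling division of the dataset size by the storage per node. The sketch below assumes a hypothetical per-node storage figure (400 GB in the example) and ignores compression, replicas, and free-space headroom, all of which change the real number. The helper name `nodes_needed` is illustrative and is not part of any AnalyticDB for PostgreSQL API.

```python
import math

MAX_COMPUTE_NODES = 4096  # per-instance limit stated in this section


def nodes_needed(dataset_gb: float, storage_per_node_gb: float) -> int:
    """Return the smallest node count whose combined storage holds the dataset."""
    if dataset_gb <= 0 or storage_per_node_gb <= 0:
        raise ValueError("sizes must be positive")
    count = math.ceil(dataset_gb / storage_per_node_gb)
    if count > MAX_COMPUTE_NODES:
        raise ValueError(
            f"{count} nodes required, but an instance supports at most {MAX_COMPUTE_NODES}"
        )
    return count


# Hypothetical sizing: 10,000 GB of data, 400 GB of storage per node.
print(nodes_needed(10_000, 400))  # 25
# One extra gigabyte no longer fits in 25 nodes, so the estimate rounds up.
print(nodes_needed(10_001, 400))  # 26
```

Treat the result as a lower bound for storage only; compute requirements such as concurrency and ingestion rate may call for more nodes.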
Data distribution
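How table data is assigned across data partitions. With hash distribution, each row is placed in a partition according to the value of its partition key (also called the distribution key). Random distribution and replication distribution do not use a key.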
AnalyticDB for PostgreSQL supports three distribution methods:
| Method | Description | When to use |
|---|---|---|
| Hash distribution | Rows are assigned to partitions based on the hash value of the partition key column. | Tables frequently joined on the same key, to minimize data movement during query execution. |
| Random distribution | Rows are distributed across partitions without a key. | When no single join key dominates, or to avoid data skew. |
| Replication distribution | The full table is copied to every compute node. | Small dimension tables joined frequently with large fact tables. |
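The three methods in the table can be simulated in a few lines to show why the choice matters. This is a minimal sketch, not the engine's implementation: SHA-256 stands in for whatever hash function the database uses internally, the table and column names are hypothetical, and partitions are modeled as plain Python lists.

```python
import hashlib
import random


def hash_partition(key, num_partitions: int) -> int:
    """Map a key to a partition index (SHA-256 is a stand-in hash function)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def distribute_hash(rows, key: str, num_partitions: int):
    """Hash distribution: rows with equal key values share a partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash_partition(row[key], num_partitions)].append(row)
    return partitions


def distribute_random(rows, num_partitions: int, seed: int = 0):
    """Random distribution: each row goes to a partition chosen without a key."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[rng.randrange(num_partitions)].append(row)
    return partitions


def distribute_replicated(rows, num_partitions: int):
    """Replication distribution: every partition gets a full copy of the table."""
    return [list(rows) for _ in range(num_partitions)]


NUM_PARTITIONS = 4
customers = [{"customer_id": i} for i in range(100)]
orders = [{"order_id": n, "customer_id": n % 100, "country": "CN"} for n in range(1000)]

# Both tables hashed on the join key: matching rows land on the same partition,
# so a join on customer_id needs no data movement between partitions.
customer_parts = distribute_hash(customers, "customer_id", NUM_PARTITIONS)
order_parts = distribute_hash(orders, "customer_id", NUM_PARTITIONS)
colocated = all(
    {o["customer_id"] for o in order_parts[p]}
    <= {c["customer_id"] for c in customer_parts[p]}
    for p in range(NUM_PARTITIONS)
)

# Hashing on a column with a single value puts every row on one partition (skew).
skewed_sizes = sorted(len(p) for p in distribute_hash(orders, "country", NUM_PARTITIONS))

# Random distribution spreads the same rows without looking at any column.
random_sizes = [len(p) for p in distribute_random(orders, NUM_PARTITIONS)]

# A small dimension table copied to every partition.
regions = [{"region": "east"}, {"region": "west"}, {"region": "north"}]
replicated_sizes = [len(p) for p in distribute_replicated(regions, NUM_PARTITIONS)]

print(colocated)         # True
print(skewed_sizes)      # [0, 0, 0, 1000]
print(replicated_sizes)  # [3, 3, 3, 3]
```

The `skewed_sizes` result shows the risk of hashing on a low-cardinality column: one partition does all the work while the others sit idle. Random distribution avoids that skew but gives up the co-location that makes the join on `customer_id` local.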
Data partition
The unit of storage and computation in the MPP architecture. Each compute node holds one data partition.