This topic introduces the multi-node parallel execution feature for columnar tables in PolarDB for MySQL. It covers the technical architecture, use cases, best practices, and performance test results.
If you have questions about the multi-node parallel execution feature for columnar tables, you can get support by joining the DingTalk group. In the group, you can directly @ a specialist with your questions.
DingTalk group number: 24490017825
Background
Columnar tables are the OLAP solution for PolarDB. As data volume, query complexity, and the need to query external tables such as those on OSS increase, a single read-only node can no longer meet the performance demands of large-scale data. To address this, columnar table clusters provide multi-node parallel execution and resource scaling capabilities.
Technical architecture
The multi-node parallel execution feature for columnar tables uses a group of read-only nodes to run queries in parallel. As your query workload changes, you can quickly add or remove read-only nodes to balance query performance and computing costs.
Requirements
Your PolarDB for MySQL cluster must meet the following requirements:
product version: Enterprise Edition.
kernel version: MySQL 8.0.2, with a minor kernel version of 8.0.2.2.34 or later.
Use cases
Use the resource scaling of multi-node parallel execution to increase CPU and IOPS, reducing query latency.
Improve data caching capacity by having each node process only a subset of the data.
Best practices
When you run analytical queries on large datasets, the primary goals are to minimize data shuffle and reduce disk I/O. The columnar tables in PolarDB use partition keys and sort keys to achieve superior query performance.
Partition key
A partition key distributes data across multiple nodes. The core idea is to keep related data together for local computation.
How it works
For partitioned tables in PolarDB that use HASH or KEY partitioning for primary or secondary partitions, the multi-node execution environment for columnar tables uses a Share-Nothing architecture. This means each partition is processed by a single, dedicated node.
Core advantages
Improved data caching: Each node processes only its assigned partitions, allowing it to use local memory more effectively for data caching.
Optimized query performance: For
JOINoperations andGROUP BYqueries based on the partition key, data is processed locally on each node. This significantly reduces data shuffle.
Best practices
Choose core business association keys: Use the columns most frequently used in
JOINorGROUP BYclauses as the partition key, such asorder_idoruser_id.Keep the number of partitions consistent: The tables involved in a
JOINoperation must have the same number of HASH/KEY partitions. Otherwise, local computation cannot be achieved.Use a large prime number for the partition count: Choose a sufficiently large prime number, such as 97 or 199, as the number of partitions. This helps ensure that data is distributed evenly across nodes and prevents data skew.
Sort key
A sort key organizes data within each physical partition. The core idea is to skip reading irrelevant data. For large datasets, you can filter data by using RANGE partitioning or by adding a sort key to a columnar table.
How it works
In columnar storage, data is organized into multiple data blocks. When you set a sort key, the data within each data block is physically sorted based on the specified column. When a query is run, the database uses metadata in the header of each data block, such as minimum and maximum values, to decide whether to read the block.
Core advantages
Efficient data pruning: When a
WHEREclause contains a filter on the sort key, the query engine can directly skip (or prune) entire data blocks whose data ranges do not match the filter conditions, reducing the required disk I/O.
Best practices
Select frequently filtered columns: Use columns that are frequently filtered in
WHEREclauses as the sort key. This is especially effective for columns used in range queries (>,<,BETWEEN) or high-cardinality equality queries.Combine with RANGE partitioning: Combine sort keys with
RANGEpartitioning to achieve multi-level filtering.
Performance test results
The following results are from a multi-node test on a TPC-H 1 TB dataset using only columnar tables. The test was run on a three-node MPP cluster, with each node configured with 32 cores and 256 GB of memory.
The TPC-H implementation in this topic is based on the TPC-H benchmark but does not fully comply with all TPC-H requirements. Therefore, the results cannot be compared with published TPC-H benchmark results.
Query | Time(s) |
Q1 | 8.189 |
Q2 | 0.567 |
Q3 | 1.784 |
Q4 | 1.38 |
Q5 | 2.268 |
Q6 | 2.359 |
Q7 | 2.068 |
Q8 | 1.593 |
Q9 | 10.761 |
Q10 | 10.054 |
Q11 | 0.795 |
Q12 | 7.595 |
Q13 | 10.848 |
Q14 | 3.023 |
Q15 | 2.286 |
Q16 | 2.253 |
Q17 | 11.651 |
Q18 | 25.612 |
Q19 | 11.557 |
Q20 | 2.739 |
Q21 | 4.82 |
Q22 | 3.141 |
TOTAL | 127.343 |
FAQ
Determine if a SQL query uses MPP
The multi-node parallel execution (MPP) feature for PolarDB columnar tables can accelerate complex queries. You can determine if a query uses this feature by analyzing its EXPLAIN execution plan. The key indicator is the presence of an Exchange operator.
Step 1: Force an MPP plan
First, use a specific hint to confirm if the optimizer can rewrite your SQL statement into an MPP plan.
Procedure
AddEXPLAINbefore your SQL statement and insert the/*+ SET_VAR(imci_plan_use_mpp=forced) */hint. This hint forces the optimizer to attempt to generate an MPP execution plan.Example
EXPLAIN SELECT /*+ SET_VAR(imci_plan_use_mpp=forced) */ COUNT(*) FROM nation;Interpreting the result
After you run the command, if the returned execution plan includes anExchangeoperator, this indicates that your SQL query can use multi-node parallel execution.Step 2: Check the actual execution plan
After confirming the potential, check if the optimizer chooses to execute your SQL in MPP mode under normal conditions (without hints).
Procedure
RunEXPLAINdirectly on your original SQL statement.Example
EXPLAIN SELECT COUNT(*) FROM nation;Interpreting the result
If the execution plan includes an
Exchangeoperator: This indicates that the optimizer, after a cost-based evaluation, chose MPP as the most efficient execution method for your query.If the execution plan does not include an
Exchangeoperator: This might mean one of the following:The query is too simple or involves too little data, so the optimizer determined that single-node execution is faster.
The structure of the SQL query or the table, such as the partition key settings, prevents the use of MPP. You can return to Step 1 to check and optimize your SQL query or table design.