This topic describes the single-node query acceleration capabilities, performance benchmarks, and observability methods for PolarDB for MySQL columnar tables. By default, the optimizer determines whether to use a massively parallel processing (MPP) plan. After a query is dispatched to a specific node, the node relies on single-node columnar execution optimizations to reduce latency. This topic focuses on these node-level optimizations.
Background and value
As analytical query data volumes grow, single-node execution often becomes a bottleneck in the following areas:
Reading columns that the query does not need
Scanning data blocks that are ultimately filtered out
Repeatedly reading and decoding the same data pages
Insufficient single-node cache coverage for frequently accessed data
High CPU and memory bandwidth demands from large-table aggregation, joins, and sorting
The columnar execution acceleration for columnar tables addresses these bottlenecks by reading only the required columns, processing only the necessary data, and reusing decoded results. This reduces query latency and improves throughput.
From an execution pipeline perspective, these optimizations include:
Reading only the columns required by the query, eliminating unnecessary column scans.
Pushing pushable predicates down to the storage layer for early stripe-level and block-level pruning.
Using visibility views to filter files early, preventing irrelevant rows from entering aggregation, sorting, and expression computation stages.
Reusing OSS page cache and metadata cache to avoid redundant reads and decodes.
Delivering data to the execution layer in batches to leverage vectorized batch processing.
Relationship with multi-node parallel execution
This topic focuses on single-node execution acceleration. For multi-node parallel execution capabilities, use cases, and partition design for columnar tables, see Multi-node analysis.
To determine whether a specific SQL statement uses an MPP plan, check whether the execution plan contains an Exchange operator.
EXPLAIN SELECT /*+ SET_VAR(imci_plan_use_mpp=forced) */ COUNT(*) FROM nation;If the execution plan contains an Exchange operator, the SQL statement can use multi-node parallel execution. If the actual execution plan includes an Exchange operator, the statement runs as an MPP plan. If no Exchange operator is included, the statement runs as a single-node plan.
Performance benchmarks
The following table shows the TPC-H 100 GB benchmark results for columnar tables in single-node warm cache mode on a 32-core, 256 GB instance:
The TPC-H implementation in this topic is based on TPC-H benchmarking. These test results cannot be compared with published TPC-H benchmark results because the tests do not meet all TPC-H requirements.
Query | Time (s) |
Q1 | 1.175 |
Q2 | 0.178 |
Q3 | 0.577 |
Q4 | 0.433 |
Q5 | 0.522 |
Q6 | 0.366 |
Q7 | 0.633 |
Q8 | 0.528 |
Q9 | 2.817 |
Q10 | 0.935 |
Q11 | 0.218 |
Q12 | 0.535 |
Q13 | 1.255 |
Q14 | 0.442 |
Q15 | 0.889 |
Q16 | 0.553 |
Q17 | 0.738 |
Q18 | 2.381 |
Q19 | 0.759 |
Q20 | 0.453 |
Q21 | 1.308 |
Q22 | 0.299 |
TOTAL | 17.994 |