In PolarDB for MySQL, both a full refresh and the initial run of an incremental refresh for a materialized view support two parallel write modes: ETL and ePQ. The optimal mode for improving refresh efficiency depends on workload characteristics, such as wide and narrow tables, the cluster series (Cluster Edition or Multi-primary Cluster), and whether compute offloading is needed.
Overview
To improve execution efficiency, PolarDB for MySQL provides the following two parallel write modes for a complete refresh or an initial incremental refresh of a materialized view:
Mode | Activation | Architecture | Use cases |
ETL Parallel Write (Default) | Default mode (automatically enabled when parallelism > 1). | This mode offloads queries to remote nodes, where multiple workers consume and write the results in parallel. | Wide tables, large data volumes, and scenarios requiring query offloading. |
ePQ Parallel Write | Must be enabled manually. | The PQ framework splits the | Narrow tables and scenarios requiring end-to-end parallelism. |
Both modes require materialized_view_refresh_parallelism to be set to a value greater than 1.
Compatibility
Version | Parallel refresh modes | Notes |
PolarDB for MySQL 8.0.1 with minor kernel version 8.0.1.1.54 or later | ETL mode | ETL mode only. |
PolarDB for MySQL 8.0.2 with minor kernel versions 8.0.2.2.34 and 8.0.2.2.35 | ePQ mode |
|
Refresh process
A full refresh of a materialized view consists of three stages:
Create a container table: A new container table is created from the materialized view definition. If
materialized_view_implicit_partition_num > 1, a hash partition is automatically added to prevent lock contention during parallel writes.NoteMySQL 8.0.1 does not support the
materialized_view_implicit_partition_numparameter.Write data in parallel: The system writes data in parallel according to the selected mode.
Build a columnar index (Optional): If
materialized_view_add_columnar_index_after_insert = ON, the system first completes the refresh and then builds a columnar index (IMCI) for the container table.
ETL parallel write mode
How it works
PolarDB selects a remote node (a column-store index read-only node or a row-store read-only node) based on the offload policy and establishes a connection.
The primary (RW) node sends the materialized view's definition query to the remote node for execution.
The remote node streams the result set back. Compressed data transmission is supported.
The primary (RW) node creates multiple worker threads. Each worker thread fetches rows in parallel and writes them to different partitions of the container table.
Features
Query computation runs on the remote node and does not consume CPU resources on the primary (RW) node.
Write operations run locally on the primary (RW) node to ensure transactional consistency.
In wide-table scenarios, parallel efficiency is not affected by row length.
Supports multi-primary cluster (Limitless).
Usage
The method for modifying PolarDB cluster parameters differs between the console and a database session. The differences are as follows:
In the PolarDB console
Compatibility: Some cluster parameters in the PolarDB console have the loose_ prefix for compatibility with MySQL configuration files.
Procedure: Find and modify the parameters that have the
loose_prefix.
In a database session (using the command line or a client)
Procedure: When you connect to the database and use the
SETcommand to modify a parameter, remove theloose_prefix and use the original parameter name.
Parameter | Example value | Modifiable per session | Description |
| 16 | No | Sets the parallelism. A value greater than 1 enables parallel refresh. |
| FORCE_IMCI | Yes | Sets the offload policy for manual refreshes on a per-session basis. |
After configuring the parameters, run the following command to refresh the materialized view:
-- Refresh the materialized view
REFRESH MATERIALIZED VIEW <schema_name>.<mview_name>;ePQ parallel write mode
How it works (MPP offloading to a remote column-store index read-only node when the offload policy is FORCE_IMCI)
The system schedules PQ Workers to a remote column-store index read-only node to perform a SELECT scan.
The PQ Workers send the scan results back to the primary (RW) node over the internal network, which supports compression.
Multiple PQ Workers scan the source table data in parallel on the primary (RW) node.
The PQ Workers write the scan results in parallel to the partitions of the container table.
Finally, the primary (RW) node completes the data write locally.
Features
End-to-end parallelism: A single parallel execution plan handles both scans and writes.
Near-linear scaling for narrow tables.
Avoids the single-port network transfer limitation of ETL.
Currently not supported in Limitless clusters. The operation automatically falls back to serial execution.
Network compression (MPP offload scenario)
Prerequisites
Parameter | Example value | Settable at session level | Description |
| ZSTD | Yes | Specifies the compression algorithm for data transferred from a replica node during a materialized view refresh. Must be set to ZSTD to enable compression.
|
| 3 | Yes | Specifies the compression level. If this parameter is set to 0 (the default) or an invalid value, the system uses level 3 for the ZSTD algorithm.
|
Impact
Reduces the amount of data transferred between nodes by 60%–80%.
Compression and decompression add CPU overhead. This trade-off must be considered in CPU-bound scenarios.
Recommendations
Compressed transfer is recommended only for wide tables. For narrow tables, the CPU overhead from compression might offset the network benefits, so using the UNCOMPRESSED setting is recommended. If your materialized view is a wide table and requires execution offloading, consider using ETL mode directly instead of ePQ mode with MPP offload. ETL mode is inherently well-suited for wide tables.
Procedure
Scenario A: MPP offload to remote node
Parameter | Example value | Settable at session level | Description |
| 16 | No | Sets the degree of parallelism. A value greater than 1 enables parallel refresh. |
| 16 | Yes | Sets the number of implicit partitions. Note The |
| FORCE_IMCI | Yes | For a manual refresh, you can specify a different offload policy for each session. |
| ON | No | Allows offload scheduling to a column-store index read-only node. |
| ZSTD | Yes | Enables compressed transfer (recommended for wide tables; can be skipped for narrow tables). |
| 3 | Yes | Sets the compression level for compressed transfers. |
After you configure these parameters, perform the refresh:
-- Perform the refresh
REFRESH MATERIALIZED VIEW <schema_name>.<mview_name>;Scenario B: ePQ offload to a specific node
If your cluster has multiple column-store index read-only nodes, you can use pq_mpp_allow_dispatch_to_imci_node to control which of these nodes can receive ePQ MPP scheduling requests. This parameter must be set to ON on both the primary (RW) node and the target column-store index read-only node. The system does not target a column-store index read-only node for scheduling if this parameter is not enabled on that node.
-- 1. On the primary (RW) node, disable its ability to act as a PQ Worker resource.
-- In the console, set pq_mpp_allow_imci_node = OFF;
-- 2. On the primary (RW) node, allow scheduling to an IMCI node.
-- In the console, set pq_mpp_allow_dispatch_to_imci_node = ON;
-- 3. Connect to the IMCI node that should handle the offloaded query, and enable the scheduling entry point only on that node.
-- In the console, set pq_mpp_allow_dispatch_to_imci_node = ON;
-- 4. Return to the primary (RW) node and refresh with the IMCI offload policy.
SET SESSION materialized_view_complete_refresh_offload_policy = FORCE_IMCI;
REFRESH MATERIALIZED VIEW <schema_name>.<mview_name>;After enabling this feature, use information_schema.cluster_group_resource_view to check the status of the target node. The target column-store index read-only node should show allow_to_dispatch, and its EXTRA column should contain Allow to dispatch to IMCI node. In this mode, the column-store index read-only node only receives scheduling requests and does not act as a PQ Worker. To have all column-store index read-only nodes act as ePQ Worker resources, set pq_mpp_allow_imci_node = ON.
Offload target node selection
When the offload policy is not MASTER, the system selects a remote node to execute the SELECT query.
Policy determines node type
| Candidate node type | Description |
| IMCI read-only node | Falls back to local execution on the master (RW) node and logs a warning. |
| IMCI read-only node | Fails with an error. |
| row-store read-only node | Falls back to local execution on the master (RW) node and logs a warning. |
| row-store read-only node | Fails with an error. |
| IMCI read-only node or row-store read-only node (based on cost) | Selects the available node type if only one is available. If neither is available, it falls back to the master (RW) node. |
Decision logic for the COST policy
When materialized_view_complete_refresh_offload_policy = COST, the system determines the target node by comparing the query's optimizer cost to a threshold:
If the query cost ≥
cost_threshold_for_imci: Offload to an IMCI read-only node. IMCI is suitable for high-cost scan and aggregation queries.If the query cost <
cost_threshold_for_imci: Offload to a row-store read-only node. Row-store read-only nodes are sufficient for lightweight queries.
-- Set the cost threshold (default is 0, which means all queries tend to be offloaded to IMCI).
SET SESSION cost_threshold_for_imci = 100000;
-- Use the COST policy.
SET SESSION materialized_view_complete_refresh_offload_policy = COST;Multi-node load balancing
If multiple nodes of the same type are available, the system uses a Round-Robin approach to select a node:
Each refresh task selects the next node based on a global counter.
If the connection to the selected node fails, the system automatically tries the next node in the list until all nodes have been attempted.
The system distributes refresh tasks for different materialized views evenly across the nodes.
To offload ePQ + MPP tasks to an IMCI node, use one of the following two methods:
Enable
pq_mpp_allow_imci_nodeto support offloading to all IMCI read-only nodes.First, set
pq_mpp_allow_imci_node = OFF, then usepq_mpp_allow_dispatch_to_imci_nodeto narrow down the candidate IMCI read-only nodes:On the master (RW) node, set
pq_mpp_allow_dispatch_to_imci_node = ONto allow the master (RW) node to accept the dispatch entry-point status from remote IMCI read-only nodes.On each IMCI read-only node that you want to handle offloaded queries, set
pq_mpp_allow_dispatch_to_imci_node = ON.On each IMCI read-only node that you do not want to handle offloaded queries, keep
pq_mpp_allow_dispatch_to_imci_node = OFF.
Recommended policies
In a production environment, use FORCE_IMCI or FORCE_TPRO. Avoid PREFER_* policies.
Reason: If the target node is unavailable, a PREFER_* policy silently falls back to local execution on the master (RW) node. This fallback causes the INSERT SELECT statement to perform a full scan and write on the master (RW) node. This creates a large transaction and can impact the performance of online workloads on the master (RW) node.
Configuration examples
-- Recommended: Force the use of an IMCI node. If unavailable, the operation fails instead of falling back.
SET SESSION materialized_view_complete_refresh_offload_policy = FORCE_IMCI;
-- Use only row-store read-only nodes. If unavailable, the operation fails.
SET SESSION materialized_view_complete_refresh_offload_policy = FORCE_TPRO;
-- Automatically choose between IMCI and TPRO based on query cost.
SET SESSION materialized_view_complete_refresh_offload_policy = COST;
SET SESSION cost_threshold_for_imci = 50000;Parameter scopes and settings
Basic rules
GLOBALparameters: are set only in the console and apply to all sessions and automatic refresh threads.SESSIONparameters: You can set them for the current session by usingSET SESSION. You can also modify their default values in the console. Automatic refresh threads use theGLOBALdefault values.
Parameter scopes for ETL parallel write mode
Parameter | Scope | Settable per session | Description |
| GLOBAL | No, must be GLOBAL | All refresh tasks share the same parallelism. |
| SESSION | Yes | For manual refreshes, you can specify a different offload policy for each session. |
| SESSION | Yes | Allows you to disable atomic auto-increment optimization. |
| SESSION | Yes | Enables result set caching. |
Parameter scopes for ePQ parallel write mode
Parameter | Scope | Settable per session | Description |
| GLOBAL | No, must be GLOBAL | All refresh tasks share the same parallelism. |
| GLOBAL | No, must be GLOBAL | Controls whether MPP tasks can be scheduled to an IMCI node. |
| GLOBAL | No, must be GLOBAL | Allows the use of the IMCI scheduling entry point. This parameter must be enabled on both the primary (RW) node and the target read-only node that contains the columnar index. |
| SESSION | Yes | For manual refreshes, you can specify a different offload policy for each session. |
| SESSION | Yes | Allows each materialized view to use a different number of partitions. |
| SESSION | Yes | Specifies the transport compression algorithm for MPP offloads. The system automatically propagates this setting to PQ workers. |
| SESSION | Yes | Specifies the compression level. The system automatically propagates this setting to PQ workers. |
Notes
If you use only manual refreshes, you can adjust
SESSIONparameters for each session.If you rely on automatic refreshes, you must set appropriate default values for all
SESSIONparameters in the console.GLOBALparameters must be set in the console, regardless of the refresh method. Changes take effect immediately for all subsequent tasks.SET SESSIONsettings do not persist after a cluster restart. In contrast,GLOBALparameters modified in the console are persistent.
Parameter reference
Core parameters
Parameter | Scope | Value | Default | Description |
| GLOBAL | 1–1024 | 1 | The number of parallel workers. A value greater than 1 enables parallel refresh. |
| SESSION | 0 to unlimited | 1 | The number of implicit HASH partitions for the container table. A value greater than 1 enables implicit HASH partitioning. Note MySQL 8.0.1 does not support the |
| SESSION | ON/OFF | ON | Improves performance by using atomic auto-increment values for the implicit partition column. |
Compute offloading parameters
Parameter | Scope | Value | Default | Description |
| SESSION |
| Specifies the policy for offloading query computation. | |
| GLOBAL | ON/OFF | OFF | Allows PQ workers to be scheduled on a read-only IMCI node (required for ePQ mode). |
| GLOBAL | ON/OFF | OFF | Allows the use of the IMCI dispatch entry point. This parameter must be enabled on both the primary (RW) node and the target read-only IMCI node. |
Network compression parameters
Parameter | Scope | Value | Default | Description |
| SESSION |
| UNCOMPRESSED | Specifies the compression algorithm for the internal PQ channel. This setting applies only in ePQ MPP offload mode. |
| SESSION | 0–22 | 0 | Specifies the ZSTD compression level for the PQ channel. If set to 0, the ZSTD default level (3) is used. |
| SESSION | ON/OFF | OFF | Specifies whether to cache the entire result set in memory. This setting is effective only in ETL mode. |
How compression works:
Mode | Compression location | Effective condition |
ePQ (MPP offload) | Internal PQ channel between the primary (RW) node and remote nodes. |
|
ePQ (local only) | No network transfer. | Not applicable. |
The materialized_view_etl_compression_level parameter takes effect only when materialized_view_etl_compression_algorithm = ZSTD. If set to 0, the ZSTD default level of 3 is used. If set to a value from 1 to 22, the specified level is used. A higher level provides a better compression ratio but increases CPU overhead.
Offload policy (materialized_view_complete_refresh_offload_policy parameter)
Value | Description |
| The primary (RW) node executes all operations locally. |
| Prioritizes offloading to a read-only TPRO node. Falls back to the primary (RW) node if unavailable. |
| Prioritizes offloading to a read-only IMCI node. Falls back to the primary (RW) node if unavailable. |
| Forces computation offloading to a read-only TPRO node. If no such node is available, the operation fails. |
| Forces computation offloading to a read-only IMCI node. If no such node is available, the operation fails. |
| Uses a cost-based model to automatically select either a read-only IMCI node or a read-only TPRO node for offloading. |
Mode selection recommendations
Scenario | Recommended mode | Key configuration |
Narrow table with no remote nodes | Local-only ePQ |
|
Narrow table with column-store-indexed read-only nodes or row-store read-only nodes | ETL/ePQ with MPP offload |
|
Wide table or table containing large fields | ETL |
|
ETL | Use ETL mode. ePQ is not available for multi-master clusters. |
Considerations
No effect when the degree of parallelism is 1: Both parallel modes degrade to a serial INSERT SELECT operation.
Remote node availability: Both ETL mode and ePQ MPP offloading depend on remote nodes. If a remote node is unavailable,
FORCE_*policies return an error, whilePREFER_*policies fall back to local execution on the primary (RW) node.Multi-primary environment limitations: In a multi-primary cluster (Limitless), ePQ mode is automatically disabled and falls back to serial execution. Use ETL mode for parallel refreshing.
IMCI column coverage: When you use ePQ MPP offloading to a read-only node with a columnar index, the columnar index must cover all columns read by the materialized view query. Otherwise, the system returns an error. ETL mode does not have this limitation.
Implicit partition count: We recommend setting this value to 0.5 to 2 times the degree of parallelism. Too few partitions can lead to write contention, while too many can increase file management overhead.
Write location: In both modes, data writes are always completed locally on the primary (RW) node. Offloading only affects where the SELECT query is executed.
Parameter persistence: Settings configured with
SET SESSIONdo not persist across cluster restarts, whereas changes to GLOBAL parameters made in the console persist.Compression vs. CPU trade-off: For ePQ MPP offloading, we recommend setting
materialized_view_etl_compression_algorithm=ZSTD. For narrow tables, the CPU overhead from compression may offset the network benefits.