Incremental materialized views

更新时间:
复制 MD 格式

An incremental materialized view (IVM) is an advanced form of a materialized view. Unlike a full refresh that recomputes the entire query, an incremental refresh only applies data changes (INSERT, UPDATE, and DELETE) from the base tables to the view. This near real-time process significantly reduces system load and data latency.

How it works

The core principle of an incremental materialized view is to process only what has changed, not the entire dataset.

When INSERT, UPDATE, or DELETE operations occur on a base table, the system automatically records these changes. This process runs efficiently on an In-Memory Columnar Index (IMCI) columnar storage node, requires no binlogs or triggers, and does not lock the base table.

  • FAST (incremental refresh): Processes only the data that has changed since the last refresh, resulting in high speed and low overhead.

  • COMPLETE (full refresh): Recomputes the entire query. This is suitable for initial data population or data correction.

Prerequisites

To use an incremental materialized view, your system must meet the following requirements:

  • Cluster: Your PolarDB for MySQL cluster must meet one of the following requirements:

    • MySQL 8.0.2 with minor kernel version 8.0.2.2.35 or later.

  • In-Memory Columnar Index (IMCI) parameters:

    Parameter

    Default

    Value

    Description

    imci_enable_window_function

    2

    >= 2

    Enables support for IMCI window functions.

    imci_enable_nci_async_pre_commit

    OFF

    OFF

    Disables NCI async pre-commit to ensure incremental data consistency.

    Note

    This parameter is OFF by default and cannot be changed at this time.

    imci_enable_hybrid_plan

    ON

    OFF

    Disables the hybrid execution plan to force incremental refreshes to execute on the read-only columnar storage node.

    Note

    You can modify these parameters in the PolarDB console.

Key benefits

Benefit

Description

Low-latency data synchronization

This keeps the view data in sync with the base table with minimal latency, reducing refresh times from minutes to seconds or even milliseconds. It avoids full scans of the base table by processing only incremental changes.

Reduced system load

Significantly reduces CPU, memory, and I/O consumption. The benefit is especially noticeable in scenarios with large base tables and a small volume of changes.

Seamless use of IMCI

Incremental computations run on the IMCI columnar storage node, fully leveraging the high-speed aggregation capabilities of columnar storage.

Accelerated analytical queries

Provides millisecond-level response times for complex aggregation and join queries in HTAP (Hybrid Transactional/Analytical Processing) scenarios.

Automated background maintenance

The system automatically manages the lifecycle of the incremental log (delta table).

Use cases

  • Pre-computation for analysis: Pre-process complex multi-table joins into a wide table to avoid costly join operations on every query.

  • Report acceleration: Provide pre-computed data for real-time data warehouse or BI reports, reducing update overhead by over 90% with incremental refreshes.

  • Computation reuse: Reuse computation results by creating nested materialized views.

  • Real-time data warehouse: Support analytical queries that reflect base table changes in near real-time.

  • HTAP workloads: Synchronize frequently updated OLTP data to an analytical view with low latency, separating TP and AP workloads.

Limitations

  • General limitations:

    • Unsupported syntax: Incremental refresh is not supported for queries that contain a subquery, UNION, ORDER BY, LIMIT, window function, HAVING, ROLLUP, or CTE.

    • UNION ALL: Not currently supported.

    • Base table requirements:

      • The base table must have IMCI enabled (polar_enable_imci = ON), and all columns in the base table must be included in the IMCI.

      • The base table cannot be a standard view or a non-incremental materialized view. Nested materialized views are allowed, provided the inner view also supports incremental refresh.

    • Storage overhead:

      • A materialized view stores a copy of the data, which consumes additional storage space.

      • The system also consumes extra storage to log changes for incremental refreshes. This storage usage does not grow indefinitely, as obsolete data is periodically reclaimed.

  • Limitations for single-table filtering:

    • The base table must have an explicit primary key.

    • The DISTINCT keyword is not supported.

  • Limitations for single-table aggregation:

    • The base table must have an explicit primary key.

    • The GROUP BY clause must reference single columns and cannot use expressions. For example, GROUP BY YEAR(create_time) is not supported; you must change it to GROUP BY create_year.

    • Supported aggregate functions: COUNT, SUM, and AVG.

    • Unsupported aggregate functions: STDDEV, VARIANCE, MIN, and MAX.

    • Aggregation without a GROUP BY clause is not currently supported.

    • Multi-table aggregation is not currently supported.

  • Limitations for multi-table joins

    • All base tables must have an explicit primary key.

    • RIGHT JOIN is not supported.

    • Joins must follow a left-deep tree structure. Nested structures like A JOIN (B LEFT JOIN C) JOIN D are not allowed.

    • The ON condition for each table must include a column from another table. Syntax such as A JOIN B ON B.id = 1 is not allowed.

Syntax

CREATE MATERIALIZED VIEW mv_name
  [REFRESH [COMPLETE | FAST] [ON DEMAND]]
  [START WITH now()] [NEXT now() + interval 5 second]
  AS SELECT ... FROM base_table;
  • FAST: Specifies incremental refresh mode, which processes only the changes from the base table.

  • COMPLETE: Specifies full refresh mode, which recomputes the entire query each time.

  • ON DEMAND: Specifies an on-demand refresh, which must be triggered manually or by a predefined background refresh interval.

Examples

Example 1: Single-table filter (FILTER type)

  1. Create a base table with the columnar index enabled.

    CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT, c3 INT) COMMENT 'COLUMNAR=1';
    
    INSERT INTO t1 VALUES (1, 1, 1);
  2. Create an incremental materialized view.

    CREATE MATERIALIZED VIEW mv_filter REFRESH FAST ON DEMAND AS
      SELECT * FROM t1 WHERE c2 < 5;

    Perform a full refresh to populate the initial data.

    REFRESH MATERIALIZED VIEW mv_filter;
  3. Perform an incremental refresh after the base table data changes.

    -- Change the base table
    INSERT INTO t1 VALUES (2, 2, 3);
    DELETE FROM t1 WHERE c1 = 1;
    
    -- Perform an incremental refresh to process only the changes above
    REFRESH MATERIALIZED VIEW mv_filter;

Example 2: Aggregation statistics (AGGREGATE type)

-- Create a base table with the columnar index enabled
CREATE TABLE orders (
    order_id INT PRIMARY KEY,                  -- Unique order ID (primary key)
    user_id INT NOT NULL,                      -- User ID (GROUP BY column)
    amount DECIMAL(12, 2) NOT NULL,            -- Order amount (aggregation column)
    order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP -- Order time (standard business field)
)COMMENT 'COLUMNAR=1';

-- Aggregate statistics
CREATE MATERIALIZED VIEW mv_user_order_stats
  REFRESH FAST ON DEMAND
  AS SELECT
       user_id,
       COUNT(*)                   AS order_count,
       SUM(amount * 2)            AS total_amount,
       AVG(CAST(amount AS INT))   AS avg_amount
     FROM orders
     GROUP BY user_id;

Supported aggregate functions: COUNT(*), COUNT(expr), SUM(expr), and AVG(expr).

Example 3: Multi-table INNER JOIN

  1. Create the base tables.

    CREATE TABLE t4 (id4 INT PRIMARY KEY, c2 INT, c3 VARCHAR(50)) COMMENT 'COLUMNAR=1';
    CREATE TABLE t5 (id5 INT PRIMARY KEY, c2 INT, c3 VARCHAR(50)) COMMENT 'COLUMNAR=1';
    CREATE TABLE t6 (id6 INT PRIMARY KEY, c2 INT, c3 VARCHAR(50)) COMMENT 'COLUMNAR=1';
    
    INSERT INTO t4 VALUES (1, 10, 'A'), (2, 20, 'B'), (3, 30, 'C');
    INSERT INTO t5 VALUES (1, 100, 'X'), (2, 200, 'Y'), (4, 400, 'Z');
    INSERT INTO t6 VALUES (1, 1000, 'P'), (2, 2000, 'Q'), (3, 3000, 'R');

    Create an incremental materialized view for a multi-table INNER JOIN.

    CREATE MATERIALIZED VIEW mv_inner_with_where_on REFRESH FAST ON DEMAND AS
      SELECT t4.id4, t4.c2, t5.id5, t5.c2 AS t5_c2, t6.id6, t6.c2 AS t6_c2
      FROM t4
      INNER JOIN t5 ON t4.id4 = t5.id5
      INNER JOIN t6 ON t4.id4 = t6.id6
      WHERE t4.c2 > 5 AND t5.c2 > 50;
  2. Perform a full refresh and verify.

    -- Perform a full refresh to populate the initial data
    REFRESH MATERIALIZED VIEW mv_inner_with_where_on;
    
    -- Change the base tables
    INSERT INTO t4 VALUES (4, 40, 'D');
    INSERT INTO t6 VALUES (4, 4000, 'S');
    UPDATE t4 SET c2 = 15 WHERE id4 = 1;
    DELETE FROM t5 WHERE id5 = 4;
    INSERT INTO t5 VALUES (5, 400, 'W');
    
    -- Perform an incremental refresh
    REFRESH MATERIALIZED VIEW mv_inner_with_where_on;

Example 4: LEFT JOIN

-- Create base tables with the columnar index enabled
-- Create the products table (joined table / right table)
CREATE TABLE products (
    product_id INT PRIMARY KEY,                -- Join key, must be the primary key
    product_name VARCHAR(100) NOT NULL,        -- Product name
    price DECIMAL(10, 2)                       -- Other business fields
) COMMENT 'COLUMNAR=1';                       

-- Create the order items table (primary table / left table)
CREATE TABLE order_items (
    item_id INT PRIMARY KEY,                   -- Item ID, must be the primary key
    order_id INT NOT NULL,                     -- Order ID
    product_id INT,                            -- Join key, can be NULL (because this is a LEFT JOIN)
    quantity INT NOT NULL,                     -- Purchase quantity
    FOREIGN KEY (product_id) REFERENCES products(product_id) -- Optional foreign key constraint
) COMMENT 'COLUMNAR=1';                       


-- LEFT JOIN
CREATE MATERIALIZED VIEW mv_order_with_product
  REFRESH FAST ON DEMAND
  AS SELECT
       oi.item_id,
       oi.order_id,
       p.product_name,
       oi.quantity
     FROM order_items oi
     LEFT JOIN products p ON oi.product_id = p.product_id;

Example 5: Nested materialized view (computation reuse)

A nested materialized view is built upon another incremental materialized view, allowing you to reuse computation results. For example, you can first filter valid data from a base table, and then perform aggregation on the filtered result. During a refresh, each layer processes only its respective incremental changes.

  1. Create a base table.

    CREATE TABLE order_log (
      order_id    INT PRIMARY KEY,
      user_id     INT,
      product_category VARCHAR(50),
      amount      DECIMAL(10,2),
      status      VARCHAR(20),
      created_year INT
    ) COMMENT 'COLUMNAR=1';
    
    INSERT INTO order_log VALUES
      (1, 101, 'electronics', 299.00, 'paid', 2025),
      (2, 102, 'clothing',     59.90, 'paid', 2025),
      (3, 103, 'electronics', 499.00, 'cancelled', 2025),
      (4, 101, 'clothing',    120.00, 'paid', 2025),
      (5, 104, 'electronics',  89.00, 'paid', 2025);
  2. Create the first-level incremental materialized view to filter for paid orders.

    CREATE MATERIALIZED VIEW mv_paid_orders REFRESH FAST ON DEMAND AS
      SELECT order_id, user_id, product_category, amount, created_year
      FROM order_log
      WHERE status = 'paid';

    Perform a full refresh to populate the initial data.

    REFRESH MATERIALIZED VIEW mv_paid_orders;
  3. Create the second-level incremental materialized view to aggregate statistics by category, based on the first-level view.

    CREATE MATERIALIZED VIEW mv_category_revenue REFRESH FAST ON DEMAND AS
      SELECT product_category,
             COUNT(*)    AS order_count,
             SUM(amount) AS total_revenue
      FROM mv_paid_orders
      GROUP BY product_category;

    Perform a full refresh to populate the initial data.

    REFRESH MATERIALIZED VIEW mv_category_revenue;
  4. After the base table data changes, perform an incremental refresh for each level in order.

    -- Add new data to the base table
    INSERT INTO order_log VALUES
      (6, 105, 'electronics', 199.00, 'paid', 2025),
      (7, 106, 'clothing',     75.00, 'cancelled', 2025);
    
    -- Refresh in dependency order: first level, then second level
    REFRESH MATERIALIZED VIEW mv_paid_orders;
    REFRESH MATERIALIZED VIEW mv_category_revenue;
    
    -- Verify the result: order_id=7 is filtered out by the first level because its status is 'cancelled'
    -- and will not be included in the second-level statistics.
    SELECT * FROM mv_category_revenue;
Note

To ensure data consistency, you must refresh nested materialized views in their order of dependency, from the base view to the top-level view. The underlying materialized view must be an incremental refresh type (REFRESH FAST).

Perform an incremental refresh

To manually trigger a refresh, run the following command:

REFRESH MATERIALIZED VIEW mv_name;

An incremental refresh processes only the changes that have occurred in the base table since the last refresh, making it fast and low-overhead without requiring a full table scan.

Note

When you run REFRESH MATERIALIZED VIEW for the first time, the system automatically performs a full refresh to populate the initial data. Subsequent refreshes are incremental.

Drop a materialized view

DROP MATERIALIZED VIEW mv_name;

After you drop a materialized view, the system automatically cleans up the corresponding delta table if the delta_mv_auto_cleanup = ON parameter is enabled (default).

Management and monitoring

List of materialized views

SELECT
  table_name,
  table_schema,
  base_tables,
  first_refresh_time,
  refresh_strategy,
  last_start_time,
  last_end_time
FROM mysql.view_materialized_info
WHERE refresh_strategy = 'FAST';

All materialized views

SELECT * FROM mysql.view_materialized_info;

Refresh task queue

SELECT * FROM information_schema.materialized_view_refresh_queue;

FAQ

How to verify incremental refresh?

Use the following statement to verify that the refresh_strategy is FAST. You can also check the last_refresh_status column for the result of the most recent refresh.

SELECT table_name, refresh_strategy, last_start_time, last_end_time
FROM mysql.view_materialized_info;

When is the materialized view updated?

Before a refresh, the view's data reflects its state at the time of the last refresh. To update the data, you must manually run REFRESH MATERIALIZED VIEW mv_name.

Is the base table locked during refresh?

No. You can continue to perform normal read and write operations on the base table during an incremental refresh without any impact.

Is aggregation supported for incremental refresh?

Incremental refresh is currently supported for single-table aggregation. Multi-table aggregation and UNION ALL are not currently supported.

What is the minimum refresh interval?

The minimum refresh interval is currently 1 second.