Manage materialized views

更新时间:
复制 MD 格式

A materialized view is a precomputed result set storing the results of expensive operations, such as JOIN and AGGREGATE, to accelerate queries by preventing their re-execution.

Background

A view is a virtual table. Queries on a view are translated into new queries based on the SQL statement that defines the view. In contrast, a materialized view is a special physical table that stores actual data and uses storage resources. For more information about billing, see billing rules.

Materialized views are suitable for the following scenarios:

  • Frequent queries with a fixed pattern.

  • Queries with expensive operations, such as aggregations and joins.

  • Queries that involve only a small subset of the data in a table.

The following table compares traditional queries to those based on materialized views.

Item

Traditional query

Materialized view query

Query statements

Query data directly using SQL statements.

SELECT empid, deptname  
FROM emps JOIN depts 
ON emps.deptno=depts.deptno 
WHERE hire_date >= '2018-01-01';

Create a materialized view, and then query data from the materialized view.

The following statement creates a materialized view:

CREATE MATERIALIZED VIEW mv 
AS SELECT empid, deptname, hire_date  
FROM emps JOIN depts 
ON emps.deptno=depts.deptno 
WHERE hire_date >= '2016-01-01';

Query data from the materialized view:

SELECT empid, deptname FROM mv 
WHERE hire_date >= '2018-01-01';

If query rewrite is enabled, you can query the original source tables. MaxCompute automatically rewrites the query to use the materialized view instead:

SELECT empid, deptname 
FROM emps JOIN depts 
ON emps.deptno=depts.deptno 
WHERE hire_date >= '2018-01-01';
-- This is equivalent to the following statement.
SELECT empid, deptname FROM mv 
WHERE hire_date >= '2018-01-01';

Query characteristics

Queries involve table reads, join operations, and filtering (WHERE clause). On large source tables, these complex and inefficient operations result in slow query performance.

Queries involve table reads and filtering operations, eliminating join operations. MaxCompute automatically selects the optimal materialized view and reads precomputed data from it. This significantly improves query performance.

Commands

Type

Description

Required role

Entry points

Create materialized view (supports partitioning and clustering)

Creates a materialized view based on a query.

Users with the CreateTable permission on the project.

Run these commands on the following platforms:

Update materialized view

Updates an existing materialized view.

Users with the Alter permission.

Modify materialized view lifecycle

Modifies the lifecycle of an existing materialized view.

Users with the Alter permission.

Enable or disable materialized view lifecycle

Enables or disables the lifecycle of an existing materialized view.

Users with the Alter permission.

Query materialized view information

Queries basic information about a materialized view.

Users with the Describe permission.

Query materialized view status

Checks whether a materialized view is valid or invalid.

Users with the Describe permission.

List materialized views

Lists all materialized views in a project, or only those matching specific criteria.

Users with the List permission on the project.

Delete materialized view

Deletes an existing materialized view.

Users with the Drop permission.

Delete materialized view partition

Deletes a partition from an existing materialized view.

Users with the Drop permission.

Materialized view query passthrough

Automatically queries the base partitioned table if data for the queried partition is not in the materialized view.

Users with the Write and CreateTable permissions on the project.

Materialized view query rewrite

Rewrites a query to use a materialized view.

Users with the Write and CreateTable permissions on the project.

Schedule materialized view updates

Schedules data updates for a materialized view.

Users with the Alter permission.

Limits

Window functions, UDTFs, and non-deterministic functions (such as UDFs and UDAFs) are not supported.

Note

If you must use non-deterministic functions, set the following property at the session level: set odps.sql.materialized.view.support.nondeterministic.function=true;.

Create a partitioned and clustered materialized view

You can create a materialized view with partitioning and clustering properties.

  • Limitations

    • A materialized view cannot have the same name as an existing table, view, or another materialized view in the same project. You can run the SHOW TABLES; command to view the names of all tables and materialized views in the project.

    • You cannot create a materialized view based on an existing materialized view.

    • You cannot create a materialized view based on an external table.

  • Usage notes

    • If the defining query fails, the materialized view will not be created.

    • The partition key columns of a materialized view must be derived from a source table. The number and order of partition key columns must match those of the source table. The column names can be different.

    • You must specify comments for all columns, including partition key columns. If you specify comments for only some columns, the operation returns an error.

    • You can specify both partitioning and clustering properties for a materialized view. In this case, the data in each partition is clustered accordingly.

    • If the query statement contains unsupported operators, it returns an error. For a list of supported operators, see materialized view query rewrite.

    • By default, MaxCompute does not support creating materialized views using non-deterministic functions, such as UDFs and UDAFs. If your use case requires a non-deterministic function, set the following property at the session level: set odps.sql.materialized.view.support.nondeterministic.function=true;.

    • A materialized view can have empty partitions. If a partition in a source table is empty, refreshing the materialized view automatically creates a corresponding empty partition.

  • Syntax

    CREATE MATERIALIZED VIEW [IF NOT EXISTS][project_name.]<mv_name>
    [LIFECYCLE <days>]    -- Specifies the lifecycle.
    [BUILD DEFERRED]    -- Creates only the schema and does not populate data.
    [(<col_name> [COMMENT <col_comment>],...)]    -- Specifies column comments.
    [DISABLE REWRITE]    -- Specifies whether to disable query rewrite.
    [COMMENT 'table comment']    -- Specifies a table comment.
    [PARTITIONED BY (<col_name> [, <col_name>, ...])    -- Defines the materialized view as a partitioned table.
    [CLUSTERED BY|RANGE CLUSTERED BY (<col_name> [, <col_name>, ...])
    [SORTED BY (<col_name> [ASC | DESC] [, <col_name> [ASC | DESC] ...])]
        INTO <number_of_buckets> BUCKETS]    -- Sets the shuffle and sort properties for a clustered table.
    [REFRESH EVERY <num> MINUTES/HOURS/DAYS] 
    [TBLPROPERTIES("compressionstrategy"="normal/high/extreme",    -- Specifies the compression policy for data storage.
                    "enable_auto_substitute"="true",    -- Specifies whether to query data from the source partitioned table if a partition does not exist.
                    "enable_auto_refresh"="true",    -- Specifies whether to enable automatic refresh.
                    "refresh_interval_minutes"="120",    -- Specifies the refresh interval.
                    "only_refresh_max_pt"="true"    -- For a partitioned materialized view, refreshes only the latest partition of the source table.
                    )]
    AS <select_statement>;
  • Parameters

    Parameter

    Required

    Description

    IF NOT EXISTS

    No

    If IF NOT EXISTS is not specified and the materialized view already exists, an error is returned.

    project_name

    No

    The name of the MaxCompute project to which the materialized view belongs. If you do not specify this parameter, the current project is used. You can log on to the MaxCompute console, select a region in the upper-left corner, and then navigate to the Workspace > Projects page to find the project name.

    mv_name

    Yes

    The name of the materialized view that you want to create.

    days

    No

    Specifies the lifecycle of the materialized view in days. Valid values: 1 to 37231.

    BUILD DEFERRED

    No

    If you specify this keyword, MaxCompute creates only the schema of the materialized view and does not populate it with data.

    col_name

    No

    The name of a column in the materialized view.

    col_comment

    No

    The comment of a column in the materialized view.

    DISABLE REWRITE

    No

    Disables materialized view query rewrite. By default, query rewrite is enabled. You can run the ALTER MATERIALIZED VIEW [project_name.]<mv_name> DISABLE REWRITE; command to disable query rewrite or run the ALTER MATERIALIZED VIEW [project_name.]<mv_name> ENABLE REWRITE; command to enable it.

    PARTITIONED BY

    No

    Specifies the partition key columns. This clause creates a partitioned table.

    CLUSTERED BY|RANGE CLUSTERED BY

    No

    Sets the shuffle property for a clustered table.

    SORTED BY

    No

    Sets the sort property for a clustered table.

    REFRESH EVERY

    No

    Sets the scheduled refresh interval for the materialized view. Valid units: MINUTES, HOURS, and DAYS.

    number_of_buckets

    No

    Sets the number of buckets for a clustered table.

    TBLPROPERTIES

    No

    • compressionstrategy: Specifies the compression policy for data storage. Valid values: normal, high, and extreme. enable_auto_substitute: Specifies whether to query the source partitioned table if a partition does not exist in the materialized view. For more information, see materialized view query rewrite.

    • enable_auto_refresh: Optional. Set this property to true to enable automatic data refresh.

    • refresh_interval_minutes: Conditionally required. If enable_auto_refresh is set to true, you must specify this interval.

    • only_refresh_max_pt: Optional. This property applies only to partitioned materialized views. If you set this property to true, MaxCompute refreshes only the latest partition of the source table.

    select_statement

    Yes

    The SELECT statement that defines the materialized view. For more information about the format, see SELECT Syntax.

  • Examples

    • Example 1: Create a materialized view.

      1. Create and populate two tables, mf_t and mf_t1.

        CREATE TABLE IF NOT EXISTS mf_t( 
             id     bigint, 
             value   bigint, 
             name   string) 
        PARTITIONED BY (ds STRING); 
        
        ALTER TABLE mf_t ADD PARTITION (ds='1'); 
        INSERT INTO mf_t PARTITION (ds='1') VALUES (1,10,'kyle'),(2,20,'xia'); 
        SELECT * FROM mf_t WHERE ds ='1'; 
        -- The following result is returned:
        +------------+------------+------------+------------+
        | id         | value      | name       | ds         |
        +------------+------------+------------+------------+
        | 1          | 10         | kyle       | 1          |
        | 2          | 20         | xia        | 1          |
        +------------+------------+------------+------------+
        
        CREATE TABLE IF NOT EXISTS mf_t1( 
             id     bigint, 
             value   bigint, 
             name   string) 
        PARTITIONED BY (ds STRING); 
        
        ALTER TABLE mf_t1 ADD PARTITION (ds='1'); 
        INSERT INTO mf_t1 PARTITION (ds='1') VALUES (1,10,'kyle'),(3,20,'john'); 
        SELECT * FROM mf_t1 WHERE ds ='1';
        -- The following result is returned:
        +------------+------------+------------+------------+
        | id         | value      | name       | ds         |
        +------------+------------+------------+------------+
        | 1          | 10         | kyle       | 1          |
        | 3          | 20         | john       | 1          |
        +------------+------------+------------+------------+
      2. Create a materialized view.

        • Sample 1: Create a materialized view that is partitioned by the ds column.

          CREATE MATERIALIZED VIEW mf_mv LIFECYCLE 7 
          (
            key comment 'unique id',
            value comment 'input value',
            ds comment 'partitiion'
            )
          PARTITIONED BY (ds)
          AS SELECT t1.id AS key, t1.value AS value, t1.ds AS ds
               FROM mf_t AS t1 JOIN mf_t1 AS t2
                 ON t1.id = t2.id AND t1.ds=t2.ds AND t1.ds='1';
          -- Query the materialized view.
          SELECT * FROM mf_mv WHERE ds =1;
          +------------+------------+------------+
          | key        | value      | ds         |
          +------------+------------+------------+
          | 1          | 10         | 1          |
          +------------+------------+------------+
        • Sample 2: Create a non-partitioned materialized view that is clustered.

          CREATE MATERIALIZED VIEW mf_mv2 LIFECYCLE 7 
          CLUSTERED BY (key) SORTED BY (value) INTO 1024 buckets 
          AS  SELECT t1.id AS key, t1.value AS value, t1.ds AS ds 
                FROM mf_t AS t1 JOIN mf_t1 AS t2 
                  ON t1.id = t2.id AND t1.ds=t2.ds AND t1.ds='1';
        • Sample 3: Create a partitioned materialized view that is clustered.

          CREATE MATERIALIZED VIEW mf_mv3 LIFECYCLE 7 
          PARTITIONED BY (ds) 
          CLUSTERED BY (key) SORTED BY (value) INTO 1024 buckets 
          AS  SELECT t1.id AS key, t1.value AS value, t1.ds AS ds 
                FROM mf_t AS t1 JOIN mf_t1 AS t2 
                  ON t1.id = t2.id AND t1.ds=t2.ds AND t1.ds='1';
    • Example 2: Automatically creating an empty partition by refreshing a materialized view

      CREATE TABLE mf_blank_pts(id bigint ,name string) PARTITIONED BY (ds bigint); 
      ALTER TABLE mf_blank_pts ADD PARTITION (ds = 1); 
      ALTER TABLE mf_blank_pts ADD PARTITION (ds = 2); 
      INSERT INTO TABLE mf_blank_pts PARTITION(ds=1) VALUES (1,"aba"),(2,"cbd");  
       
      CREATE MATERIALIZED VIEW IF NOT EXISTS mf_mv_blank_pts PARTITIONED BY (ds) 
      AS SELECT id,name,ds FROM mf_blank_pts;
      
      ALTER MATERIALIZED VIEW mf_mv_blank_pts REBUILD PARTITION (ds>0); 
      
      SHOW PARTITIONS mf_mv_blank_pts; 
      -- The ds=2 partition in the source table is empty. Refreshing the partitions where ds > 0 creates a corresponding empty partition in the materialized view.
      ds=1
      ds=2
      
      SELECT * FROM mf_mv_blank_pts WHERE ds>0; 
      -- The result shows that the ds=2 partition is empty, while the ds=1 partition contains data.
      +------------+------------+------------+
      | id         | name       | ds         |
      +------------+------------+------------+
      | 1          | aba        | 1          |
      | 2          | cbd        | 1          |
      +------------+------------+------------+

Update a materialized view

Performing an insert, overwrite, update, or delete operation on a materialized view's source table or partition invalidates the view and makes it unusable for query rewrite. You can check the status of a materialized view and update it if it is invalid. For instructions on how to check the status, see Query materialized view information.

  • Usage notes

    • An update operation refreshes only the source tables or partitions with changed data.

    • You can enable scheduled updates to refresh the materialized view periodically. For more information, see Scheduled materialized view updates.

  • Syntax

    ALTER MATERIALIZED VIEW [<project_name>.]<mv_name>
          REBUILD [PARTITION (<ds>=max_pt(<table_name>),<expression1>...)];
  • Parameters

    Parameter

    Required

    Description

    project_name

    No

    The name of the MaxCompute project to which the materialized view belongs. If you do not specify this parameter, the current project is used. You can log on to the MaxCompute console, select a region in the upper-left corner, and then navigate to the Workspace > Projects page to find the project name.

    mv_name

    Yes

    The name of the materialized view to update.

    ds

    No

    The name of the partition field in the materialized view.

    max_pt

    No

    A function that returns the latest partition of the specified table_name. The table_name can be a table or a materialized view.

    expression

    No

    An expression that filters which partitions to update. For example, ds>='1'. Use this to update a subset of partitions.

  • Examples

    • Example 1: Update a non-partitioned materialized view.

      -- Create a non-partitioned table.
      CREATE TABLE count_test(a BIGINT, b BIGINT); 
      -- Create a non-partitioned materialized view.
      CREATE MATERIALIZED VIEW count_mv LIFECYCLE 7 AS SELECT COUNT(*) FROM count_test; 
      -- Update the non-partitioned materialized view. 
      ALTER MATERIALIZED VIEW count_mv rebuild; 
    • Example 2: Update a specific partition of a partitioned materialized view.

      ALTER MATERIALIZED VIEW mf_mv_blank_pts REBUILD PARTITION (ds='1');
    • Example 3: Update partitions of a partitioned materialized view that meet specified conditions.

      ALTER MATERIALIZED VIEW mf_mv_blank_pts REBUILD PARTITION (ds>='1', ds<='2');
    • Example 4: Update the latest partition data of a partitioned materialized view.

      -- Create a sample partitioned table.
      CREATE TABLE IF NOT EXISTS sale_detail_jt 
      (shop_name STRING , 
      customer_id STRING , 
      total_price DOUBLE ) 
      PARTITIONED BY (sale_date STRING ,region STRING );
      
      ALTER TABLE  sale_detail_jt ADD PARTITION (sale_date='2013',region='china');
      
      INSERT INTO  sale_detail_jt PARTITION (sale_date='2013',region='china') VALUES 
          ('s1','c1',100.1), 
          ('s2','c2',100.2), 
          ('s3','c3',100.3);
      
      ALTER TABLE  sale_detail_jt ADD PARTITION (sale_date='2013',region='en');
      INSERT INTO  sale_detail_jt PARTITION (sale_date='2013',region='en') VALUES 
          ('t1','c5',200.0), 
          ('t2','c6',300.0);
          
      -- View partition data.
      SELECT * FROM sale_detail_jt WHERE sale_date='2013' AND region='china';
      +-----------+-------------+-------------+-----------+--------+
      | shop_name | customer_id | total_price | sale_date | region |
      +-----------+-------------+-------------+-----------+--------+
      | s1        | c1          | 100.1       | 2013      | china  |
      | s2        | c2          | 100.2       | 2013      | china  |
      | s5        | c2          | 100.2       | 2013      | china  |
      +-----------+-------------+-------------+-----------+--------+
      -- View partition data.
      SELECT * FROM sale_detail_jt WHERE sale_date='2013' AND region='en';
      +-----------+-------------+-------------+-----------+--------+
      | shop_name | customer_id | total_price | sale_date | region |
      +-----------+-------------+-------------+-----------+--------+
      | t1        | c5          | 200.0       | 2013      | en     |
      | t2        | c6          | 300.0       | 2013      | en     |
      +-----------+-------------+-------------+-----------+--------+
      
      -- Create a materialized view.
      CREATE MATERIALIZED VIEW mv_deferred BUILD DEFERRED AS SELECT * FROM sale_detail_jt;
      
      -- Query the mv_deferred materialized view.
      SELECT * FROM mv_deferred;
      -- The following result is returned:
      +-----------+-------------+-------------+-----------+--------+
      | shop_name | customer_id | total_price | sale_date | region |
      +-----------+-------------+-------------+-----------+--------+
      +-----------+-------------+-------------+-----------+--------+
      
      -- Create a partitioned table.
      CREATE TABLE mf_part (id bigint,name string) PARTITIONED BY (dt string);
      -- Insert data.
      INSERT INTO mf_part PARTITION(dt='2013') VALUES(1,'name1'),(2,'name2');
      -- Query the data.
      SELECT * FROM mf_part WHERE dt='2013';
      -- The following result is returned:
      +------------+------+----+
      | id         | name | dt |
      +------------+------+----+
      | 1          | name1 | 2013 |
      | 2          | name2 | 2013 |
      +------------+------+----+
      
      -- Create a partitioned materialized view.
      CREATE MATERIALIZED VIEW mv_rebuild BUILD DEFERRED PARTITIONED BY (dt) AS SELECT * FROM mf_part;
      
      -- Query data from the materialized view.
      SELECT * FROM mv_rebuild WHERE dt='2013';
      -- The following result is returned:
      +------------+------+----+
      | id         | name | dt |
      +------------+------+----+
      +------------+------+----+
      -- Refresh the latest partition data.
      ALTER MATERIALIZED VIEW mv_rebuild REBUILD PARTITION(dt=max_pt('mf_part'));
      
      -- Query data from the materialized view.
      SELECT * FROM mv_rebuild WHERE dt='2013'; 
      -- The following result is returned:
      +------------+------+----+
      | id         | name | dt |
      +------------+------+----+
      | 1          | name1 | 2013 |
      | 2          | name2 | 2013 |
      +------------+------+----+

Modify the lifecycle of a materialized view

This statement modifies the lifecycle of an existing materialized view.

  • Syntax

    ALTER MATERIALIZED VIEW [<project_name>.]<mv_name> SET LIFECYCLE <days>;
  • Parameters

    Parameter

    Required

    Description

    project_name

    No

    The name of the MaxCompute project to which the materialized view belongs. If you do not specify this parameter, the current project is used. You can log on to the MaxCompute console, select a region in the upper-left corner, and then navigate to the Workspace > Projects page to find the project name.

    mv_name

    Yes

    The name of the materialized view to modify.

    days

    Yes

    The lifecycle of the materialized view, in days.

  • Example

    -- Set the lifecycle to 10 days.
    ALTER MATERIALIZED VIEW count_mv SET LIFECYCLE 10;

Enable or disable materialized view lifecycle

This statement enables or disables the lifecycle of a materialized view.

  • Syntax

    ALTER MATERIALIZED VIEW  [<project_name>.]<mv_name> [<pt_spec>] enable|disable LIFECYCLE;
  • Parameters

    Parameter

    Required

    Description

    project_name

    No

    The name of the MaxCompute project to which the materialized view belongs. If you do not specify this parameter, the current project is used. You can log on to the MaxCompute console, select a region in the upper-left corner, and then navigate to the Workspace > Projects page to find the project name.

    mv_name

    Yes

    The name of the materialized view for which to enable or disable the lifecycle.

    pt_spec

    No

    The partition of the materialized view for which to enable or disable the lifecycle. The format is (partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...), where partition_col is the partition field and partition_col_value is the partition value.

    enable|disable

    Yes

    enable enables the lifecycle. disable disables the lifecycle. If disabled, the lifecycle policy no longer applies to the partition or table.

  • Examples

    • Example 1: Enable lifecycle management for a materialized view.

      ALTER MATERIALIZED VIEW mf_mv_blank_pts  PARTITION (ds='1') enable LIFECYCLE;
    • Example 2: Disable lifecycle management for a materialized view.

      ALTER MATERIALIZED VIEW mf_mv_blank_pts  PARTITION (ds='1') disable LIFECYCLE;

Describe a materialized view

Displays information about a materialized view, including its schema and last modification time.

  • Syntax

    DESC EXTENDED [<project_name>.]<mv_name>;
  • Parameters

    Parameter

    Required

    Description

    project_name

    No

    The name of the MaxCompute project to which the materialized view belongs. If you do not specify this parameter, the current project is used. You can log on to the MaxCompute console, select a region in the upper-left corner, and then navigate to the Workspace > Projects page to find the project name.

    mv_name

    Yes

    The name of the materialized view.

  • Example

    DESC EXTENDED mv;

    The command returns the following:

    Note

    This output format requires MaxCompute client v0.43 or later. For more information, see Connect using the local client (odpscmd).

    +------------------------------------------------------------------------------------+
    | Owner:                    ALIYUN$$****@***.aliyunid.com                       |
    | Project:                  m****                                                |
    | TableComment:                                                                      |
    +------------------------------------------------------------------------------------+
    | CreateTime:               2023-05-30 13:16:07                                      |
    | LastDDLTime:              2023-05-30 13:16:07                                      |
    | LastModifiedTime:         2023-05-30 13:16:07                                      |
    +------------------------------------------------------------------------------------+
    | MaterializedView: YES                                                              |
    | ViewText: select id,name from mf_refresh                                           |
    | Rewrite Enabled: true                                                              |
    | AutoRefresh Enabled: true                                                          |
    | Refresh Interval Minutes: 10                                                       |
    +------------------------------------------------------------------------------------+
    | Native Columns:                                                                    |
    +------------------------------------------------------------------------------------+
    | Field    | Type   | Label | ExtendedLabel | Nullable | DefaultValue | Comment      |
    +------------------------------------------------------------------------------------+
    | id       | bigint |       |               | true     | NULL         |              |
    | name     | string |       |               | true     | NULL         |              |
    +------------------------------------------------------------------------------------+
    | Extended Info:                                                                     |
    +------------------------------------------------------------------------------------+
    | IsOutdated:               false                                                    |
    | TableID:                  569ec712873e44b3868e79b7a8beabab                         |
    | IsArchived:               false                                                    |
    | PhysicalSize:             1875                                                     |
    | FileNum:                  2                                                        |
    | StoredAs:                 CFile                                                    |
    | CompressionStrategy:      normal                                                   |
    | odps.timemachine.retention.days: 1                                                        |
    | ColdStorageStatus:        N/A                                                      |
    | encryption_enable:        false                                                    |
    +------------------------------------------------------------------------------------+
    | AutoRefresh History:                                                               |
    +------------------------------------------------------------------------------------+
    | InstanceId                | Status     | StartTime           | EndTime             |
    +------------------------------------------------------------------------------------+
    | 20230619070546735ghwl1****** | TERMINATED | 2023-06-19 15:05:46 | 2023-06-19 15:05:47 |
    | 20230619065545586gwllc****** | TERMINATED | 2023-06-19 14:55:45 | 2023-06-19 14:55:46 |
    | 20230619064544463gcjgom****** | TERMINATED | 2023-06-19 14:45:44 | 2023-06-19 14:45:45 |
    | 20230619063543334gzxs2d****** | TERMINATED | 2023-06-19 14:35:43 | 2023-06-19 14:35:44 |
    | 2023061906254257gi21w2****** | TERMINATED | 2023-06-19 14:25:42 | 2023-06-19 14:25:43 |
    | 20230619061540813giacg8****** | TERMINATED | 2023-06-19 14:15:41 | 2023-06-19 14:15:41 |
    | 20230619060539674gswjq9****** | TERMINATED | 2023-06-19 14:05:39 | 2023-06-19 14:05:40 |
    | 20230619055538578gvdjk****** | TERMINATED | 2023-06-19 13:55:38 | 2023-06-19 13:55:40 |
    | 20230619054537356glqdne****** | TERMINATED | 2023-06-19 13:45:37 | 2023-06-19 13:45:38 |
    | 2023061905353687gcc5pl****** | TERMINATED | 2023-06-19 13:35:36 | 2023-06-19 13:35:37 |
    +------------------------------------------------------------------------------------+

Check materialized view status

Check the status of a materialized view to monitor changes in the source table and ensure the view is valid. A materialized view can be in one of the following two states:

  • Valid

    When a query runs, MaxCompute reads data directly from the materialized view instead of the source table.

  • Invalid

    When a query runs, MaxCompute cannot read data directly from the materialized view. Instead, it reads from the source table, and query acceleration is not applied.

Use the following function to check if a materialized view is valid.

  • Syntax

    boolean materialized_view_is_valid(<mv_name>,<partition_value>);
  • Examples

    This function checks whether the data in a materialized view is consistent with the latest data in its source table. It returns true if the data is consistent and false otherwise.

    • SELECT materialized_view_is_valid("count_mv");
    • SELECT materialized_view_is_valid("mf_mv_blank_pts","1");

List materialized views

Lists all materialized views in a project, or only those matching a specific pattern.

Note

The SHOW MATERIALIZED VIEWS command requires MaxCompute client (odpscmd) version 0.43.0 or later.

  • Syntax

    -- List all materialized views in the project.
    SHOW MATERIALIZED VIEWS;
    -- List the materialized views in the project whose names match a specific pattern.
    SHOW MATERIALIZED VIEWS LIKE '<materialized_view>';
  • Example

    -- List the materialized views whose names match the 'test*' pattern. The asterisk (*) is a wildcard.
    SHOW MATERIALIZED VIEWS LIKE 'test*';         

    The command returns the following output.

    ALIYUN$account_name:test_two_mv
    ALIYUN$account_name:test_create_one_mv

Drop a materialized view

Drops an existing materialized view.

  • Syntax

    DROP MATERIALIZED VIEW [IF EXISTS] [<project_name>.]<mv_name> [purge];
  • Parameters

    Parameter

    Required

    Description

    IF EXISTS

    No

    If you omit IF EXISTS and the specified materialized view does not exist, the operation returns an error.

    project_name

    No

    The name of the MaxCompute project to which the materialized view belongs. If you do not specify this parameter, the current project is used. You can log on to the MaxCompute console, select a region in the upper-left corner, and then navigate to the Workspace > Projects page to find the project name.

    mv_name

    Yes

    The name of the materialized view to drop.

    purge

    No

    If you specify purge, the data in the materialized view is also deleted.

  • Examples

    • Drop the materialized view count_mv.

      DROP MATERIALIZED VIEW count_mv;
    • Drop the materialized view count_mv and its data.

      DROP MATERIALIZED VIEW count_mv purge;

Drop materialized view partitions

Drops one or more partitions from an existing partitioned materialized view.

  • Syntax

    ALTER MATERIALIZED VIEW [<project_name>.]<mv_name> DROP [IF EXISTS] PARTITION <pt_spec> [PARTITION <pt_spec>, PARTITION <pt_spec>....];
  • Parameters

    Parameter

    Required

    Description

    project_name

    No

    The name of the MaxCompute project to which the materialized view belongs. If you do not specify this parameter, the current project is used. You can log on to the MaxCompute console, select a region in the upper-left corner, and then navigate to the Workspace > Projects page to find the project name.

    mv_name

    Yes

    The name of the partitioned materialized view from which to drop partitions.

    IF EXISTS

    No

    If IF EXISTS is not specified and the materialized view does not exist, an error is returned.

    pt_spec

    Yes

    You must specify at least one partition to delete. The format is (partition_col1 = partition_col_value1, partition_col2 = partition_col_value2, ...), where partition_col is a partition field and partition_col_value is a partition value.

  • Examples

    • Example 1: Drop a partition from a partitioned materialized view. Sample statement:

      ALTER MATERIALIZED VIEW mf_mv_blank_pts DROP PARTITION (ds='1');
    • Example 2: Drop the partitions that meet specific conditions from a partitioned materialized view. Sample statement:

      ALTER MATERIALIZED VIEW mf_mv_blank_pts DROP PARTITION (ds>='1' AND ds<='2');

Materialized view penetration query

A partitioned materialized view may not contain data for all its partitions, especially if only the latest partitions have been refreshed. When users query data, they might not know if the data for all requested partitions is available in the materialized view. If a query requests data from a partition that has not been materialized, the system can automatically retrieve the data from the source table. This process is called a penetration query. The following figure illustrates this process.

查询透穿图示

To enable the penetration query feature for a materialized view, configure the following property:

When creating the materialized view, add "enable_auto_substitute"="true" in the tblproperties clause.

The following example shows how to use a penetration query with a materialized view.

  1. Create a partitioned materialized view that supports penetration queries.

    -- Create the source table src.
    CREATE TABLE src(id bigint,name string) PARTITIONED BY (dt string);
    -- Insert data.
    INSERT INTO src PARTITION(dt='20210101') VALUES(1,'Alex');
    INSERT INTO src PARTITION(dt='20210102') VALUES(2,'Flink');
    
    -- Create a partitioned materialized view that supports penetration queries.
    CREATE MATERIALIZED VIEW IF NOT EXISTS mv LIFECYCLE 7 
    PARTITIONED BY (dt) 
    tblproperties("enable_auto_substitute"="true") 
    AS SELECT id, name, dt FROM src;
  2. Query data from the 20210101 partition. This query hits the materialized view because the data for this partition has been materialized.

    SELECT * FROM mv WHERE dt='20210101';
  3. Query the 20210102 partition from the materialized view mv. Since this partition is not materialized, the query automatically penetrates to the source table src to retrieve the data.

    SELECT * FROM mv WHERE dt = '20210102';
    -- Because the data for the '20210102' partition is not materialized, the query is rewritten to access the source table. This is equivalent to the following statement:
    SELECT * FROM (SELECT id, name, dt FROM src WHERE dt='20210102') t;
  4. Query a range of partitions from mv. The system fetches data for non-materialized partitions from the source table, combines it with data from materialized partitions using UNION ALL, and then returns the final result.

    SELECT * FROM mv WHERE dt >= '20201230' AND dt<='20210102' AND id=5; 
    -- Because the data for the '20201230' and '20210102' partitions is not materialized, the query is rewritten to access the source table. This is equivalent to the following statement:
    SELECT * FROM
    (SELECT id, name, dt FROM src WHERE dt='20201230' OR dt='20210102'
     UNION ALL  
     SELECT * FROM mv WHERE dt='20210101'
    ) t WHERE id = 5;

Billing

Fees for materialized views consist of the following two parts:

  • Storage fees

    Materialized views use physical storage and incur storage fees.

    For both subscription and pay-as-you-go Standard Edition projects, MaxCompute charges storage fees on a pay-as-you-go basis. For more pricing information, see Storage fees.

  • Computing costs

    You incur computing costs when you create, update, or query materialized views. Performing a query rewrite on a valid materialized view also incurs computing costs.

    • Subscription Standard Edition projects do not incur separate computing costs.

    • For pay-as-you-go Standard Edition projects, computing costs depend on the SQL complexity and the amount of input data. For more information, see Pay-as-you-go for SQL jobs. Note the following:

      • The SQL statements for updating a materialized view are the same as those for creating one. If the project that contains the materialized view is bound to a subscription computing resource group, the operation uses your purchased resources and incurs no extra costs. If the project is bound to a pay-as-you-go computing resource group, the cost depends on the amount of input data and the complexity of the SQL statements. After a materialized view is refreshed, you also incur storage fees based on its actual data size.

      • When a materialized view is valid, query rewrite reads data from the materialized view. The query's input data is measured by the data scanned from the materialized view, not the source tables. When a materialized view is invalid, query rewrite is not supported, and the query directly scans the source tables. For more information about how to check the view status, see Query the status of a materialized view.

      • Due to factors such as data bloat from multi-table joins, the amount of data read from a materialized view is not always less than that from its source tables. Therefore, MaxCompute cannot guarantee that querying a materialized view is less expensive than querying its source tables.

Related documentation

To learn more about materialized view query rewrite, see Materialized view query rewrite.

To learn more about the scheduled refresh of materialized views, see Scheduled refresh of materialized views.