Dynamic Table Query Rewrite-Hologres(Hologres)-阿里云帮助中心

Query rewrite

In big data and data warehousing scenarios, detail tables are often massive, containing hundreds of millions or tens of billions of rows. Business and analytical queries heavily rely on multi-dimensional GROUP BY aggregations on these tables, such as calculating daily or hourly order volumes and GMV by city, or tracking PV/UV and conversion rates by channel or device. Running these aggregations directly on a detail table leads to several problems:

High aggregation costs: Each query scans and aggregates large portions or the entire detail table, consuming significant CPU and I/O resources.
Heavy load on the detail table: This can affect other tasks in the same database and may require frequent scaling.

Hologres Dynamic Tables provide query rewrite capabilities. If a base table has been pre-aggregated by a Dynamic Table, the optimizer can automatically rewrite an aggregation query on the base table to query the Dynamic Table instead, if certain conditions are met. This approach skips the expensive aggregation computation and offers several key benefits:

Reduces aggregation load on detail tables: Frequently used metrics such as order counts, GMV, and PV/UV can be read directly from the pre-aggregated results in a Dynamic Table, minimizing repetitive scans and aggregations on the detail table.
Significantly improves query response times: For reports, self-service analytics, and interactive queries that hit a Dynamic Table, aggregation is substantially reduced, lowering latency. The experience is similar to querying a wide table.
Transparent to upstream users: Data analysts and application developers can continue to query the base table without any changes. The data warehouse or platform team can design and maintain Dynamic Tables, making performance optimizations invisible to users.

Hologres Dynamic Table query rewrite is well-suited for the following use cases:

Real-time and near-real-time operational dashboards and monitoring.
Multi-dimensional BI analysis and self-service data access.
Acceleration for a unified core metrics system, such as GMV, order volume, and active user counts.

Usage and limitations

Version requirement: This feature requires Hologres V4.1 or later.
Query consistency: Query rewrite uses the data from the most recent refresh of the Dynamic Table. This means the data lags behind the latest state of the base table, resulting in weak consistency.
Base table limitations:
- Supported base table types: Hologres internal tables, Paimon foreign tables (created as Foreign Tables), and MaxCompute foreign tables (created as Foreign Tables).
- If the base table is a Hologres partitioned table, physical partitions are not supported. However, the base table can use logical partitions.
- External Tables are not supported.
Dynamic Table type limitations:
- Supported: non-partitioned Dynamic Tables and logically partitioned Dynamic Tables.
- Not supported: physically partitioned Dynamic Tables and External Dynamic Tables.
Limitations on the query definition in a Dynamic Table:
- Only single-table queries are currently supported.
- Aggregate functions with a FILTER clause, such as sum(x) FILTER (WHERE ...), are not supported.
- The query cannot compute new columns from aggregation results in the SELECT list, such as sum(x)/count(x).

Enable and configure query rewrite

Recommendation : This feature is suitable for dashboards, monitoring, and analytics scenarios that can tolerate latency of seconds to minutes. For scenarios requiring strong real-time guarantees or strict data reconciliation, we recommend querying the base table directly or using other strong consistency solutions.

Enable query rewrite

When you query a base table, set the hg_enable_query_rewrite GUC parameter to control whether the query can use query rewrite.

Note

We do not recommend enabling this feature at the database level, as it may cause performance degradation.

-- Enable query rewrite (session level)
SET hg_enable_query_rewrite = on;
-- Set at the database level (not recommended)
ALTER DATABASE <db_name> SET hg_enable_query_rewrite = on;

Enable query rewrite for a Dynamic Table

When you create a Dynamic Table, use the allowed_to_rewrite_query property to control whether it can be used for query rewrite. By default, this property is set to 'false', and the table is not used for query rewrite.

CREATE [ OR REPLACE ] DYNAMIC TABLE [ IF NOT EXISTS ] [<schema_name>.]<table_name> (
  [col_name],
  [col_name],
  [col_name]
)
[LOGICAL PARTITION BY LIST(<partition_key>)]
WITH (
  ...,
  allowed_to_rewrite_query = '[true | false]',
  ...
)
AS
<query>;

Parameter description:

allowed_to_rewrite_query: Specifies whether this Dynamic Table is a candidate for query rewrite.
- 'true': Allows the table to be used for query rewrite.
- 'false': The default value. The table is not used for query rewrite.

Recommendations:

For Dynamic Tables created specifically to accelerate aggregation queries, set this property to 'true'.
For Dynamic Tables with complex definitions that cannot be used by the current rewrite rules, set this property to 'false' to reduce unnecessary optimizer overhead.

Modify the query rewrite property

You can use ALTER DYNAMIC TABLE ... SET to change whether a Dynamic Table can be used for query rewrite:

ALTER DYNAMIC TABLE [IF EXISTS] [<schema_name>.]<table_name>
SET (allowed_to_rewrite_query = '[true | false]');

Control candidate Dynamic Tables

If multiple Dynamic Tables are available, you can specify a set of candidate tables in a hint to narrow the optimizer's search scope and control priority. For more information on hint syntax, see HINT.

SELECT /*+HINT query_rewrite_candidates(<schema.dt_name1> <schema.dt_name2> ...) */
  ...
FROM ...;

Usage notes:

If there are multiple Dynamic Tables, separate them with spaces.
You can include the schema name.

Example:

-- Only allow dt_sales to be used for query rewrite
SELECT /*+HINT query_rewrite_candidates(dt_sales) */
  day, hour, min(amount), max(amount)
FROM base_sales_table
GROUP BY day, hour;

Supported features

The current version supports query rewrite for single-table aggregation in three main patterns:

Transparent rewrite with matching aggregation dimensions.
Rollup aggregation (aggregating on a subset of the Dynamic Table's dimensions).
Rollup aggregation with filter conditions.

Matching aggregation dimensions

Conditions:

The GROUP BY dimensions in the query exactly match the GROUP BY dimensions in the Dynamic Table definition.
The aggregate functions in the query can be satisfied by the aggregated columns in the Dynamic Table.
Any type of aggregate function, including DISTINCT, is supported, if a corresponding result column exists in the Dynamic Table.

Example:

-- Create the base table
CREATE TABLE base_sales_table(
    day text not null,
    hour int,
    amount int
);
-- Insert data
INSERT INTO base_sales_table
VALUES  ('20250529', 12, 1),
        ('20250529', 12, 2),
        ('20250529', 12, 2),
        ('20250529', 13, 3),
        ('20250530', 13, 4),
        ('20250530', 14, 5),
        ('20250531', 14, 6);
-- Create the Dynamic Table
CREATE DYNAMIC TABLE dt_sales
WITH (
    freshness = '1 minutes',
    auto_refresh_mode='incremental',
    auto_refresh_enable='false',
    allowed_to_rewrite_query='true'
)
AS
SELECT
    day,
    hour,
    min(amount),
    max(amount),
    sum(amount),
    count(amount),
    count(*) as rows,
    count(1) as rows1,
    count(distinct amount) as cd
FROM base_sales_table
GROUP BY day, hour;
-- Manually refresh the Dynamic Table
REFRESH TABLE dt_sales;

Query example: When the aggregation dimensions match, the execution plan shows that the query on the base table is rewritten to query the Dynamic Table.

-- The query dimensions match the Dynamic Table
EXPLAIN SELECT day, hour, min(amount), max(amount) FROM base_sales_table GROUP BY day, hour;

The returned execution plan shows that the query has been rewritten to perform a sequential scan on the Dynamic Table dt_sales:

QUERY PLAN
Gather  (cost=0.00..5.00 rows=7 width=20)
  -> Local Gather  (cost=0.00..5.00 rows=7 width=20)
    -> Project  (cost=0.00..5.00 rows=7 width=20)
      -> Seq Scan on dt_sales  (cost=0.00..5.00 rows=5 width=20)
Query Queue: init_warehouse.default_queue
Optimizer: HQO version 4.1.0

Rollup aggregation

Conditions:

The GROUP BY dimensions of the Dynamic Table are a superset of the GROUP BY dimensions in the query. In other words, the Dynamic Table is aggregated at a finer granularity.
The aggregate functions in the query can be derived by aggregating the columns in the Dynamic Table.
Supported aggregate functions: min, max, count, sum, and avg.
Rollup aggregation with DISTINCT is not supported, except in the matching dimensions scenario where the result can be read directly from the Dynamic Table.

Aggregate function mapping:

Original aggregate function	Required aggregate column	Rewritten aggregate function
`sum(x)`	`sum(x)`	`sum(sum)`
`count(x)`	`count(x)`	`sum(count)`
`min(x)`	`min(x)`	`min(min)`
`max(x)`	`max(x)`	`max(max)`
`avg(x)`	`sum(x)` and `count(x)`	`sum(sum) / sum(count)`

Example:

-- Create the base table
CREATE TABLE base_sales_table(
    day text not null,
    hour int,
    amount int
);
-- Insert data
INSERT INTO base_sales_table
VALUES  ('20250529', 12, 1),
        ('20250529', 12, 2),
        ('20250529', 12, 2),
        ('20250529', 13, 3),
        ('20250530', 13, 4),
        ('20250530', 14, 5),
        ('20250531', 14, 6);
-- Create the Dynamic Table. To verify the rewrite, we first disable automatic refresh.
CREATE DYNAMIC TABLE dt_sales
WITH (
    freshness = '1 minutes',
    auto_refresh_mode='incremental',
    auto_refresh_enable='false',
    allowed_to_rewrite_query='true'
)
AS
SELECT
    day,
    hour,
    min(amount),
    max(amount),
    sum(amount),
    count(amount),
    count(*) as rows,
    count(1) as rows1,
    count(distinct amount) as cd
FROM base_sales_table
GROUP BY day, hour;
-- Manually refresh the Dynamic Table
REFRESH TABLE dt_sales;

Query example 1: Aggregate by day (rollup). The GROUP BY column in the query is a subset of the dimensions in the Dynamic Table definition, so the query can be rewritten.

-- Original query
EXPLAIN SELECT day, min(amount), max(amount)
FROM base_sales_table
GROUP BY day;

The execution plan shows that the lowest-level scan operator is Seq Scan on dt_sales, indicating the query reads from the Dynamic Table instead of the base table base_sales_table.

QUERY PLAN
Gather  (cost=0.00..5.00 rows=4 width=16)
  ->  HashAggregate  (cost=0.00..5.00 rows=4 width=16)
        Group Key: day
        ->  Redistribution  (cost=0.00..5.00 rows=5 width=16)
              Hash Key: day
              ->  Local Gather  (cost=0.00..5.00 rows=5 width=16)
                    ->  Seq Scan on dt_sales  (cost=0.00..5.00 rows=5 width=16)
Query Queue: init_warehouse.default_queue
Optimizer: HQO version 4.1.0

Query example 2: sum + count + avg rollup. The base table query uses avg. Because the Dynamic Table contains pre-computed sum and count values, avg can be derived, and the query is rewritten.

-- Original query
EXPLAIN SELECT day, sum(amount), count(amount), avg(amount)
FROM base_sales_table
GROUP BY day;

The execution plan shows that the query has been rewritten to scan the Dynamic Table dt_sales. Note that the underlying operator is Seq Scan on dt_sales, not base_sales_table.

QUERY PLAN
Gather  (cost=0.00..5.00 rows=4 width=32)
  -> Project  (cost=0.00..5.00 rows=4 width=32)
    -> Project  (cost=0.00..5.00 rows=4 width=40)
      -> HashAggregate  (cost=0.00..5.00 rows=4 width=24)
            Group Key: day
            -> Redistribution  (cost=0.00..5.00 rows=5 width=24)
                  Hash Key: day
                  -> Local Gather  (cost=0.00..5.00 rows=5 width=24)
                        -> Seq Scan on dt_sales  (cost=0.00..5.00 rows=5 width=24)
Query Queue: init_warehouse.default.queue
Optimizer: HQO version 4.1.0

Rollup aggregation with filter conditions

Conditions:

The query on the base table includes a WHERE clause, while the Dynamic Table definition cannot have a WHERE clause.
All columns used in the WHERE clause must be part of the Dynamic Table's GROUP BY dimensions.
Supported aggregate functions are min, max, count, sum, and avg. DISTINCT is not supported.

Example:

-- Create the base table
CREATE TABLE base_sales_table(
    day text not null,
    hour int,
    amount int
);
-- Insert data
INSERT INTO base_sales_table
VALUES  ('20250529', 12, 1),
        ('20250529', 12, 2),
        ('20250529', 12, 2),
        ('20250529', 13, 3),
        ('20250530', 13, 4),
        ('20250530', 14, 5),
        ('20250531', 14, 6);
-- Create the Dynamic Table. To verify the rewrite, we first disable automatic refresh.
CREATE DYNAMIC TABLE dt_sales
WITH (
    freshness = '1 minutes',
    auto_refresh_mode='incremental',
    auto_refresh_enable='false',
    allowed_to_rewrite_query='true'
)
AS
SELECT
    day,
    hour,
    min(amount),
    max(amount),
    sum(amount),
    count(amount),
    count(*) as rows,
    count(1) as rows1,
    count(distinct amount) as cd
FROM base_sales_table
GROUP BY day, hour;
-- Manually refresh the Dynamic Table
REFRESH TABLE dt_sales;

In the following example, the query on the base table has a WHERE clause, and the filter columns are part of the GROUP BY key. Therefore, the query can be rewritten.

EXPLAIN SELECT day, sum(amount), count(amount), avg(amount)
FROM base_sales_table
WHERE day > '20250528' AND day <= '20250531'
GROUP BY day;

Running this EXPLAIN statement returns a query plan where the Seq Scan on dt_sales operator confirms that the query has been rewritten to scan the Dynamic Table instead of the base table. The filter condition matches the original WHERE clause.

QUERY PLAN
Gather  (cost=0.00..5.00 rows=1 width=32)
  ->  Project  (cost=0.00..5.00 rows=1 width=32)
    ->  Project  (cost=0.00..5.00 rows=1 width=40)
      ->  HashAggregate  (cost=0.00..5.00 rows=1 width=24)
            Group Key: day
        ->  Redistribution  (cost=0.00..5.00 rows=1 width=24)
              Hash Key: day
          ->  Local Gather  (cost=0.00..5.00 rows=1 width=24)
            ->  Seq Scan on dt_sales  (cost=0.00..5.00 rows=1 width=24)
                  Filter: ((day > '20250528'::text) AND (day <= '20250531'::text))
                  RowGroupFilter: ((day > '20250528'::text) AND (day <= '20250531'::text))
Query Queue: init_warehouse.default_queue
Optimizer: HQO version 4.1.0

Check the query rewrite status

After enabling query rewrite, you can verify if a query used a Dynamic Table in the following ways:

Check the execution plan: In the EXPLAIN output, check the table name in the Scan operator to determine if the query used a Dynamic Table.
Check the slow query log: In the hologres.hg_query_log table, the extended_info field records which Dynamic Table was used for the rewrite. If a rewrite fails, this field includes the failure reason.

select extended_info::json->>'rewrite_query_info' from hologres.hg_query_log where query_id = 'xxxxx';
                                                                                                               ?column?                                                                                                               
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 {"rewrite_failed_dt": [[\"public.dt3\", {\"rewrite_failed_cause\": \"Doesn't include all query required output columns\"}]], \"rewrite_succeeded_and_selected_dt\": [\"public.dt2\"], \"rewrite_succeeded_but_not_selected_dt\": [\"public.dt1\"]}
(1 row)

Examples

Example 1: Hologres internal table

The base table is the 100 GB lineitem table from the TPC-H dataset. For table creation and data import instructions, see Import Public Datasets with One Click. In this example, the Dynamic Table uses incremental refresh and is not partitioned.

CREATE DYNAMIC TABLE dt_lineitem_100g_incremental
WITH (
    freshness = '10 minutes',
    auto_refresh_mode='incremental',
    auto_refresh_enable='false',
    allowed_to_rewrite_query='true')
AS 
select
        l_returnflag,
        l_linestatus,
        l_shipdate,
        sum(l_quantity) as sum_qty,
        count(*) as count_order
from
        hologres_dataset_tpch_100g.lineitem
group by
        l_returnflag,
        l_linestatus,
        l_shipdate;
-- Manually refresh
REFRESH DYNAMIC TABLE dt_lineitem_100g_incremental;

Query the base table:

set hg_enable_query_rewrite = on;
explain 
select
        l_returnflag,
        l_linestatus,
        l_shipdate,
        sum(l_quantity) as sum_qty,
        count(*) as count_order
from
        hologres_dataset_tpch_100g.lineitem
where  l_shipdate =  '1998-12-01' 
group by
        l_returnflag,
        l_linestatus,
        l_shipdate;

The execution plan shows that the query has been rewritten to query the Dynamic Table:

QUERY PLAN
Gather  (cost=0.00..5.00 rows=1 width=22)
  -> Local Gather  (cost=0.00..5.00 rows=1 width=22)
    -> Project  (cost=0.00..5.00 rows=1 width=22)
      -> Seq Scan on dt_lineitem_100g_incremental  (cost=0.00..5.00 rows=1 width=20)
            Filter: (l_shipdate = '1998-12-01'::date)
            RowGroupFilter: (l_shipdate = '1998-12-01'::date)
Query Queue: init_warehouse.default_queue
Optimizer: HQO version 4.1.0

The result of querying the base table is as follows:

 l_returnflag | l_linestatus | l_shipdate | sum_qty  | count_order 
--------------+--------------+------------+----------+-------------
 N            | O            | 1998-12-01 | 52841.00 |        2070
(1 row)

Querying the Dynamic Table directly returns the same result, which is consistent with the last manual refresh (automatic refresh was disabled for this example).

select
        l_returnflag,
        l_linestatus,
        l_shipdate,
        sum_qty,
        count_order
from
        dt_lineitem_100g_incremental
where  l_shipdate =  '1998-12-01' ;
 l_returnflag | l_linestatus | l_shipdate | sum_qty  | count_order 
--------------+--------------+------------+----------+-------------
 N            | O            | 1998-12-01 | 52841.00 |        2070
(1 row)

Example 2: Paimon foreign table

Query rewrite can also be used when the base table is a Paimon foreign table. Follow these steps:

Prepare a Paimon table: In this example, import the TPC-H 100 GB customer table into Paimon. For more information, see Paimon Table.
Create a Paimon foreign table in Hologres: You must create the table as a Foreign Table. For details, see Access Paimon Data via DLF Catalog.

-- Create a foreign server
CREATE SERVER IF NOT EXISTS paimon_server FOREIGN DATA WRAPPER dlf_fdw OPTIONS (
    catalog_type 'paimon',
    metastore_type 'dlf-rest', 
    dlf_catalog '<dlf_catalog_name>'
);
-- Use IMPORT FOREIGN SCHEMA to create the Paimon foreign table
IMPORT FOREIGN SCHEMA <schema_name>
limit to (customer) 
FROM SERVER paimon_server into public
options (if_table_exist 'update');
-- Query data
SELECT * FROM customer;

3. Create a Dynamic Table to incrementally consume the Paimon foreign table: In Hologres, create a Dynamic Table that incrementally refreshes from the Paimon foreign table. For this example, automatic refresh is disabled to make it easier to verify the rewrite.

-- Create the Dynamic Table
CREATE DYNAMIC TABLE dt_paimon_customer
WITH (
    freshness = '10 minutes',
    auto_refresh_mode='incremental',
    auto_refresh_enable='false',
    allowed_to_rewrite_query='true')
AS
SELECT
     c_custkey,
     avg(c_acctbal) ,
     sum(c_acctbal) ,
     count(c_acctbal)
FROM customer
group by c_custkey;
-- Manually refresh the Dynamic Table
REFRESH DYNAMIC TABLE dt_paimon_customer;

4. Query the Paimon foreign table with query rewrite enabled.

set hg_enable_query_rewrite = on;
SELECT
     c_custkey,
     avg(c_acctbal) ,
     sum(c_acctbal) ,
     count(c_acctbal)
FROM
   customer
group by c_custkey ORDER BY 3 DESC  LIMIT 3;
c_custkey	| avg	        |sum	    |count
----------|-------------|---------|-----
3605586	  |9999.990000  |	9999.99	|1
10705496	|9999.990000	|9999.99	|1
14959900	|9999.990000	|9999.99	|1

5. Query the Dynamic Table: The result reflects the data from the most recent refresh.

SELECT * FROM dt_paimon_customer ORDER BY 3 DESC  LIMIT 3;
c_custkey	| avg	        |sum	    |count
----------|-------------|---------|-----
3605586	  |9999.990000  |	9999.99	|1
10705496	|9999.990000	|9999.99	|1
14959900	|9999.990000	|9999.99	|1

6. Confirm with the execution plan: The plan shows the query was rewritten to access the Dynamic Table.

QUERY PLAN
Limit  (cost=0.00..5.07 rows=1 width=28)
  ->  Project  (cost=0.00..5.07 rows=1 width=28)
    ->  Limit  (cost=0.00..5.07 rows=1 width=36)
      ->  Sort  (cost=0.00..5.07 rows=1 width=36)
            Sort Key: sum DESC
            ->  Gather  (cost=0.00..5.07 rows=1 width=36)
              ->  Local Gather  (cost=0.00..5.07 rows=1 width=36)
                ->  Limit  (cost=0.00..5.07 rows=1 width=36)
                  ->  Sort  (cost=0.00..5.07 rows=1 width=36)
                        Sort Key: sum DESC
                        ->  Project  (cost=0.00..5.07 rows=1 width=36)
                          ->  Project  (cost=0.00..5.07 rows=1 width=20)
                            ->  Seq Scan on dt_paimon_customer  (cost=0.00..5.07 rows=15000000 width=17)
Query Queue: init_warehouse.default_queue
Optimizer: HQO version 4.1.0