Execution plans-AnalyticDB(AnalyticDB)-阿里云帮助中心

The query optimizer selects the execution plan with the lowest estimated cost for each query, using statistics maintained by the database. Cost is measured in disk I/O, expressed as the number of disk page fetches. Use EXPLAIN and EXPLAIN ANALYZE to inspect and optimize query plans.

How it works

A query plan is a tree of nodes, where each node represents a single operation — a table scan, a join, an aggregation, or a sort. Read plans from the bottom up: each node feeds its output rows into the node directly above it.

Bottom nodes — table scan operations such as sequential scans, index scans, or bitmap index scans.
Middle nodes — operations such as joins, aggregations, and sorts, stacked above the scan nodes.
Top nodes — motion nodes (redistribute, broadcast, or gather) that move rows between compute nodes.

The cost of any node includes the cumulative cost of all its child nodes. The topmost node carries the estimated total execution cost — the number the optimizer minimizes. Cost does not include the time to transmit result rows to the client.

EXPLAIN output

EXPLAIN prints the query plan without executing the query. Each line in the output corresponds to one plan node and shows three estimates:

Field	Description
`cost`	Expressed as `startup_cost..total_cost`. Startup cost is the estimated cost to return the first row; total cost covers all rows. The unit is disk page fetches, where 1.0 equals one sequential disk page read.
`rows`	Estimated number of rows this node produces. For the topmost node, this approximates the rows the query returns, updates, or deletes. Filters in `WHERE` clauses reduce this below the number of rows scanned.
`width`	Estimated total size in bytes of all rows produced by this node.

Read an EXPLAIN output

EXPLAIN SELECT * FROM names WHERE name = 'Joelle';

                     QUERY PLAN
------------------------------------------------------------
Gather Motion 4:1 (slice1) (cost=0.00..20.88 rows=1 width=13)

   -> Seq Scan on 'names' (cost=0.00..20.88 rows=1 width=13)
         Filter: name::text ~~ 'Joelle'::text

Reading from the bottom up:

Seq Scan — scans the names table row by row, applying the WHERE name = 'Joelle' filter. Estimated startup cost: 0.00; total cost: 20.88; estimated rows returned: 1.
Gather Motion 4:1 — four compute nodes send their filtered rows to the coordinator node. The 4:1 notation means four senders, one receiver.

EXPLAIN ANALYZE output

EXPLAIN ANALYZE both plans and executes the query, so it shows actual timings alongside the optimizer's estimates. It reports the following additional information:

Total runtime — wall-clock time in milliseconds for the full query.
Memory per slice — executor memory used by each query slice, plus the total memory reserved for the statement.
Compute nodes involved — number of compute nodes that returned rows for each plan node operation.
Max rows — the highest row count returned by any single compute node for a given operation. If multiple nodes return the same row count, this reflects the slowest node.
Node ID — the compute node that returned the most rows.
work_mem — memory used per operation. If a node spills data to disk, the output shows both the memory used and the memory that would have prevented the spill.
Timing — time for the busiest compute node to return the first row and all rows.

Read an EXPLAIN ANALYZE output

EXPLAIN ANALYZE SELECT * FROM names WHERE name = 'Joelle';

                     QUERY PLAN
------------------------------------------------------------
Gather Motion 2:1 (slice1; segments: 2) (cost=0.00..20.88 rows=1 width=13)
Rows out: 1 rows at destination with 0.305 ms to first row, 0.537 ms to end, start offset by 0.289 ms.
        -> Seq Scan on names (cost=0.00..20.88 rows=1 width=13)
Rows out: Avg 1 rows x 2 workers. Max 1 rows (seg0) with 0.255 ms to first row, 0.486 ms to end, start offset by 0.968 ms.
                 Filter: name = 'Joelle'::text
 Slice statistics:

      (slice0) Executor memory: 135K bytes.

    (slice1) Executor memory: 151K bytes avg x 2 workers, 151K bytes max (seg0).

Statement statistics: Memory used: 128000K bytes Total runtime: 22.548 ms

Key observations:

Seq Scan — only seg0 returned rows (1 row). Time to first row: 0.255 ms; time to scan all rows: 0.486 ms.
Gather Motion — received 1 row from the compute nodes. Total operation time: 0.537 ms.
Total runtime — 22.548 ms.
Memory — slice1 used 151K bytes per worker; the full statement reserved 128,000K bytes.

Query operators

Scan operators

Scan operators retrieve rows from tables.

Operator	Description
Seq Scan	Scans all rows in a table sequentially.
Append-only Scan	Scans append-optimized row-oriented tables.
Append-only Columnar Scan	Scans append-optimized column-oriented tables.
Index Scan	Traverses a B-tree index to fetch rows from a table.
Bitmap Append-only Row-oriented Scan	Gathers row pointers from an index for an append-optimized table, then sorts them by disk location before fetching.
Dynamic Table Scan	Selects which partitions to scan at runtime using one of the following functions, contained in a Function Scan node: `gp_partition_expansion` (all partitions), `gp_partition_selection` (equality-based), or `gp_partition_inversion` (range-based). The Function Scan node passes the selected partition list to the Result node, which passes it to the Sequence node.

Join operators

Join operators combine rows from two tables.

Operator	Description	When to use
Hash Join	Builds a hash table from the smaller table using the join column as a hash key, then probes it while scanning the larger table. The `Hash Cond` field in the plan identifies the joined columns.	Default for most joins; typically the fastest option.
Nested Loop Join	Iterates through every row in the larger table and scans the smaller table for each iteration. Requires broadcasting one table to all compute nodes.	Small tables or queries constrained by an index. Avoid for large tables. Set `enable_nestloop=off` (the default) to let the optimizer prefer Hash Join instead.
Merge Join	Sorts both tables and merges them.	Pre-ordered data. Set `enable_mergejoin=on` to prefer this join type.

Motion operators

Motion operators move rows between compute nodes.

Operator	Description
Broadcast motion	Each compute node sends a full copy of its rows to every other compute node. Used for small tables. Not suitable for large tables.
Redistribute motion	Each compute node rehashes its rows and sends them to the appropriate compute node based on the hash key. Used when data is not already distributed on the join key.
Gather motion	All compute nodes send their results to the coordinator node. This is the final operation in most query plans.

Other operators

Operator	Description
Materialize	Materializes a subquery result.
InitPlan	Executes a pre-query used in dynamic partition elimination, when the optimizer does not know partition values ahead of time.
Sort	Sorts rows before operations that require ordering, such as Merge Join or aggregation.
Group By	Groups rows by one or more columns.
Group/Hash Aggregate	Aggregates rows using a hash algorithm.
Append	Concatenates row sets scanned from multiple partitions of a partitioned table.
Filter	Filters rows using the criterion from a `WHERE` clause.
Limit	Caps the number of rows returned.

Identify the active query optimizer

AnalyticDB for PostgreSQL supports two query optimizers: ORCA (also called GPORCA) and the legacy query optimizer. The end of every EXPLAIN output shows which optimizer generated the plan.

ORCA (optimizer=on):

                       QUERY PLAN
------------------------------------------------------------------------------------
 Aggregate  (cost=0.00..296.14 rows=1 width=8)
   ->  Gather Motion 2:1  (slice1; segments: 2)  (cost=0.00..295.10 rows=1 width=8)
         ->  Aggregate  (cost=0.00..294.10 rows=1 width=8)
               ->  Table Scan on part  (cost=0.00..97.69 rows=100040 width=1)
 Settings:  optimizer=on
 Optimizer status: PQO version 1.609
(5 rows)

Legacy query optimizer (optimizer=off):

                       QUERY PLAN
----------------------------------------------------------------------------------------
 Aggregate  (cost=3519.05..3519.06 rows=1 width=8)
   ->  Gather Motion 2:1  (slice1; segments: 2)  (cost=3518.99..3519.03 rows=1 width=8)
         ->  Aggregate  (cost=3518.99..3519.00 rows=1 width=8)
               ->  Seq Scan on part  (cost=0.00..3018.79 rows=100040 width=1)
 Settings:  optimizer=off
 Optimizer status: legacy query optimizer
(5 rows)

The Settings line shows the optimizer parameter value. The Optimizer status line confirms whether ORCA or the legacy optimizer generated the plan. PQO in the status line stands for the Parallel Query Optimizer, the historical name for GPORCA.