Enable the spatio-temporal parallel query feature

更新时间:
复制 MD 格式

Large-scale spatio-temporal queries on GanosBase can be slow when processed sequentially. By enabling PostgreSQL's parallel query engine, GanosBase distributes table scans across multiple worker processes, reducing query time on large datasets.

How parallel query works

Parallel query operates at the table level. When the query planner selects a parallel plan, it spawns worker processes to scan different portions of the table simultaneously. A Gather node collects and merges the results from all workers.

The following EXPLAIN output shows a typical parallel query plan:

EXPLAIN SELECT * FROM spatial_table WHERE ST_Within(geom, ST_MakeEnvelope(...));
Gather  (cost=1000.00..85000.00 rows=100 width=200)
  Workers Planned: 2
  ->  Parallel Seq Scan on spatial_table  (cost=0.00..84000.00 rows=42 width=200)
        Filter: ST_Within(geom, ...)

A Gather or Gather Merge node in the plan confirms that parallel execution is active. If the plan shows only a sequential scan, parallel query is not running—see Verify parallel query for troubleshooting steps.

Prerequisites

Before you begin, make sure you have:

  • An ApsaraDB RDS for PostgreSQL instance with the GanosBase extension enabled

  • Permission to modify postgresql.conf and to run ALTER TABLE and ALTER FUNCTION

Configure parallel query

Step 1: Set parameters in postgresql.conf

Configure the following parameters in postgresql.conf:

ParameterDescriptionValid rangeConstraint
max_parallel_workersTotal number of parallel workers available to the instance8–32Must be less than max_worker_processes
max_parallel_workers_per_gatherMaximum number of workers per Gather node2–4Must be less than max_parallel_workers
force_parallel_modeForce the planner to generate a parallel planon / offUse for testing only; remove in production

To set the number of workers for a specific table, run:

ALTER TABLE table_name SET (parallel_workers = n);

Set n to a value within the range defined by max_parallel_workers_per_gather.

Step 2: Increase the cost of GanosBase functions

After a GanosBase module extension is created, its functions have a low default cost. When a table has few rows, the planner treats a sequential scan as cheaper than a parallel plan and skips parallelism.

If a function is computing-intensive and parallel query is suitable for it, increase the function cost so the planner selects a parallel plan:

ALTER FUNCTION function_name COST new_cost;
Note

Increasing the cost tells the planner that the function is expensive enough to justify spawning workers.

Verify parallel query

After configuring the parameters, confirm that the planner generates a parallel plan:

EXPLAIN SELECT * FROM your_spatial_table WHERE <spatio-temporal_condition>;

Look for a Gather or Gather Merge node in the output. If the plan shows only a sequential scan, check the following:

  • Confirm max_parallel_workers and max_parallel_workers_per_gather are set correctly in postgresql.conf and that the instance has reloaded the configuration.

  • If the table is small, the planner may still prefer a sequential scan. Use force_parallel_mode = on temporarily to verify that the function can run in parallel, then remove it from the configuration.

  • To see how work is distributed across workers, run EXPLAIN (ANALYZE, VERBOSE) to display per-worker execution statistics.

Usage notes

Apply these settings based on your workload profile:

  • High CPU load: If the server CPU is already under heavy load, set max_parallel_workers_per_gather to 2 to limit worker count and reduce CPU pressure.

  • High concurrency with limited memory: Make sure that concurrent_requests × parallel_workers × work_mem does not exceed 60% of total server memory. The minimum value for work_mem is 64 KB.

  • Small tables: If a GanosBase module extension table has few rows, the query planner disables parallel query by default. For computing-intensive functions on small tables, increase the function cost as described in Step 2.