pg_squeeze (tablespace reorganization)

更新时间:
复制 MD 格式

Frequent UPDATE and DELETE operations cause table bloat, wasting storage and degrading query performance. The pg_squeeze extension reclaims bloated space through online table reorganization using logical decoding, with minimal write performance impact and no service interruption.

Comparison with pg_repack

pg_squeeze and pg_repack both manage table bloat but differ in mechanism and operation. Choose pg_squeeze or pg_repack based on your workload.

Item

pg_squeeze

pg_repack

Mechanism

Logical decoding: Uses a replication slot to asynchronously capture data changes from WAL.

Trigger: Creates a trigger on the source table to synchronously capture DML operations.

Impact on write performance

Minimal. Asynchronous change capture has limited impact on write performance.

Significant. Each DML fires an additional write, creating bottlenecks under high concurrency.

Resource consumption

Uses one replication slot. Slow or failed tasks can cause WAL accumulation.

Triggers increase CPU overhead under high concurrency. Requires an extra log table to record changes.

Operational complexity

Monitor replication slot status and WAL storage to prevent disk exhaustion from task failures.

Manage triggers to ensure correct operation after schema changes or migrations.

Quick start

Install, configure, and enable automatic compaction for a table.

Step 1: Configure cluster parameters

  1. Log on to the PolarDB console. On the Clusters page, find the target cluster and click its ID.

  2. In the left-side navigation pane, choose Settings and Management > Parameters.

  3. On the Parameters page, modify the following parameters:

    • wal_level: Set to logical. Required for logical decoding; slightly increases WAL generation.

    • max_replication_slots: pg_squeeze needs at least 1 + squeeze.workers_per_database replication slots per enabled database. Increase by at least 2 (e.g., from 10 to 12).

    • shared_preload_libraries: Append ,pg_squeeze to the existing value. For example, if the original value is pg_stat_statements, change it to pg_stat_statements,pg_squeeze.

  4. Restart the cluster for the changes to take effect.

Step 2: Install and start the extension

Enable pg_squeeze in the target database.

  1. Connect to the target database.

  2. Run the following SQL statements to create the extension and start the scheduler worker. This process periodically checks for and initiates compaction tasks based on the configuration in squeeze.tables.

    -- Create the extension in the target database
    CREATE EXTENSION pg_squeeze;
    
    -- Start the background scheduler process
    -- This operation only needs to be performed once for each database where pg_squeeze will be used
    SELECT squeeze.start_worker();
  3. (Optional) Verify the background process started. A row with application_name = squeeze scheduler confirms success.

    SELECT * FROM pg_stat_activity WHERE application_name LIKE 'squeeze%';

Step 3: Configure tables to process

Register tables for pg_squeeze and define their compaction schedule.

  1. Ensure the table has a replica identity: pg_squeeze uses this to match rows between old and new tables. A primary key serves as the default replica identity.

    • Check: Run \d+ your_table_name and verify replica identity is DEFAULT or FULL.

    • Set: If the table has no primary key but has a unique index (e.g., your_table_uidx), set it as the replica identity:

      ALTER TABLE your_table_name REPLICA IDENTITY USING INDEX your_table_uidx;
  2. Register the table in squeeze.tables: Insert table information and scheduling policy into squeeze.tables. The schedule field uses a cron-like format.

    -- Example 1: Check the public.foo table every day at 2:30 AM
    INSERT INTO squeeze.tables (tabschema, tabname, schedule)
    VALUES ('public', 'foo', ROW(ARRAY[30], ARRAY[2], NULL, NULL, NULL)::squeeze.schedule);
    
    -- Example 2: Check the public.bar table every Sunday at 4:00 AM and sort by its primary key after compaction
    INSERT INTO squeeze.tables (tabschema, tabname, schedule, clustering_index)
    VALUES ('public', 'bar', ROW(ARRAY[0], ARRAY[4], NULL, NULL, ARRAY[0])::squeeze.schedule, 'bar_pkey');

    The structure of the schedule field is (minutes, hours, days of month, months, days of week). NULL represents any value. All available fields are listed in Appendix: squeeze.tables configuration table.

Step 4: Monitor and verify

Monitor compaction task status through pg_squeeze views.

-- View active compaction tasks in the current database
SELECT * FROM squeeze.get_active_workers();

-- View the 10 most recent successful compaction history records
SELECT * FROM squeeze.log ORDER BY finished DESC LIMIT 10;

-- View failed compaction records and their causes
SELECT * FROM squeeze.errors;

Step 5: (Optional) Disable scheduled compaction

To stop scheduled compaction for a table, delete its row from squeeze.tables.

DELETE FROM squeeze.tables WHERE tabschema = 'public' AND tabname = 'foo';

Manual compaction

Trigger one-time compaction for any table without registering it in squeeze.tables.

Call squeeze.squeeze_table() to start a manual compaction.

-- Initiate a default compaction task for the public.pgbench_accounts table
SELECT squeeze.squeeze_table('public', 'pgbench_accounts');

-- Initiate a compaction task and physically sort the table by its primary key (pgbench_accounts_pkey) upon completion
SELECT squeeze.squeeze_table('public', 'pgbench_accounts', 'pgbench_accounts_pkey');
Note

squeeze.squeeze_table is non-transactional — it starts a background worker and returns immediately. Rolling back the calling transaction does not stop an already-started compaction.

Advanced configuration

Tune concurrency and lock behavior

Set squeeze.max_xlock_time to limit how long the exclusive lock is held during the final compaction phase, preventing long blocks on business queries.

-- Set the maximum exclusive lock holding time to 100 milliseconds
SET squeeze.max_xlock_time TO 100;

If the lock exceeds this limit, pg_squeeze releases it, processes pending changes, and retries. Repeated timeouts cause the task to fail. Schedule compaction during off-peak hours, or increase this value if timeouts persist.

Configure concurrent workers

Set squeeze.workers_per_database greater than 1 to compact multiple tables in parallel within a single database.

-- Allow a maximum of 2 concurrent compaction worker processes per database
SET squeeze.workers_per_database = 2;
Note

This applies to all databases using pg_squeeze. Total workers (including the scheduler) cannot exceed max_worker_processes.

Appendix: squeeze.tables configuration table

Register tables and define pg_squeeze compaction policies in squeeze.tables.

Parameter

Type

Description

Default

Recommendations

tabschema

name

The schema name of the table.

-

-

tabname

name

The name of the table.

-

-

schedule

squeeze.schedule

The schedule for task checks, in the format (minutes, hours, days of month, months, days of week).

-

Schedule during off-peak hours. Example for 2:30 AM daily: ROW(ARRAY[30], ARRAY[2], NULL, NULL, NULL)::squeeze.schedule.

free_space_extra

integer

The additional bloat percentage required to trigger compaction (beyond the space reserved by fillfactor).

50

For write-intensive OLTP tables, lower to 30 to reclaim space more frequently.

min_size

integer

The minimum table size in MB. Tables smaller than this will not be processed.

8

Keep default to skip small tables.

vacuum_max_age

interval

The maximum time since the last VACUUM for which the free space map (FSM) is considered valid.

'1 hour'

Keep default. After this interval, pg_squeeze performs deeper analysis before compacting.

max_retry

integer

The maximum number of retries allowed if the initial compaction fails.

0

Increase (e.g., to 2) if compaction frequently fails due to DDL changes.

clustering_index

name

Specifies an existing index. After compaction, the data in the table will be physically sorted according to the key of this index, similar to the CLUSTER command.

NULL

For range queries, set a primary key or core index as the clustering index to improve performance.

rel_tablespace

name

Specifies the target tablespace to which the table will be moved after compaction.

NULL

Used for tablespace migration scenarios. If NULL, the table remains in its original tablespace.

ind_tablespaces

name[]

A two-dimensional array used to specify tablespace mappings for indexes.

NULL

Used for index tablespace migration.

skip_analyze

boolean

Specifies whether to skip the ANALYZE command after compaction.

false

Keep false. Compaction changes data distribution, and ANALYZE updates optimizer statistics.