ALTER TABLE ... ADD COLUMN acquires an exclusive lock on the table. If another session holds a conflicting lock — from autovacuum or a long-running transaction — the ADD COLUMN statement waits. During that wait, every new query arriving on the same table is queued behind it, which can drive reads and writes to zero until the DDL completes or is canceled. This topic explains why this happens and provides two solutions that limit the impact on your workloads by configuring a lock wait timeout.
How lock contention occurs
ALTER TABLE ... ADD COLUMN requires an exclusive lock on the table, which conflicts with every other lock mode — including the locks held by concurrent reads and writes. The table below shows which common operations conflict with ALTER TABLE:
| Runs concurrently with | SELECT |
INSERT / UPDATE / DELETE |
VACUUM / ANALYZE |
ALTER TABLE / DROP TABLE |
|---|---|---|---|---|
SELECT |
Yes | Yes | Yes | No |
INSERT / UPDATE / DELETE |
Yes | Yes | Yes | No |
ALTER TABLE / DROP TABLE |
No | No | No | No |
The two most common causes that prevent ALTER TABLE from acquiring the lock immediately are:
-
Autovacuum: Autovacuum can block DDL operations.
-
Long-running transactions: A transaction that started before the DDL and is still open holds the table lock until it commits or rolls back.
The lock queue cascade
When ALTER TABLE cannot immediately acquire the lock, it enters a wait queue. PostgreSQL then requires every subsequent query on that table to wait behind the ALTER TABLE — even queries that would normally run concurrently with each other. A DDL statement that is itself blocked therefore becomes a barrier that blocks all traffic behind it, causing a complete TPS drop until the lock is released.
Operations that trigger a full table rewrite
Most ADD COLUMN operations complete in seconds without rewriting the table:
-
No default value: Completes within seconds in all PostgreSQL versions.
-
Non-volatile default value (for example, a constant string or integer): Completes within seconds on PostgreSQL 11 and later. The default is stored in table metadata rather than written to every row. For details, see PostgreSQL documentation.
Solutions
Solution 1: Cancel the DDL operation on lock wait timeout
Set a transaction-level lock_timeout. If the DDL cannot acquire the exclusive lock within the specified period, it is automatically canceled rather than continuing to block subsequent queries.
BEGIN;
SET LOCAL lock_timeout = 500; -- Cancel if the lock is not acquired within 500 ms.
ALTER TABLE <table_name> ADD COLUMN <column_name> VARCHAR;
COMMIT;
The lock_timeout value directly affects the trade-off between DDL success rate and workload impact. A short timeout minimizes the blocking window but may require more retries. A long timeout reduces retries but blocks the table longer on each attempt. See Stress testing results for measured TPS impact at different timeout values.
Solution 2: Cancel and automatically retry until success (recommended)
Combine a lock_timeout with an automatic retry loop. The DDL retries every second until it succeeds, so each failed attempt blocks the table for at most lock_timeout milliseconds.
DO $$
DECLARE
msg text;
BEGIN
LOOP
BEGIN
PERFORM pg_sleep(1); -- Wait 1 second before each attempt.
SET LOCAL lock_timeout = 500; -- Cancel if the lock is not acquired within 500 ms.
ALTER TABLE <table_name> ADD COLUMN <column_name> VARCHAR;
EXIT; -- Exit the loop on success.
EXCEPTION WHEN OTHERS THEN
GET STACKED DIAGNOSTICS msg = MESSAGE_TEXT;
RAISE NOTICE 'failed, error: %s', msg;
END;
END LOOP;
RAISE NOTICE 'success!';
END;
$$;
This approach is preferred over Solution 1 for production use: it eventually completes without manual intervention, and each failed attempt blocks the table for no longer than lock_timeout milliseconds.
Terminate the blocking connection (optional)
Terminating a connection disconnects it immediately. Use this approach only when the blocking session cannot be resolved by other means.
If a specific connection is blocking the DDL operation, identify and terminate it.
-
Get the process ID (PID) of the current session.
SELECT pg_backend_pid(); -
Run the DDL operation.
ALTER TABLE <table_name> ADD COLUMN <column_name> VARCHAR; -
Query the process that blocks the DDL operation.
SELECT pg_blocking_pids(<The PID obtained in step 1>); -
Terminate the blocking process. Choose one of the following:
-
Cancel the current query without disconnecting the session:
SELECT pg_cancel_backend(<pid_from_step_3>); -
Forcibly disconnect the session:
SELECT pg_terminate_backend(<pid_from_step_3>);
NoteTerminating an autovacuum process may fail. PostgreSQL protects autovacuum workers from forced termination in some configurations.
-
Choosing a lock_timeout value
lock_timeout value |
DDL behavior | Workload impact |
|---|---|---|
| Not set | Waits indefinitely until the lock is acquired or the statement times out | TPS drops to 0 for the full wait duration |
| Short (for example, 100 ms) | Canceled quickly if the lock is not available; may require many retries | Minimal impact per attempt |
| Long (for example, 500 ms) | Fewer retries needed but blocks the table longer on each attempt | Noticeable impact per attempt |
Start with 100–500 ms and adjust based on your observed retry frequency and TPS sensitivity. See Stress testing results for measured values.
Stress testing results
The following results show how lock_timeout affects TPS when lock contention occurs during ADD COLUMN. The test uses Sysbench (oltp_read_only) on a test table for 300 seconds, with long-running transactions running concurrently.
| Stage | Operation | TPS impact |
|---|---|---|
| 1 | Sysbench oltp_read_only + long-running transactions |
Baseline |
| 2 | ADD COLUMN without lock_timeout, ~30 seconds |
Drops to 0 |
| 3 | ADD COLUMN with lock_timeout = 100 ms (Solution 2), ~30 seconds |
Slightly affected |
| 4 | ADD COLUMN with lock_timeout = 500 ms (Solution 2), ~30 seconds |
Significantly affected |
| 5 | Terminate the long-running transaction (optional step) | DDL completes immediately |
Key observations:
-
Without
lock_timeout, a blockedADD COLUMNbrings TPS to 0 for the entire duration of the block. -
With
lock_timeout = 100ms, the impact is minimal because each failed attempt releases the lock quickly. -
With
lock_timeout = 500ms, the impact is more noticeable because each attempt blocks the table for up to 500 ms. -
Terminating the blocking transaction is the most direct way to unblock the DDL, but it disrupts the terminated session.
What's next
-
Performance test on ApsaraDB RDS for PostgreSQL: Run Sysbench benchmarks on your instance.
-
View standard monitoring metrics and View enhanced monitoring: Monitor transactions per second (TPS) and other performance metrics in real time.