FAQ

更新时间:
复制 MD 格式

This page covers common errors you may encounter when using the lockless change feature in Alibaba Cloud Data Management (DMS), along with their root causes and fixes.

Before you run a lockless change

Check the following before submitting a schema change job. Many errors are preventable.

Timing

  • [ ] Run during off-peak hours. Lockless change replicates data online while the source table is live — high write throughput (transactions per second, or TPS) is the leading cause of delays and failures.

Active transactions

  • [ ] No long-running transactions or large queries are open on the target table. The final step of lockless change requires a brief metadata lock (MDL) to swap tables. An open transaction blocks this step and causes a timeout.

Table structure

  • [ ] The table has a primary key or unique key.

  • [ ] The table has no foreign keys or triggers.

  • [ ] The table does not require partition changes (adding, modifying, or deleting partitions is not supported).

Binlog configuration

  • [ ] Binlog is enabled on the instance (required for PolarDB).

  • [ ] Binlog is set to ROW format, not MIXED.

  • [ ] The account used by DMS has the SUPER or REPLICATION CLIENT privilege.

DDL statement

  • [ ] The Data Definition Language (DDL) statement performs only one type of operation (for example, do not add a unique key and modify a column in the same statement).

  • [ ] The statement does not rename the table and alter it at the same time.

If the table fails any of the checks above, disable lockless change for the instance and use native DDL instead.

Errors and solutions

Excessive incremental replay latency (high TPS on source table)

Full error: The table change failed due to excessive latency in DMS incremental replay caused by the source table's high TPS (Note: Please retry during off-peak hours).

The source table is receiving too many write operations for DMS to catch up. This can happen because:

  • The schema change ran during peak hours.

  • Frequent Data Manipulation Language (DML) operations on other tables filled the instance's Binlog, saturating the DMS subscription bandwidth; data copying generating excessive logs can also contribute to this.

  • Incremental data validation created lock contention, slowing replay further.

Retry during off-peak hours. If the failure repeats, try one or more of the following:

  • Disable lockless change and use native DDL.

  • Reduce the data validation ratio in the job settings.

  • Slow down the copy speed for the source table.

Failed to acquire metadata lock after 5 attempts

Full error: Failed to acquire the metadata lock (MDL) on the source table after 5 attempts.

DMS tried 5 times to acquire a metadata lock for the final table swap but was blocked each time. The most common cause is an open long-running transaction or a large query on the source table.

Retry during off-peak hours. If the problem persists, identify and terminate the blocking transactions or queries before retrying.

Lock wait timeout during row copy

Full error: Lock wait timeout exceeded; try restarting transaction

During row copy, DMS adds a shared read lock to each batch of rows it copies from the source table. If an uncommitted transaction holds a conflicting lock on that row range, DMS cannot proceed and the task stops.

Retry during off-peak hours. If the failure repeats, find and terminate the blocking transactions.

Table has no primary key or unique key

Full error: Unsupported - Schema changes on tables without a primary key or unique key are not supported.

Lockless change requires a primary key or unique key to identify rows during the copy and replay phases. Without one, it cannot guarantee data consistency.

Disable lockless change for the instance and run the DDL as native DDL.

Native DDL lock timeout after 6 retries

Full error: Task terminated - Direct DDL execution failed! Lock wait timeout exceeded; still failed after the 6th retry

DMS fell back to native DDL execution (which acquires a table lock directly), but the lock was blocked by a large transaction, a long-running query, or a pending transaction. It retried 6 times and still could not proceed.

Retry during off-peak hours. If the problem recurs, find and resolve the source of the lock before retrying.

Binlog not enabled on PolarDB

Full error: Unsupported - You must enable the Binlog feature for PolarDB!

Lockless change relies on Binlog to capture ongoing changes to the source table. If Binlog is disabled, DMS cannot run lockless change.

Enable binary logging on the PolarDB instance. For details, see Enable binary logging.

Data validation failed

Full error: Task terminated - [Data validation failed]

DMS detected a data inconsistency between the source table and the ghost table and stopped the task to prevent data corruption. This typically happens when:

  • An incompatible DDL change caused a data transformation or conflicted with a change already in progress.

  • Repeated DML operations (insert, delete, update) on the same unique key produced a state that the replay logic could not reconcile.

Use native DDL for this schema change.

Task cancelled by user

Full error: Task terminated - Direct DDL execution failed! Statement cancelled due to client request

You manually cancelled the DDL execution. The change did not complete.

Submit the job again. Run it during off-peak hours to reduce the risk of a lock timeout.

Table has foreign keys or triggers

Full errors:

  • Unsupported - Changes on tables with foreign keys in the primary table are not supported! Temporarily disable the lockless change for the instance and use native DDL.

  • Unsupported - Changes on tables with foreign keys in the details table are not supported! Temporarily disable the lockless change for the instance and use native DDL.

  • Unsupported - Changes on tables with triggers are not supported! Temporarily disable the lockless change for the instance and use native DDL.

Lockless change creates a ghost table and swaps it with the original. Foreign key constraints and triggers on the source table interfere with this process.

Disable lockless change for the instance and use native DDL.

Primary key differs after ALTER

Full error: Task terminated - The target table after the ALTER statement does not have the same primary key!

The DDL statement deleted the primary key or unique index. Lockless change depends on a stable primary key to map rows between the source and ghost tables — if the key changes, the process cannot safely complete.

Use native DDL.

Unique key operation mixed with other operations

Full error: DDL statements that mix unique key operations with other operations are not supported.

DMS cannot add a unique key via lockless change. When it detects an ADD UNIQUE KEY mixed with other operations, it tries to fall back to native DDL — but native DDL also cannot handle the mixed statement. Neither path works.

Split the DDL into two separate statements:

  1. Run all operations except the unique key change.

  2. Run ADD UNIQUE KEY separately.

Index field too long

Full errors:

  • Init Ghost table fail: Specified key was too long (>767 bytes)

  • Specified key was too long; max key length is 767 bytes

The index field exceeds the InnoDB limit of 767 bytes per index column.

Rewrite the DDL to use a prefix index. For example: KEY(col(255)).

NULL logic error in DDL

Full error: Data truncation: Invalid use of NULL value

The DDL statement contains a logical error — for example, setting a default value of NULL for a NOT NULL column.

Correct the DDL syntax before resubmitting.

Insufficient database permissions

Full error: Access denied; you need (at least one of) the SUPER, REPLICATION CLIENT privilege(s) for this operation

The account DMS uses to connect to the instance does not have the permissions required to read Binlog.

Grant the SUPER or REPLICATION CLIENT privilege to the DMS account, or switch to a privileged account for this operation.

Binlog format is not ROW

Full error: Unsupported - Only the ROW log format is supported. Enable the ROW mode for Binlog and retry.

The Binlog format is set to MIXED, not ROW. DMS lockless change only reads ROW format Binlog events.

After switching the instance to ROW format globally, kill existing connections — they retain their session-level format setting and will keep writing MIXED format events. The safest way to clear old connections is to restart or failover the instance. Existing sessions do not pick up the global format change automatically.

MIXED format events still appearing after switching to ROW

Full error: Binlog subscription link interrupted (possible cause: high playback latency). Retry during off-peak hours! Underlying error: Unsupported, non-ROW format DML SQL detected in Binlog: \<SQL fragment\>

The global Binlog format is already ROW, but sessions that connected before the change are still writing MIXED format events. DMS only processes ROW format events and stops when it encounters MIXED format entries.

Prevent MIXED format DML events from reaching the Binlog. Restart or failover the instance to force all connections to reconnect and pick up the ROW format setting.

Unique key allows NULL values

Full error: Task terminated - The unique key after the ALTER statement contains nullable columns. This type of index can cause dirty data! Temporarily disable the lockless change for the instance and use native DDL.

Lockless change requires that all primary key and unique key columns are NOT NULL. A nullable unique key can produce duplicate effective values (multiple NULL rows), which breaks the row-mapping logic and risks dirty data.

Either use native DDL, or rewrite the DDL to define all unique key columns as NOT NULL.

Auto-increment column contains id=0

Full error: id is an auto-increment column, but data with id=0 exists.

MySQL permits id=0 in auto-increment columns, but DMS lockless change may lose rows where id=0 during the copy phase.

Use native DDL, or delete the rows where id=0 before running lockless change.

ALTER TABLE with RENAME

Full error: Unsupported - ALTER TABLE with RENAME is not supported.

Mixing a table rename with other schema changes in a single statement is not supported.

Split the operation into two separate DDL statements: rename first and then alter, or alter first and then rename.

Primary key or unique key value updated during change

Full error: Unsupported - The current increment contains a unique key or primary key change that cannot be executed! Temporarily disable the lockless change for the instance and use native DDL.

DMS uses the primary key or unique key to match rows between the source table and the ghost table. If a key value changes during the operation, the mapping breaks. This happens when:

  • An application or script actively updates primary key or unique key values.

  • The application uses REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE, which can cause MySQL to internally delete and reinsert a row with a new key value.

Use native DDL.

Partition changes not supported

Full error: Unsupported - Table partition changes (adding, modifying, or deleting partitions) are not supported.

Lockless change does not support partition operations.

Use native DDL.

Row size or key length exceeds limits

Full errors:

  • Task terminated - Init Ghost table fail. msg:Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs

  • Duplicate key name 'xxx'

  • Task terminated - Init Ghost table fail. msg:Index column size too large. The maximum column size is 767 bytes.

  • Task terminated - Init Ghost table fail. msg:Specified key was too long; max key length is 3072 bytes

  • You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1

The DDL statement contains a structural error — the resulting table would exceed MySQL's row size limit (65,535 bytes, excluding BLOBs), duplicate an existing key name, or violate index size limits (767 bytes per column, 3,072 bytes per key).

Correct the DDL statement before resubmitting. Common fixes:

  • Change oversized columns to TEXT or BLOB types.

  • Remove or rename the duplicate key.

  • Use prefix indexes for long string columns, for example: KEY(col(255)).