Data Transmission Service (DTS) supports migrating data from PostgreSQL databases, such as self-managed PostgreSQL databases or ApsaraDB RDS for PostgreSQL instances, to ApsaraDB for SelectDB for large-scale data analytics. This topic walks you through the full migration workflow, covering schema migration, full data migration, and optional incremental data migration.
Prerequisites
Before you begin, make sure you have:
-
A destination ApsaraDB for SelectDB instance with more storage space than the source RDS PostgreSQL instance. See Create an instance.
-
A privileged database account on the source RDS PostgreSQL instance that owns the database to be migrated. See Create an account and Create a database.
-
A database account on the destination SelectDB instance with the following permissions: Usage_priv, Select_priv, Load_priv, Alter_priv, Create_priv, and Drop_priv. See Cluster Permission Management and Basic Permission Management.
Choose a migration type
DTS supports two migration strategies. Use the following guidance to choose the right one before you configure the task.
| Migration type | When to use | Billing |
|---|---|---|
| Schema migration + full data migration | One-time migration with acceptable downtime. Stop writes to the source before migrating. | Free |
| Schema migration + full migration + incremental migration | Zero-downtime migration. DTS keeps the destination in sync while the source continues to receive writes. | Incremental migration is charged. See Billing overview. |
If you select incremental data migration, do not write new data to the source instance during the migration period, as this may cause data inconsistency.
Limitations
Review all limitations that apply to your scenario before you start.
Source database requirements
-
Bandwidth: The server that hosts the source database must have sufficient outbound bandwidth. Otherwise, the data migration speed is affected.
-
Tables with primary keys or UNIQUE constraints: Ensure that the table fields are unique. Otherwise, duplicate data may exist in the destination database.
If the destination table that receives data is not created by DTS (that is, Schema Migration is not selected), ensure that the destination table has the same primary key or non-null UNIQUE constraint as the source table. Otherwise, duplicate data may appear in the destination database.
-
Tables with no primary key or UNIQUE constraint: Select Schema Migration when configuring the task and set the table engine to duplicate.
-
Database name: The database name cannot contain a hyphen (-). For example,
dts-testdatais not supported. -
Table count: When migrating at the table level with column name mapping, a single task supports a maximum of 1,000 tables. For larger migrations, split the tables into multiple tasks or migrate the entire database.
-
DDL operations: Do not perform DDL operations on the source database during full data migration.
-
Data size: If a single row of incremental change data exceeds 256 MB, the migration instance fails and cannot be recovered. You must reconfigure the migration instance.
-
Write-Ahead Logging (WAL):
-
Set
wal_leveltological. -
For incremental-only migration: retain WAL logs for more than 24 hours.
-
For full + incremental migration: retain WAL logs for at least 7 days. You can change the log retention period to more than 24 hours after the full migration is complete.
ImportantIf WAL log retention is shorter than required by DTS and the task fails due to missing logs, this is not covered by the DTS Service-Level Agreement (SLA).
-
-
Logical Replication Slot Failover: The RDS PostgreSQL instance must support and have Logical Replication Slot Failover enabled. See Logical Replication Slot Failover.
-
Long-running transactions: If the source database has long-running transactions and the task includes incremental migration, WAL data accumulates until the transaction commits. Monitor source disk space to avoid running out.
-
Major version upgrades: Do not upgrade the major version of the source database while the migration instance is running. Doing so causes the instance to fail permanently.
Destination SelectDB requirements
-
Tables must use the Unique or Duplicate engine. See Data Model for guidance.
-
If the destination table uses the Unique engine, all unique keys of the destination table must also exist in the source table and be included in the migration objects.
-
Database and table names must start with a letter. Use the object name mapping feature to rename objects that do not meet this requirement.
-
Object names containing Chinese characters must be renamed to ASCII equivalents using object name mapping. Otherwise, the task may fail.
-
One migration instance can migrate only one database. To migrate multiple databases, configure a separate migration instance for each.
-
Do not add backend (BE) nodes to the SelectDB database during migration. If the task fails because of this, restart the migration instance to resume.
-
Do not create clusters in the destination SelectDB instance during migration. If the task fails because of this, restart the migration instance to resume.
-
DTS validates data content but does not validate metadata such as Sequences. Validate this metadata manually.
Incremental migration requirements
-
Run the following command on each table to be migrated before writing data to the source:
ALTER TABLE schema.table REPLICA IDENTITY FULL;Replace
schemaandtablewith the actual schema name and table name. Run this command during off-peak hours and do not lock the tables while running it to avoid deadlocks.If you skip the related precheck item, DTS automatically runs this command during instance initialization. This applies when the instance runs for the first time, or when the migration object granularity is set to Schema and a new table is created or an existing table is rebuilt using the RENAME command.
-
The supported SQL operations for incremental migration are:
Operation type SQL statements DML INSERT, UPDATE, DELETE DDL ADD COLUMN, DROP COLUMN -
DTS converts UPDATE and DELETE statements to INSERT statements for tables using the Duplicate engine.
-
DTS cannot migrate TimescaleDB extension tables or tables with cross-schema inheritance.
-
When migrating partitioned tables, include both the parent table and all child tables in the migration objects. The parent table itself does not store data, but excluding it causes data inconsistency.
-
For multi-table merge scenarios (multiple source tables to one destination table), all source tables must have identical schemas.
Special cases
| Scenario | Requirement |
|---|---|
| Source is ApsaraDB RDS for PostgreSQL | Do not change the instance endpoint or zone during migration. |
| Source is a self-managed PostgreSQL instance | A primary/secondary switchover causes the data migration task to fail. Also, make sure max_wal_senders and max_replication_slots are each greater than the total used replication slots plus the number of DTS instances you plan to create. |
| Source is Google Cloud Platform Cloud SQL for PostgreSQL | Use an account with cloudsqlsuperuser permissions. Migrate only objects this account can manage, or grant ownership with: GRANT <owner_of_object_to_migrate> TO <source_database_account_for_task> |
DTS internal behavior
During migration, DTS creates the following objects in the source database. Do not delete them — they are removed automatically when the DTS instance is released:
-
Temporary tables:
public.dts_pg_class,public.dts_pg_attribute,public.dts_pg_type,public.dts_pg_enum,public.dts_postgres_heartbeat,public.dts_ddl_command,public.dts_args_session,public.aliyun_dts_instance -
Replication slot (prefix:
dts_sync_): Used to fetch incremental logs from the last 15 minutes. DTS cleans up this slot when the migration fails or the instance is released.If you change the source database account password or remove DTS IP addresses from the allowlist during migration, the replication slot cannot be cleaned up automatically. Clean it up manually to prevent disk accumulation. If a primary/secondary failover occurs, log on to the secondary database to perform the cleanup.
Create a migration task
Step 1: Go to the Data Migration page
Use one of the following consoles to access the Data Migration page.
DTS console
-
Log on to the
-
In the left-side navigation pane, click Data Migration.
-
In the upper-left corner, select the region where the migration instance will reside.
DMS console
The actual steps may vary based on the mode and layout of the DMS console. See Simple mode and Customize the layout and style of the DMS console.
-
Log on to the
-
In the top navigation bar, move the pointer over Data + AI > DTS (DTS) > Data Migration.
-
From the drop-down list to the right of Data Migration Tasks, select the region where the migration instance will reside.
Step 2: Configure source and destination databases
Click Create Task to open the task configuration page, then configure the following parameters.
Task name
| Parameter | Description |
|---|---|
| Task Name | DTS generates a name automatically. Specify a descriptive name to identify the task. The name does not need to be unique. |
Source database
| Parameter | Description |
|---|---|
| Select Existing Connection | If the source instance is already registered with DTS, select it from the list. DTS fills in the remaining parameters automatically. Otherwise, configure the parameters below. |
| Database Type | Select PostgreSQL. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | Select the region where the source RDS PostgreSQL instance resides. |
| Replicate Data Across Alibaba Cloud Accounts | Select No if the source and destination instances belong to the same Alibaba Cloud account. |
| Instance ID | Select the ID of the source RDS PostgreSQL instance. |
| Database Name | Enter the name of the database that contains the objects to migrate. |
| Database Account | Enter the database account of the source instance. See Prerequisites for required permissions. |
| Database Password | Enter the password for the database account. |
Destination database
| Parameter | Description |
|---|---|
| Select Existing Connection | If the destination instance is already registered with DTS, select it from the list. DTS fills in the remaining parameters automatically. Otherwise, configure the parameters below. |
| Database Type | Select SelectDB. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | Select the region where the destination SelectDB instance resides. |
| Replicate Data Across Alibaba Cloud Accounts | Select No if both instances belong to the same Alibaba Cloud account. |
| Instance ID | Select the ID of the destination SelectDB instance. |
| Database Account | Enter the database account of the destination instance. See Prerequisites for required permissions. |
| Database Password | Enter the password for the database account. |
Step 3: Test connectivity and configure objects
-
Click Test Connectivity and Proceed.
Make sure the CIDR blocks of DTS servers are added to the security settings of both the source and destination databases. See Add DTS server IP addresses to a whitelist.
-
On the Configure Objects page, set the following parameters:
Parameter Description Migration Types Select based on your migration strategy. See Choose a migration type. For zero-downtime migration, select Schema Migration, Full Data Migration, and Incremental Data Migration. For one-time migration, select Schema Migration and Full Data Migration. Processing Mode of Conflicting Tables Precheck and Report Errors (default): The task fails during precheck if tables with the same name exist in the destination. Ignore Errors and Proceed: DTS skips the check. Use with caution — data inconsistency may occur if schemas differ. Capitalization of Object Names in Destination Instance Determines how database, table, and column names are capitalized in the destination. DTS default policy is selected by default. See Specify the capitalization of object names in the destination instance. Source Objects Select objects to migrate at the schema or table level, then click the icon to add them to Selected Objects. Selected Objects Right-click an object to rename it, set filter conditions, or select SQL operations for incremental migration. To set the bucket_countparameter, right-click a table, go to Parameter Settings, enable the setting, and specify a value. To remove an object, click it and then click the remove icon.- The
bucket_countparameter must be a positive integer. The default value is auto. - If you rename an object using object name mapping, other objects that depend on it may fail to migrate. - To filter rows, right-click a table in Selected Objects and specify WHERE conditions. See Specify filter conditions. -
Click Next: Advanced Settings and configure the following optional parameters:
Parameter Description Dedicated Cluster for Task Scheduling By default, tasks run on the shared cluster. For higher stability, purchase a dedicated cluster. See What is a DTS dedicated cluster. Retry Time for Failed Connections How long DTS retries before marking the task failed due to connection issues. Range: 10–1,440 minutes. Default: 720 minutes. Set to at least 30 minutes. Retry Time for Other Issues How long DTS retries before failing due to DDL or DML errors. Range: 1–1,440 minutes. Default: 10 minutes. Must be less than Retry Time for Failed Connections. Enable Throttling for Full Data Migration Limits read/write throughput during full migration to reduce database load. Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Enable Throttling for Incremental Data Migration Limits throughput during incremental migration. Configure RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s). Environment Tag Optional. Tag the instance for environment identification. Configure ETL Select Yes to enable the extract, transform, and load (ETL) feature. See Configure ETL in a data migration or data synchronization task. Monitoring and Alerting Select Yes to receive alerts when the task fails or migration latency exceeds a threshold. See Configure monitoring and alerting. -
(Optional) Click Next: Configure Database and Table Fields to set the Primary Key Column, Distribution Key, and Engine for destination tables.
- This step is available only if you selected Schema Migration for Migration Types. Set Definition Status to All to edit all tables. - The Primary Key Column can be a composite primary key. Select one or more columns from the Primary Key Column as the Distribution Key. - For tables without a primary key or UNIQUE constraint, set Engine to duplicate. Otherwise, the migration instance may fail or data may be lost.
Step 4: Run the precheck
-
Click Next: Save Task Settings and Precheck.
To preview the API parameters for this task configuration, move the pointer over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
-
Wait for the precheck to complete. If any items fail:
-
Click View Details next to each failed item, address the issue, then click Precheck Again.
-
For alert items that can be ignored, click Confirm Alert Details, then click Ignore > OK > Precheck Again.
ImportantIgnoring alert items may cause data inconsistency. Proceed with caution.
-
Step 5: Purchase the instance and start migration
-
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
-
On the Purchase Instance page, configure the instance class:
Parameter Description Resource Group The resource group for the migration instance. Default: default resource group. See What is Resource Management? Instance Class Controls migration speed. See Instance classes of data migration instances. -
Read and accept the Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start > OK.
Verify migration status
After the task starts, go to the Data Migration page to monitor progress:
-
Full migration only: The task stops automatically when complete. The Status changes to Completed.
-
Full + incremental migration: The incremental phase runs continuously. The Status shows Running. Stop the task manually when you are ready to cut over to the destination.
Performance considerations
-
During full migration, run the task when the CPU load on both the source and destination is below 30%.
-
DTS uses batch synchronization for incremental migration. By default, DTS writes to each object at most once every 5 seconds, resulting in a typical synchronization latency within 10 seconds. To reduce latency, adjust the
selectdb.reservoir.timeout.millisecondsparameter in the DTS console. Valid range: 1,000–10,000 milliseconds.Reducing the batch interval increases write frequency to the destination, which may raise destination load and write response time. Adjust based on your destination's capacity.
-
If the migration instance fails, DTS helpdesk attempts to recover it within 8 hours. Recovery may involve restarting the instance or adjusting its parameters. Only DTS instance parameters are modified — database parameters are not changed.
Data type mappings
The following table shows how PostgreSQL data types map to SelectDB data types after migration.
| Category | PostgreSQL data type | SelectDB data type | Notes |
|---|---|---|---|
| Numeric | SMALLINT | SMALLINT | |
| INTEGER | INT | ||
| BIGINT | BIGINT | ||
| DECIMAL | DECIMAL | ||
| NUMERIC | DECIMAL | ||
| REAL | DOUBLE | ||
| DOUBLE | DOUBLE | ||
| SMALLSERIAL | SMALLINT | ||
| SERIAL | INT | ||
| BIGSERIAL | BIGINT | ||
| Monetary | MONEY | STRING | |
| Character | CHAR(n), VARCHAR(n) | VARCHAR | Converted to VARCHAR(4*n) to prevent data loss. If no length is specified, defaults to VARCHAR(65533). If length exceeds 65533, converted to STRING. |
| TEXT | STRING | ||
| Binary | BYTEA | STRING | |
| Date and time | TIMESTAMP [(P)] WITHOUT TIME ZONE | DATETIMEV2 | |
| TIMESTAMP [(P)] WITH TIME ZONE | DATETIMEV2 | ||
| DATE | DATEV2 | ||
| TIME [(P)] WITHOUT TIME ZONE | VARCHAR(50) | ||
| TIME [(P)] WITH TIME ZONE | VARCHAR(50) | ||
| INTERVAL [FIELDS] [(P)] | STRING | ||
| Boolean | BOOLEAN | BOOLEAN | |
| Geometric | POINT, LINE, LSEG, BOX, PATH, POLYGON, CIRCLE | STRING | |
| Network address | CIDR, INET, MACADDR, MACADDR8 | STRING | |
| Text search | TSVECTOR | STRING | |
| XML | XML | STRING | |
| JSON | JSON | JSON |
Additional columns for the Duplicate model
For destination tables using the Duplicate model, DTS automatically adds the following columns. Use these columns to identify and remove duplicate data after retries or restarts.
| Column | Data type | Default value | Description |
|---|---|---|---|
_is_deleted |
Int | 0 | 0 for INSERT and UPDATE operations; 1 for DELETE operations. |
_version |
Bigint | 0 | 0 for full migration. For incremental migration, the timestamp in seconds from the source binary log. |
_record_id |
Bigint | 0 | 0 for full migration. For incremental migration, a unique auto-incrementing ID that identifies each log entry. |
Duplicate data may appear in the following cases:
-
A retry operation occurred in the migration instance.
-
The migration instance was restarted.
-
Two or more DML operations were performed on the same row after the migration started.
For tables using the Duplicate engine, DTS converts UPDATE and DELETE statements to INSERT statements.
What's next
-
To map object names between source and destination, see Map object names.
-
To manage database connections registered with DTS, see Manage database connections.
-
To modify instance parameters after the task starts, see Modify instance parameters.