Synchronize data from an ApsaraDB RDS for SQL Server instance to an AnalyticDB for PostgreSQL instance-AnalyticDB(AnalyticDB)-阿里云帮助中心

Data Transmission Service (DTS) synchronizes data from an ApsaraDB RDS for SQL Server instance to an AnalyticDB for PostgreSQL instance, simplifying the centralized analysis of your enterprise data.

Prerequisites

Create an ApsaraDB RDS for SQL Server instance. For information about supported versions, see Data synchronization scenarios.

Important
The source ApsaraDB RDS for SQL Server instance can run SQL Server 2012, 2014, 2016, 2017, or 2019.
Create an AnalyticDB for PostgreSQL instance.
The tables to be synchronized in the source ApsaraDB RDS for SQL Server instance must have a primary key.
The destination tables in the AnalyticDB for PostgreSQL instance must have a primary key or a unique index.

Usage notes

A full data migration consumes read and write resources on both the source and destination databases, increasing their server load. This added load can degrade performance or cause service unavailability, especially if your databases have poor performance, low specifications, or are already busy (for example, with many slow SQL queries, tables without primary keys, or deadlocks in the destination database). Before you start the migration, evaluate the performance of both databases and perform the migration during off-peak hours, for example, when CPU utilization is below 30%.
To ensure accurate monitoring of data synchronization latency, DTS adds a heartbeat table nameddts_log_heart_beat to the source database.
In this scenario, DTS supports schema initialization for the following objects: schemas, tables, views, functions, and procedures.

Warning
As this is a heterogeneous synchronization, data types do not have a one-to-one mapping. Carefully evaluate the impact of data type mappings on your business. For more information, see Data type mappings for schema initialization.
You can select objects for synchronization only at the table level. Append-only (AO) tables are not supported. You can modify column mappings. If you use column mapping for partial table synchronization, or if the source and destination table schemas are inconsistent, data from source columns that do not exist in the destination table is lost.
The following data types cannot be synchronized: TIMESTAMP, CURSOR, ROWVERSION, HIERARCHYID, SQL_VARIANT, SPATIAL GEOMETRY, SPATIAL GEOGRAPHY, and TABLE.

Billing

Synchronization type	Pricing
Schema synchronization and full data synchronization	Free of charge.
Incremental data synchronization	Charged. For more information, see Billing overview.

Supported SQL operations

DDL operation: ADD COLUMN

Note
Transactional DDL operations are not synchronized.
DML operations: INSERT, UPDATE, DELETE

Database account permissions

Database	Required permissions	Authorization method
ApsaraDB RDS for SQL Server instance	Owner permissions on the database to be synchronized.	Modify account permissions
AnalyticDB for PostgreSQL instance	LOGIN SELECT, CREATE, INSERT, UPDATE, and DELETE on the destination tables. CONNECT and CREATE on the destination database. CREATE on the destination schema. COPY (for memory-based batch copy) Note You can also use the initial account of the AnalyticDB for PostgreSQL instance.	User permission management

Procedure

Purchase a data synchronization instance. For more information, see Purchase procedure.

Note
When you purchase the instance, set SQL Server to SQL Server, Destination Instance to AnalyticDB for PostgreSQL, and One-way Synchronization to One-way Synchronization.
Log in to the DTS console.
In the left-side navigation pane, click Data Synchronization.
At the top of the Synchronization Tasks page, select the region where your destination instance is located.
Find the data synchronization instance that you purchased and click Configure Synchronization Channel.

Configure the source and destination instances.

Category	Setting	Description
N/A	Synchronization Task Name	DTS automatically generates a task name. Specify a descriptive name for easy identification. The name does not need to be unique.
Source Instance Information	Instance Type	Select RDS Instance.
	Instance Region	The region of the source instance that you selected when you purchased the DTS instance. This parameter cannot be changed.
	Instance ID	Select the ID of the source ApsaraDB RDS for SQL Server instance.
	Database Account	Enter the database account of the ApsaraDB RDS for SQL Server instance. For permission requirements, see Database account permissions.
	Database Password	Enter the password for the database account.
	Connection Mode	Select Non-encrypted or SSL-encrypted based on your requirements. If you select SSL-encrypted, you must enable SSL encryption for the RDS instance in advance. For more information, see Configure SSL encryption.
Destination Instance Information	Instance Type	Select AnalyticDB for PostgreSQL.
	Instance Region	The region of the destination instance that you selected when you purchased the DTS instance. This parameter cannot be changed.
	Instance ID	Select the ID of the destination AnalyticDB for PostgreSQL instance.
	Database Name	Enter the name of the destination database.
	Database Account	Enter the database account of the AnalyticDB for PostgreSQL instance. For permission requirements, see Database account permissions.
	Database Password	Enter the password for the database account.

In the lower-right corner of the page, click Set Whitelist and Next.

If the source or destination database is an Alibaba Cloud database instance, such as an ApsaraDB RDS for MySQL or ApsaraDB for MongoDB instance, DTS automatically adds the CIDR blocks of DTS servers to the IP address whitelist of the instance. If the source or destination database is a self-managed database hosted on an Elastic Compute Service (ECS) instance, DTS automatically adds the CIDR blocks of DTS servers to the security group rules of the ECS instance, and you must make sure that the ECS instance can access the database. If the self-managed database is hosted on multiple ECS instances, you must manually add the CIDR blocks of DTS servers to the security group rules of each ECS instance. If the source or destination database is a self-managed database that is deployed in a data center or provided by a third-party cloud service provider, you must manually add the CIDR blocks of DTS servers to the IP address whitelist of the database to allow DTS to access the database. For more information, see DTS server IP whitelist.

Warning
Adding the public IP address blocks of the DTS service, either automatically or manually, may pose security risks. Using this product, you acknowledge that you understand and accept the potential security risks and that you must implement basic security measures. These measures include, but are not limited to, strengthening password security, limiting the ports open to each CIDR block, using authentication for internal API calls, and regularly checking and restricting unnecessary CIDR blocks. Alternatively, you can connect through a private network using a leased line, VPN Gateway, or Smart Access Gateway.

Configure the synchronization policy and select the objects to synchronize.

For the synchronization operation types, select Insert, Update, and Delete. From the source objects, select the database to synchronize (such as dtstest) and move it to the selected objects pane. For name mapping, select Do not batch rename databases and tables, set the retry duration for connection failures of the source or destination database to 720 minutes, and select No for adding quotation marks to target objects.

Setting	Description
Initialize Synchronization	Schema Initialization, Full Data Initialization, and Incremental Data Initialization are selected by default. After a successful precheck, DTS performs a one-time synchronization of schemas and existing data of the selected objects from the source instance to the destination database. This creates the baseline for subsequent incremental data synchronization.
Processing Mode In Existed Target Table	Precheck and Report Errors: DTS checks whether the destination database contains tables with the same names as the source tables. If no tables with the same names exist, the precheck is passed. If tables with the same names exist, DTS reports an error and does not start the task. Note If a table with a conflicting name in the destination database cannot be deleted or renamed, you can change its name in the destination database. For more information, see Set object names in the destination instance. Ignore Errors and Proceed: DTS skips the check for tables with the same name in the destination database. Warning If you select Ignore Errors and Proceed, data inconsistency may occur and expose your business to risks. For example: If the table schemas are identical, and the destination database encounters a record with the same primary key value as a record in the source database, the record in the destination database is retained. The record from the source database is not synchronized. If the table schemas are different, data initialization may fail, only partial data may be synchronized, or the synchronization may fail.
Merge Multi Tables	Yes: In OLTP systems, sharding is often used to improve performance. However, AnalyticDB for PostgreSQL can efficiently query massive single tables. This feature lets you merge multiple sharded source tables with the same schema (sharded tables) into a single table in the destination AnalyticDB for PostgreSQL instance. Note After selecting multiple source tables, you must use the object name mapping feature to map them to the same table name in the destination AnalyticDB for PostgreSQL instance. For more information, see Set object names in the destination instance. You must add a `__dts_data_source` column of the TEXT type to the destination table to store the data source information. DTS writes the column value in the format of `<DTS data synchronization instance ID>:<source database name>.<source schema name>.<source table name>` to identify the source of the table. Example: `dts********:dtstestdata.testschema.customer1`. The table merging feature is configured at the task level, not the table level. If you need to merge some tables but not others, you must create two separate data synchronization tasks. No: This is the default option.
Select the operation types	Select the operation types to synchronize based on your business requirements. By default, all operation types are selected.
Select the objects to be synchronized	In the Source Objects box, click the objects that you want to synchronize, and then click the icon to move them to the Selected Objects box. In this heterogeneous synchronization scenario, you can only select objects at the table level, and other objects such as views, triggers, and stored procedures are not synchronized to the destination database. Note By default, the names of synchronized objects remain unchanged. If you want the synchronized objects to have different names in the destination instance, use the object name mapping feature. For more information, see Set object names in the destination instance. If you set Merge Tables to Yes, you must use the object name mapping feature to map the selected source tables to a single table name in the destination AnalyticDB for PostgreSQL instance.
Add quotation marks to the target object	Select whether to add quotation marks to the names of destination objects. If you select Yes and one of the following conditions is met, DTS adds single or double quotation marks to the destination object during the schema synchronization and incremental data synchronization phases: The business environment of the source database is case-sensitive and uses mixed case. A source table name does not start with a letter and contains characters other than letters, numbers, or special characters. Note The supported special characters are underscores (_), number signs (#), and dollar signs ($). The schema, table, or column name to be synchronized is a keyword, reserved word, or invalid character in the destination database. Note If you choose to add quotation marks, you must use the quoted object name when you query data after the synchronization is complete.
Rename databases and tables	To change the name of a synchronization object in the destination instance, use the object name mapping feature. For more information, see Object name mapping.
Retry time for failed connections	If DTS cannot connect to the source or destination database, it retries the connection for 720 minutes (12 hours) by default. You can also specify a custom retry duration. If DTS reconnects to the databases within the specified duration, the data synchronization task automatically resumes. Otherwise, the task fails. Note You are charged for the DTS instance during the connection retry period. We recommend that you specify a custom retry duration based on your business requirements or release the DTS instance promptly after the source and destination database instances are released.

Configure the table type, primary key column, and distribution key for the tables in the destination AnalyticDB for PostgreSQL instance.

For Type, select dimension table or partitioned table. For each table, select a Primary Key Column and set a Distribution Key. After completing the configuration, the Definition Status column displays Defined. Then, click Precheck and Start.

Note
For more information about primary key columns and distribution keys, see Table constraint definitions and Table distribution key definitions.
After you complete the preceding configurations, click Precheck and Start in the lower-right corner of the page.
Note
- DTS performs a precheck before the synchronization task starts. The task can begin only after it passes this check.
- If the precheck fails, click the icon next to the failed item to view details.
  
  Fix the issues based on the cause and run the precheck again.
  
  If you do not need to fix an alert item, you can click Ignore or Ignore and Re-precheck to skip the alert item and run the precheck again.
After Precheck Passed is displayed in the Precheck dialog box, close the Precheck dialog box, and the synchronization job will start.
Wait for the synchronization channel to initialize. The task will then enter the Synchronizing state.

On the Data Synchronization page, you can view the status of the data synchronization task.

FAQ

Q: How do I find the destination tables in the AnalyticDB for PostgreSQL instance?

A: The schema initialization of DTS synchronizes the schema of the source database to the destination database. In this example, you can find the customer table and the Student table in the dbo schema of the dtstestdata database in the destination instance. Connect to the AnalyticDB for PostgreSQL instance by using pgAdmin III. In the Object browser on the left, expand Server Groups > the destination connection > Databases > the destination database > Schemas > dbo > Tables. You can then view the synchronized destination tables, such as Student and customer. The Properties tab on the right also lists the corresponding table name and Owner information.