Synchronize data from RDS for SQL Server to AnalyticDB for PostgreSQL

更新时间:
复制 MD 格式

Data Transmission Service (DTS) synchronizes data from an ApsaraDB RDS for SQL Server instance to an AnalyticDB for PostgreSQL instance, simplifying the centralized analysis of your enterprise data.

Prerequisites

Usage notes

  • A full data migration consumes read and write resources on both the source and destination databases, increasing their server load. This added load can degrade performance or cause service unavailability, especially if your databases have poor performance, low specifications, or are already busy (for example, with many slow SQL queries, tables without primary keys, or deadlocks in the destination database). Before you start the migration, evaluate the performance of both databases and perform the migration during off-peak hours, for example, when CPU utilization is below 30%.

  • To ensure accurate monitoring of data synchronization latency, DTS adds a heartbeat table nameddts_log_heart_beat to the source database.

  • In this scenario, DTS supports schema initialization for the following objects: schemas, tables, views, functions, and procedures.

    Warning

    As this is a heterogeneous synchronization, data types do not have a one-to-one mapping. Carefully evaluate the impact of data type mappings on your business. For more information, see Data type mappings for schema initialization.

  • You can select objects for synchronization only at the table level. Append-only (AO) tables are not supported. You can modify column mappings. If you use column mapping for partial table synchronization, or if the source and destination table schemas are inconsistent, data from source columns that do not exist in the destination table is lost.

  • The following data types cannot be synchronized: TIMESTAMP, CURSOR, ROWVERSION, HIERARCHYID, SQL_VARIANT, SPATIAL GEOMETRY, SPATIAL GEOGRAPHY, and TABLE.

Billing

Synchronization type

Pricing

Schema synchronization and full data synchronization

Free of charge.

Incremental data synchronization

Charged. For more information, see Billing overview.

Supported SQL operations

  • DDL operation: ADD COLUMN

    Note

    Transactional DDL operations are not synchronized.

  • DML operations: INSERT, UPDATE, DELETE

Database account permissions

Database

Required permissions

Authorization method

ApsaraDB RDS for SQL Server instance

Owner permissions on the database to be synchronized.

Modify account permissions

AnalyticDB for PostgreSQL instance

  • LOGIN

  • SELECT, CREATE, INSERT, UPDATE, and DELETE on the destination tables.

  • CONNECT and CREATE on the destination database.

  • CREATE on the destination schema.

  • COPY (for memory-based batch copy)

Note

You can also use the initial account of the AnalyticDB for PostgreSQL instance.

User permission management

Procedure

  1. Purchase a data synchronization instance. For more information, see Purchase procedure.

    Note

    When you purchase the instance, set SQL Server to SQL Server, Destination Instance to AnalyticDB for PostgreSQL, and One-way Synchronization to One-way Synchronization.

  2. Log in to the DTS console.

  3. In the left-side navigation pane, click Data Synchronization.

  4. At the top of the Synchronization Tasks page, select the region where your destination instance is located.

  5. Find the data synchronization instance that you purchased and click Configure Synchronization Channel.

  6. Configure the source and destination instances.

    Category

    Setting

    Description

    N/A

    Synchronization Task Name

    DTS automatically generates a task name. Specify a descriptive name for easy identification. The name does not need to be unique.

    Source Instance Information

    Instance Type

    Select RDS Instance.

    Instance Region

    The region of the source instance that you selected when you purchased the DTS instance. This parameter cannot be changed.

    Instance ID

    Select the ID of the source ApsaraDB RDS for SQL Server instance.

    Database Account

    Enter the database account of the ApsaraDB RDS for SQL Server instance. For permission requirements, see Database account permissions.

    Database Password

    Enter the password for the database account.

    Connection Mode

    Select Non-encrypted or SSL-encrypted based on your requirements. If you select SSL-encrypted, you must enable SSL encryption for the RDS instance in advance. For more information, see Configure SSL encryption.

    Destination Instance Information

    Instance Type

    Select AnalyticDB for PostgreSQL.

    Instance Region

    The region of the destination instance that you selected when you purchased the DTS instance. This parameter cannot be changed.

    Instance ID

    Select the ID of the destination AnalyticDB for PostgreSQL instance.

    Database Name

    Enter the name of the destination database.

    Database Account

    Enter the database account of the AnalyticDB for PostgreSQL instance. For permission requirements, see Database account permissions.

    Database Password

    Enter the password for the database account.

  7. In the lower-right corner of the page, click Set Whitelist and Next.

    If the source or destination database is an Alibaba Cloud database instance, such as an ApsaraDB RDS for MySQL or ApsaraDB for MongoDB instance, DTS automatically adds the CIDR blocks of DTS servers to the IP address whitelist of the instance. If the source or destination database is a self-managed database hosted on an Elastic Compute Service (ECS) instance, DTS automatically adds the CIDR blocks of DTS servers to the security group rules of the ECS instance, and you must make sure that the ECS instance can access the database. If the self-managed database is hosted on multiple ECS instances, you must manually add the CIDR blocks of DTS servers to the security group rules of each ECS instance. If the source or destination database is a self-managed database that is deployed in a data center or provided by a third-party cloud service provider, you must manually add the CIDR blocks of DTS servers to the IP address whitelist of the database to allow DTS to access the database. For more information, see DTS server IP whitelist.

    Warning

    Adding the public IP address blocks of the DTS service, either automatically or manually, may pose security risks. Using this product, you acknowledge that you understand and accept the potential security risks and that you must implement basic security measures. These measures include, but are not limited to, strengthening password security, limiting the ports open to each CIDR block, using authentication for internal API calls, and regularly checking and restricting unnecessary CIDR blocks. Alternatively, you can connect through a private network using a leased line, VPN Gateway, or Smart Access Gateway.

  8. Configure the synchronization policy and select the objects to synchronize.

    For the synchronization operation types, select Insert, Update, and Delete. From the source objects, select the database to synchronize (such as dtstest) and move it to the selected objects pane. For name mapping, select Do not batch rename databases and tables, set the retry duration for connection failures of the source or destination database to 720 minutes, and select No for adding quotation marks to target objects.

    Setting

    Description

    Initialize Synchronization

    Schema Initialization, Full Data Initialization, and Incremental Data Initialization are selected by default. After a successful precheck, DTS performs a one-time synchronization of schemas and existing data of the selected objects from the source instance to the destination database. This creates the baseline for subsequent incremental data synchronization.

    Processing Mode In Existed Target Table

    • Precheck and Report Errors: DTS checks whether the destination database contains tables with the same names as the source tables. If no tables with the same names exist, the precheck is passed. If tables with the same names exist, DTS reports an error and does not start the task.

      Note

      If a table with a conflicting name in the destination database cannot be deleted or renamed, you can change its name in the destination database. For more information, see Set object names in the destination instance.

    • Ignore Errors and Proceed: DTS skips the check for tables with the same name in the destination database.

      Warning

      If you select Ignore Errors and Proceed, data inconsistency may occur and expose your business to risks. For example:

      • If the table schemas are identical, and the destination database encounters a record with the same primary key value as a record in the source database, the record in the destination database is retained. The record from the source database is not synchronized.

      • If the table schemas are different, data initialization may fail, only partial data may be synchronized, or the synchronization may fail.

    Merge Multi Tables

    • Yes: In OLTP systems, sharding is often used to improve performance. However, AnalyticDB for PostgreSQL can efficiently query massive single tables. This feature lets you merge multiple sharded source tables with the same schema (sharded tables) into a single table in the destination AnalyticDB for PostgreSQL instance.

      Note
      • After selecting multiple source tables, you must use the object name mapping feature to map them to the same table name in the destination AnalyticDB for PostgreSQL instance. For more information, see Set object names in the destination instance.

      • You must add a __dts_data_source column of the TEXT type to the destination table to store the data source information. DTS writes the column value in the format of <DTS data synchronization instance ID>:<source database name>.<source schema name>.<source table name> to identify the source of the table. Example: dts********:dtstestdata.testschema.customer1.

      • The table merging feature is configured at the task level, not the table level. If you need to merge some tables but not others, you must create two separate data synchronization tasks.

    • No: This is the default option.

    Select the operation types

    Select the operation types to synchronize based on your business requirements. By default, all operation types are selected.

    Select the objects to be synchronized

    In the Source Objects box, click the objects that you want to synchronize, and then click the 向右小箭头 icon to move them to the Selected Objects box.

    In this heterogeneous synchronization scenario, you can only select objects at the table level, and other objects such as views, triggers, and stored procedures are not synchronized to the destination database.

    Note
    • By default, the names of synchronized objects remain unchanged. If you want the synchronized objects to have different names in the destination instance, use the object name mapping feature. For more information, see Set object names in the destination instance.

    • If you set Merge Tables to Yes, you must use the object name mapping feature to map the selected source tables to a single table name in the destination AnalyticDB for PostgreSQL instance.

    Add quotation marks to the target object

    Select whether to add quotation marks to the names of destination objects. If you select Yes and one of the following conditions is met, DTS adds single or double quotation marks to the destination object during the schema synchronization and incremental data synchronization phases:

    • The business environment of the source database is case-sensitive and uses mixed case.

    • A source table name does not start with a letter and contains characters other than letters, numbers, or special characters.

      Note

      The supported special characters are underscores (_), number signs (#), and dollar signs ($).

    • The schema, table, or column name to be synchronized is a keyword, reserved word, or invalid character in the destination database.

    Note

    If you choose to add quotation marks, you must use the quoted object name when you query data after the synchronization is complete.

    Rename databases and tables

    To change the name of a synchronization object in the destination instance, use the object name mapping feature. For more information, see Object name mapping.

    Retry time for failed connections

    If DTS cannot connect to the source or destination database, it retries the connection for 720 minutes (12 hours) by default. You can also specify a custom retry duration. If DTS reconnects to the databases within the specified duration, the data synchronization task automatically resumes. Otherwise, the task fails.

    Note

    You are charged for the DTS instance during the connection retry period. We recommend that you specify a custom retry duration based on your business requirements or release the DTS instance promptly after the source and destination database instances are released.

  9. Configure the table type, primary key column, and distribution key for the tables in the destination AnalyticDB for PostgreSQL instance.

    For Type, select dimension table or partitioned table. For each table, select a Primary Key Column and set a Distribution Key. After completing the configuration, the Definition Status column displays Defined. Then, click Precheck and Start.

    Note

    For more information about primary key columns and distribution keys, see Table constraint definitions and Table distribution key definitions.

  10. After you complete the preceding configurations, click Precheck and Start in the lower-right corner of the page.

    Note
    • DTS performs a precheck before the synchronization task starts. The task can begin only after it passes this check.

    • If the precheck fails, click the 提示 icon next to the failed item to view details.

      • Fix the issues based on the cause and run the precheck again.

      • If you do not need to fix an alert item, you can click Ignore or Ignore and Re-precheck to skip the alert item and run the precheck again.

  11. After Precheck Passed is displayed in the Precheck dialog box, close the Precheck dialog box, and the synchronization job will start.

  12. Wait for the synchronization channel to initialize. The task will then enter the Synchronizing state.

    On the Data Synchronization page, you can view the status of the data synchronization task.查看同步作业状态

FAQ

Q: How do I find the destination tables in the AnalyticDB for PostgreSQL instance?

A: The schema initialization of DTS synchronizes the schema of the source database to the destination database. In this example, you can find the customer table and the Student table in the dbo schema of the dtstestdata database in the destination instance. Connect to the AnalyticDB for PostgreSQL instance by using pgAdmin III. In the Object browser on the left, expand Server Groups > the destination connection > Databases > the destination database > Schemas > dbo > Tables. You can then view the synchronized destination tables, such as Student and customer. The Properties tab on the right also lists the corresponding table name and Owner information.