Synchronize data from an RDS PostgreSQL instance to an OceanBase Database Oracle tenant

更新时间:
复制 MD 格式

This topic describes how to synchronize data from an RDS PostgreSQL instance to an OceanBase Database Oracle tenant.

Prerequisites

  • Data Transmission Service (DTS) has the required permissions to access cloud resources. For more information, see Authorize DTS to access cloud resources.

  • You have created a privileged account for the data synchronization task in the source RDS PostgreSQL instance. For more information, see PostgreSQL data source.

  • You have created a database user for the data synchronization task in the destination OceanBase Database Oracle tenant and granted the required permissions to the user. For more information, see Create a database user.

  • To perform incremental synchronization, you must complete the following prerequisite operations:

    • During incremental synchronization, DTS does not support the automatic synchronization of DDL operations. To perform a DDL operation on a table to be synchronized, you must first manually execute the DDL statement on the destination and then execute it on the source RDS PostgreSQL instance.

      To ensure that incremental DML operations can be correctly parsed after you execute a DDL statement, you must create a trigger and a table to record the DDL statement. For more information, see Create a trigger.

    • If you select incremental synchronization, you must set the wal_level parameter to logical. For more information, see Modify the log level of an RDS PostgreSQL instance.

Limits

  • RDS PostgreSQL V11.x and V12.x instances are supported.

  • DTS can synchronize only objects whose database, table, and column names are ASCII characters and do not contain special characters. Special characters include line breaks, spaces, and .|"'`()=;/&\.

  • DTS does not support synchronizing views, partitioned tables, unlogged tables, temporary tables, foreign keys, or check constraints from an RDS PostgreSQL instance to an OceanBase Database Oracle tenant.

  • Incremental synchronization is supported only from the primary database.

Notes

  • If you select incremental synchronization, the table-level replication identity REPLICA IDENTITY must meet the following requirements:

    • If you select sync objects on the Specify Objects tab, the specified tables must have primary keys, or their table-level replication identity REPLICA IDENTITY must be set to FULL. Otherwise, data updates and deletions fail.

    • If you select sync objects on the Match Rules tab, the RDS PostgreSQL instance must subscribe to all tables in the selected database. This includes selected, unselected, and new tables. All tables must have primary keys, or their table-level replication identity REPLICA IDENTITY must be set to FULL. Otherwise, data updates and deletions fail.

    To change the table-level replication identity REPLICA IDENTITY to FULL, run the following command.

    ALTER TABLE table_name REPLICA IDENTITY FULL;
  • When data is synchronized (schema synchronization or incremental DDL) from an RDS PostgreSQL instance to an OceanBase Database Oracle tenant, table and field names are converted to uppercase based on the default policy of DTS. For example, a source table named `a` is converted to `A` in the destination. In queries, you can refer to the table or field as `a`, `A`, or `"A"`, but not as `"a"`.

  • The incremental component of an RDS PostgreSQL instance automatically creates a publication and a slot. You must monitor the disk usage of the log files for the RDS PostgreSQL instance. By default, DTS sends a notification every 10 minutes to update the `confirmed_flush_lsn` of the slot to the LSN from 10 minutes ago. Therefore, each incremental component retains at least 10 minutes of RDS PostgreSQL instance log files.

    Note

    To change the notification interval or the retention period for log files that can be cleared, contact technical support.

    During data synchronization, if the log files of the RDS PostgreSQL instance cannot be cleared because a slot exists, you must completely delete the data synchronization task to clear the logs. Whether the log files of an RDS PostgreSQL instance can be reclaimed depends on whether the earliest `restart_lsn` among all slots is within the range of the log files.

  • If a table has neither a primary key nor a NOT NULL unique key, duplicate data may appear in the destination database.

  • If the source character set is UTF-8, we recommend that you use a compatible character set for the destination, such as UTF-8 or UTF-16. This prevents issues, such as garbled text in the destination, that can be caused by incompatible character sets.

  • Confirm that the synchronization precision for column types such as DECIMAL, FLOAT, or DOUBLE meets your expectations. If the precision of the destination field type is less than that of the source field type, data truncation may occur. This can cause data inconsistency between the source and destination.

  • If you change a unique index in the destination, you must restart the data synchronization task. Otherwise, data inconsistency may occur.

  • If the clocks between nodes are not synchronized, or if the clock on your computer is not synchronized with the server clock, the incremental synchronization latency may be inaccurate.

    For example, if a clock is ahead of the standard time, the latency may be a negative value. If a clock is behind the standard time, a delay may occur.

  • If you select only Incremental Synchronization when you create a data synchronization task, the source database must retain local incremental logs for more than 48 hours.

    If you select Full Synchronization and Incremental Synchronization when you create a data synchronization task, the source database must retain local incremental logs for at least 7 days. Otherwise, DTS may fail to retrieve the incremental logs, which can cause the sync task to fail or lead to data inconsistency.

  • If the source or destination contains table objects whose names differ only in case, the data migration result may not be as expected. This can happen if the source or destination is case-insensitive.

  • If a UNIQUE constraint column allows NULL values, data loss may occur. When multiple NULL values are synchronized from a PostgreSQL database to an OceanBase Database Oracle tenant, only the first NULL value is inserted successfully. Subsequent NULL values are discarded because of conflicts with the UNIQUE constraint.

Supported source and destination instance types

In the following table, OceanBase database Oracle tenant is abbreviated as OB_Oracle.

Source

Destination

PostgreSQL (RDS instance)

OB_Oracle (OceanBase cluster instance)

PostgreSQL (RDS instance)

OB_Oracle (Self-managed database in a VPC)

Data type mapping

RDS PostgreSQL instance

OceanBase database Oracle tenant

int

NUMBER(10)

smallint

NUMBER(5)

bigint

NUMBER(20)

decimal

NUMBER(p,s)

numeric

NUMBER(p,s)

real

BINARY_FLOAT

double precision

BINARY_DOUBLE

smallserial

NUMBER(5)

serial

NUMBER(10)

bigserial

NUMBER(20)

char

CHAR(n)

Note

The default length of a column of the CHAR data type is 1 byte. The maximum supported length is 2000 bytes.

varchar

VARCHAR2(n)

text

CLOB

timestamp

TIMESTAMP(p)

timestamp with time zone

TIMESTAMP(p) WITH TIME ZONE

time

DATE

time with time zone

TIMESTAMP(p) WITH TIME ZONE

boolean

NUMBER(1)

bytea

BLOB

citext

CLOB

tsvector

CLOB

Procedure

  1. Log on to the OceanBase Management Console and purchase a data sync task.

    For more information, see Purchase a data sync task.

  2. In the Data Transmission Service console, navigate to the Data Synchronization page, find the new sync task, and click Configure.

    image.png

    If you want to reuse the configurations of an existing task, click Reuse Configuration. For more information, see Reuse and clear the configurations of a data sync task.

  3. On the Choose Source and Destination page, configure the parameters.

    Parameter

    Description

    Sync Task Name

    Use a name that contains letters, digits, or underscores. The name cannot contain spaces and must be 64 characters or less in length.

    Source

    If you have created a PostgreSQL data source, select it from the drop-down list. If not, click Add Data Source in the drop-down list to create one. For more information about the parameters, see Create a PostgreSQL data source.

    Destination

    If you have created an OceanBase data source, select it from the drop-down list. If not, click Add Data Source in the drop-down list to create one. For more information about the parameters, see OceanBase data source.

    Tag (Optional)

    Click the text box and select a tag from the drop-down list. You can also click Manage Tags to create, modify, or delete a tag. For more information, see Manage data synchronization tasks using tags.

  4. Click Next. On the Choose Sync Type page, select the synchronization types for the task.

    image

    The synchronization types are Schema Synchronization, Full Synchronization, and Incremental Synchronization. Incremental Synchronization supports only DML Sync, which includes INSERT, DELETE, and UPDATE operations. You can customize the configuration as needed. For more information, see Customize DDL and DML operations.

  5. Click Next. On the Choose Sync Objects page, select the objects to synchronize.

    You can select sync objects on the Specify Objects and Match Rules tabs. This topic describes how to select sync objects on the Specify Objects tab. For more information about how to configure matching rules, see Configure and modify matching rules.

    Important
    • The names of the tables to be synchronized and their columns cannot contain Chinese characters.

    • If a database or table name contains "$$" characters, the creation of the data synchronization task is affected.

    image.png

    1. In the Choose Sync Objects section, select Specify Objects.

    2. In the Choose Sync Scope section, select the objects to synchronize from the Source Objects list. You can select tables from one or more databases as sync objects.

    3. Click > to add the selected objects to the Destination Objects list.

    DTS lets you import objects from a text file and perform operations on the destination objects, such as renaming, setting properties, and removing a single object or all objects.

    Note

    If you select objects on the Match Rules tab, the renaming function is covered by the matching rule syntax. You can only set filter conditions in the Actions column. For more information, see Configure and modify matching rules.

    Operation

    Step

    Import Objects

    1. In the list on the right, click Import Objects in the upper-right corner.

    2. In the dialog box that appears, click OK.

      Important

      Importing objects will overwrite your previous selections. Proceed with caution.

    3. In the Import Sync Objects dialog box, import the objects to be synchronized. You can import a CSV file to rename database tables, set row filtering conditions, and perform other operations. For more information, see Download and import sync object configurations.

    4. Click Validate.

    5. After the validation is successful, click OK.

    Rename

    Data Transmission Service lets you rename sync objects. For more information, see Rename database tables.

    Settings

    Data Transmission Service lets you use a WHERE clause for row filtering and view the column information of sync objects.

    1. In the list on the right, hover your mouse over the target object.

    2. Click Settings.

    3. In the Settings dialog box, you can perform the following operations:

      • In the Row Filtering section, enter a standard WHERE clause from a SQL statement in the text box to configure row filtering. For more information, see Filter data with SQL conditions.

      • In the View Columns section, view the column information of the sync object.

    4. Click OK.

    Remove/Remove All

    During data mapping, Data Transmission Service lets you remove one or more objects that have been temporarily selected for the destination.

    • Remove a single sync object

      In the list on the right, hover your mouse over the target object and click the Remove button that appears to remove the object.

    • Remove all sync objects

      In the list on the right, click Remove All in the upper-right corner. In the dialog box that appears, click OK to remove all sync objects.

  6. Click Next. On the Sync Options page, configure the parameters.

    • Full synchronization

      The following parameters are displayed only if you select Full Synchronization on the Choose Sync Type page.

      image

      Parameter

      Description

      Read Concurrency

      This parameter specifies the number of concurrent threads for reading data from the source during full synchronization. The maximum value is 512. A high concurrency can increase the load on the source database and affect your business.

      Write Concurrency

      This parameter specifies the number of concurrent threads for writing data to the destination during full synchronization. The maximum value is 512. A high concurrency can increase the load on the destination database and affect your business.

      Full Sync Rate Limit

      You can enable the full synchronization rate limit as needed. If you enable it, set the RPS and BPS. RPS is the maximum number of data rows that can be synchronized to the destination per second during full synchronization. BPS is the maximum volume of data that can be synchronized to the destination per second during full synchronization.

      Note

      The RPS and BPS settings only serve as a rate-limiting capability. The actual performance of full synchronization is affected by factors such as the source, the destination, and the instance specifications.

      Policy for Existing Records in Destination Table

      • Select Ignore: If data exists in the destination table and a conflict occurs between the existing data and the data being written, Data Transmission Service logs the conflict and keeps the existing data.

        Important

        If you select Ignore, full validation pulls data in IN mode. This mode cannot validate scenarios where extra data exists in the destination, and validation performance is degraded to some extent.

      • Select the default value Stop Migration: If data exists in the destination table, an error is reported for the full migration. You must handle the data in the destination before you can continue the migration.

        Important

        If an error occurs and you click Resume, Data Transmission Service ignores this setting and continues to migrate table data. Proceed with caution.

    • Incremental synchronization

      The following parameters are displayed only if you select Incremental Synchronization on the Choose Sync Type page.

      image

      Parameter

      Description

      Write Concurrency

      This parameter specifies the number of concurrent threads for writing data to the destination during incremental synchronization. The maximum value is 512. A high concurrency can increase the load on the destination database and affect your business.

      Incremental Sync Rate Limit

      You can enable the incremental synchronization rate limit as needed. If you enable it, set the RPS and BPS. RPS is the maximum number of data rows that can be synchronized to the destination per second during incremental synchronization. BPS is the maximum volume of data that can be synchronized to the destination per second during incremental synchronization.

      Note

      The RPS and BPS settings only serve as a rate-limiting capability. The actual performance of incremental synchronization is affected by factors such as the source, the destination, and the instance specifications.

      Incremental Sync Start Offset

      When the source is PostgreSQL, you cannot set the start offset for incremental synchronization. The start offset defaults to the time when incremental synchronization starts.

    • Advanced options

      • The following parameters are displayed only if you do not select Schema Synchronization on the Choose Sync Type page.

        image

        Parameter

        Description

        Sync Data

        After you enable data synchronization, you can write specified data, such as the source schema or other custom field values, to a specified field in the destination.

        Sync Schema Name

        The program gets the source schema name and writes it as a string to the specified field in the destination table.

        When you enter the Specified Field Name, ensure that the field exists in the schema of all tables to be synchronized in the destination.

        Sync Custom Data

        You can define a custom field value and synchronize it as a string to a specified field in the destination tables.

        Important

        Data Transmission Service does not validate the field type for the Specified Field Name, nor does it validate the data or escape special characters for the Field Value. Ensure that the field type and value are valid.

      • The following parameters are displayed only if the destination OceanBase Database Oracle tenant is V4.3.0 or later and you select Schema Synchronization on the Choose Sync Type page.

        image

        The storage class options for destination table objects are Default, Row Store, Column Store, and Row and Column Store. This setting determines the storage class of the destination table objects during schema or incremental synchronization. For more information, see default_table_store_format.

        Note

        The Default option automatically adapts to other options based on the destination parameter settings. During schema synchronization, table objects are written to the corresponding schema based on the specified storage class.

  7. Click Precheck.

    In the Precheck step, Data Transmission Service verifies the connectivity between the source and the destination. If the precheck fails:

    • You can troubleshoot and resolve the issues, and then run the precheck again.

    • You can also click Skip in the Actions column for a failed precheck item. A dialog box appears and explains the specific impact of skipping this operation. To confirm skipping the item, click OK in the dialog box.

  8. After the precheck is successful, click Start Task.

    If you do not need to start the task immediately, you can click Save. You can then start the task manually from the Sync Task List page or using batch operations. For more information about batch operations, see Perform batch operations on data synchronization tasks. After the data synchronization task starts, it runs based on the selected synchronization types. For more information, see View synchronization details.

    DTS lets you reduce the number of sync objects while a task is running. For more information, see Reduce sync objects.

    Important

    You cannot add sync objects to a data synchronization task from an RDS PostgreSQL instance to an OceanBase Database Oracle tenant while the task is running.

If a data sync task fails, typically due to network connection issues or a slow process startup, you can click Resume on the task list or the product page.

References