Migrate PolarDB-X 1.0 to Message Queue for Apache Kafka


This topic describes how to use DTS to migrate data from a PolarDB-X 1.0 instance to an ApsaraMQ for Kafka instance.

Prerequisites

  • You have created a source PolarDB-X 1.0 instance. For more information, see Create an instance.

    Note

    The storage type of the source PolarDB-X 1.0 instance must be RDS for MySQL, including custom and separately purchased RDS instances. PolarDB for MySQL is not supported.

  • You have created a destination ApsaraMQ for Kafka instance and a topic in that instance to receive the migrated data. For more information about ApsaraMQ for Kafka, see Overview.

    Note

    See Migration solution overview for the versions supported by the source and destination instances.

  • The destination ApsaraMQ for Kafka instance must have more available storage space than the source PolarDB-X 1.0 instance's storage usage.

Precautions

Type

Description

Source database limitations

  • Tables to be migrated must have a PRIMARY KEY or a UNIQUE constraint, and all fields in the key or constraint must be unique. Otherwise, the destination database may contain duplicate data.

  • If you migrate objects at the table level and need to edit them, for example, by mapping column names, a single data migration task supports a maximum of 1,000 tables. If you exceed this limit, the task submission will fail. In this case, split the tables into multiple data migration tasks or configure a task to migrate the entire database.

  • For incremental data migration, the source database must meet the following binary log requirements:

    • Enable the binary log feature, and set the binlog_row_image parameter to full. Otherwise, the precheck fails and the data migration task cannot start.

    • For incremental data migration tasks, DTS requires that the binary logs of the source database be retained for at least 24 hours. For tasks that include both full data migration and incremental data migration, the binary logs must be retained for at least 7 days. You can change the retention period to 24 hours after the full data migration is complete. If the binary logs are not retained for the required period, DTS may fail to obtain them, which can cause task failures or even data inconsistency and loss. The DTS SLA does not cover issues caused by an insufficient binary log retention period.

  • Data migration from a read-only PolarDB-X 1.0 instance is not supported.

  • Limits on operations in the source database:

    • During data migration, do not perform operations such as scaling up or down, migrating hot tables, changing sharding keys, or executing DDL statements. Otherwise, the data migration task will fail.

      Note

      During the full migration phase, DTS queries the source database. This creates a metadata lock, which may block DDL operations on the source database.

    • During full and incremental data migration, DTS temporarily disables constraint checks and foreign key cascading operations at the session level. If you perform cascading update or delete operations on the source database while the task is running, data inconsistency may occur.

    • If you change the network type of the PolarDB-X 1.0 instance during migration, update the network connection information for the migration link.

    • If you perform only full data migration, do not write new data to the source database during the migration. Otherwise, data will become inconsistent between the source and destination databases. To maintain real-time data consistency, select schema migration, full data migration, and incremental data migration.

  • The version of the source PolarDB-X 1.0 instance must be 5.2 or later. For more information about how to view the version, see Instance versions.

Other limitations

  • Migration of INDEX, PARTITION, VIEW, PROCEDURE, FUNCTION, TRIGGER, and FK (foreign key) objects is not supported.

  • Perform data migration during off-peak hours. DTS consumes read and write resources on both the source and destination databases during full data migration, increasing the database load.

  • During full data migration, concurrent INSERT operations can cause table fragmentation in the destination database. As a result, after the full data migration is complete, tables in the destination database may occupy more storage space than in the source instance.

  • DTS attempts to resume a failed migration task within seven days. If the task resumes automatically after you switch your workloads, source data can overwrite data in the destination instance. To prevent this, before you switch your workloads to the destination instance, end or release the task, or run the revoke command to revoke the write permissions of the account that DTS uses to access the destination instance.

  • DTS relies on the continuity of XA transactions in the source PolarDB-X 1.0 instance to ensure data consistency for incremental migration tasks. If the continuity of XA transactions is disrupted, such as in a disaster recovery scenario for the incremental data collection module, uncommitted XA transactions may be lost.

  • If the destination ApsaraMQ for Kafka instance is scaled up or scaled down during data migration, you must restart the instance.

  • If a task fails, DTS support staff will attempt to restore it within eight hours. During restoration, they may restart the task or adjust its parameters.

    Note

    Only DTS task parameters are modified, not database parameters. Parameters that may be adjusted include those listed in Modify instance parameters.

Other precautions

  • DTS periodically updates the `dts_health_check`.`ha_health_check` table in the source database to advance the Binlog position.
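
The binary log requirements listed under Source database limitations can be checked before you create the task. The following is a minimal sketch, not part of DTS: the `BinlogSettings` class and `check_binlog` function are hypothetical names, and the sketch assumes you have already read the values from the source database yourself (for example, from the instance console).

```python
from dataclasses import dataclass


@dataclass
class BinlogSettings:
    """Hypothetical snapshot of the source database's binary log settings."""
    log_bin: bool            # whether the binary log feature is enabled
    binlog_row_image: str    # value of the binlog_row_image parameter
    retention_hours: int     # how long binary logs are retained, in hours


def check_binlog(settings: BinlogSettings, includes_full_migration: bool) -> list[str]:
    """Return a list of problems; an empty list means the requirements are met."""
    problems = []
    if not settings.log_bin:
        problems.append("binary logging is disabled")
    if settings.binlog_row_image.lower() != "full":
        problems.append("binlog_row_image must be full")
    # 7 days when the task also runs a full migration, otherwise 24 hours.
    required_hours = 7 * 24 if includes_full_migration else 24
    if settings.retention_hours < required_hours:
        problems.append(
            f"binary logs must be retained for at least {required_hours} hours"
        )
    return problems


settings = BinlogSettings(log_bin=True, binlog_row_image="FULL", retention_hours=168)
print(check_binlog(settings, includes_full_migration=True))  # prints []
```

A non-empty result lists each unmet requirement, so you can fix the source database settings before the DTS precheck runs.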

Billing

Migration type

Configuration fee

Internet traffic fee

Schema migration and full data migration

Free.

Not charged in this example.

Note

An internet traffic fee applies if the Access Method for the destination database is Public IP Address. For more information, see billing overview.

Incremental data migration

Charged. For details, see the billing overview.

Migration types

  • Schema migration

    DTS migrates the schema definitions of the migration objects from the source database to the destination database.

  • Full migration

    DTS migrates all historical data of the specified migration objects from the source database to the destination database.

  • Incremental migration

    After a full migration is complete, DTS migrates incremental data updates from the source database to the destination database. Incremental migration lets you smoothly migrate data without interrupting your self-managed applications.

Supported SQL operations for incremental migration

| Operation type | SQL statement |
| --- | --- |
| DML | INSERT, UPDATE, and DELETE |

Database account permissions

Database

Schema migration

Full data migration

Incremental data migration

PolarDB-X instance

SELECT permission.

SELECT permission.

REPLICATION SLAVE and REPLICATION CLIENT permissions, and SELECT permission on the objects to be migrated.

Note

For more information about how to grant permissions, see Data synchronization tools for PolarDB-X.

Message Queue for Apache Kafka

Read and write permissions.
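
As an illustration of the permissions table above, the following sketch computes which privileges the source account still lacks for a given set of migration types. The `SOURCE_PRIVILEGES` mapping and the `missing_privileges` function are hypothetical helpers written for this example. The list of granted privileges is assumed to come from your own inspection of the account.

```python
# Privileges required on the source PolarDB-X instance per migration type,
# as listed in the permissions table.
SOURCE_PRIVILEGES = {
    "schema": {"SELECT"},
    "full": {"SELECT"},
    "incremental": {"REPLICATION SLAVE", "REPLICATION CLIENT", "SELECT"},
}


def missing_privileges(migration_types, granted):
    """Return the privileges the source account still needs, sorted by name."""
    required = set()
    for migration_type in migration_types:
        required |= SOURCE_PRIVILEGES[migration_type]
    granted_upper = {privilege.upper() for privilege in granted}
    return sorted(required - granted_upper)


print(missing_privileges(["schema", "full", "incremental"], ["select"]))
# prints ['REPLICATION CLIENT', 'REPLICATION SLAVE']
```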

Data type mapping

For details, see Data Type Mappings for Initial Schema Synchronization.

Procedure

  1. Navigate to the migration task list page for the destination region using one of the following methods.

    From the DTS console

    1. Log on to the Data Transmission Service (DTS) console.

    2. In the navigation pane on the left, click Data Migration.

    3. In the upper-left corner of the page, select the region where the migration instance is located.

    From the DMS console

    Note

    The actual operations may vary based on the mode and layout of the DMS console. For more information, see Simple mode console and Customize the layout and style of the DMS console.

    1. Log on to the Data Management (DMS) console.

    2. In the top menu bar, choose Data + AI > Data Transmission (DTS) > Data Migration.

    3. To the right of Data Migration Tasks, select the region where the migration instance is located.

  2. Click Create Task to navigate to the task configuration page.

  3. Configure the source and destination databases.

    Warning

    After selecting the source and destination instances, carefully read the Limits section at the top of the page. This helps ensure that the data migration task can be successfully created and run.

    Category

    Parameter

    Description

    N/A

    Task Name

    DTS automatically generates a task name. We recommend that you specify a descriptive name for easy identification. The name does not need to be unique.

    Source Database

    Select Existing Connection

    • To use a database instance that has been added to the system (created or saved), select the desired database instance from the drop-down list. The database information below will be automatically configured.

      Note

      In the DMS console, this parameter is named Select a DMS database instance.

    • If you have not registered the database instance with the system, or do not need to use a registered instance, manually configure the database information below.

    Database Type

    Select PolarDB-X 1.0.

    Connection Type

    Select Alibaba Cloud Instance.

    Instance Region

    Select the region where the source PolarDB-X 1.0 instance is located.

    Replicate Data Across Alibaba Cloud Accounts

    This example shows data migration within the same Alibaba Cloud account. Select No.

    Instance ID

    Select the ID of the source PolarDB-X 1.0 instance.

    Database Account

    Enter the database account for the source PolarDB-X 1.0 instance. Ensure the account has the required permissions.

    Database Password

    Enter the password for the database account.

    Destination Database

    Select Existing Connection

    • To use a database instance that has been added to the system (created or saved), select the desired database instance from the drop-down list. The database information below will be automatically configured.

      Note

      In the DMS console, this parameter is named Select a DMS database instance.

    • If you have not registered the database instance with the system, or do not need to use a registered instance, manually configure the database information below.

    Database Type

    Select Kafka.

    Connection Type

    Select Express Connect, VPN Gateway, or Smart Access Gateway.

    Note

    In this case, the Message Queue for Apache Kafka instance is treated as a self-managed Kafka instance for this data migration task.

    Instance Region

    Select the region where the destination Kafka instance is located.

    Connected VPC

    Select the ID of the VPC to which the destination Kafka instance belongs. You can find the VPC ID on the Basic Information page of the Kafka instance.

    Domain Name or IP

    Enter an IP address from the Default Endpoint of the Kafka instance.

    Note

    You can find the IP address of the Default Endpoint on the Basic Information page of the Kafka instance.

    Port

    The service port of the Kafka instance. The default value is 9092.

    Database Account

    Enter the database account and password for the destination Kafka instance.

    Note

    These parameters are required only for Message Queue for Apache Kafka instances with Access Control List (ACL) enabled. For more information about how to enable ACL, see SASL user authorization.

    Database Password

    Kafka Version

    Select the version of your Kafka instance.

    Encryption

    Based on your business and security requirements, select Non-encrypted or SCRAM-SHA-256.

    Topic

    From the drop-down list, select the topic to receive the migrated data.

    Use Kafka Schema Registry

    Kafka Schema Registry provides a serving layer for your metadata. It offers a RESTful interface for storing and retrieving Avro schemas.

    • No: Does not use Kafka Schema Registry.

    • Yes: Uses Kafka Schema Registry. You must enter the URL or IP address for your Avro schema as registered in the Kafka Schema Registry.
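
Before you test connectivity in DTS, it can help to confirm the same endpoint and credentials from your own Kafka client. The following sketch only builds a client configuration dictionary. The property names follow the librdkafka convention, the function name `kafka_client_config` is hypothetical, and the choice of `SASL_PLAINTEXT` is an assumption that you should verify against your instance (use `SASL_SSL` if the endpoint requires TLS).

```python
def kafka_client_config(host, port=9092, username=None, password=None):
    """Build a librdkafka-style client configuration dictionary."""
    config = {"bootstrap.servers": f"{host}:{port}"}
    if username is not None:
        # Corresponds to selecting SCRAM-SHA-256 for the Encryption parameter.
        # Assumption: SASL over an unencrypted listener.
        config.update({
            "security.protocol": "SASL_PLAINTEXT",
            "sasl.mechanism": "SCRAM-SHA-256",
            "sasl.username": username,
            "sasl.password": password,
        })
    return config


# 192.0.2.10 is a documentation-only address; replace it with an IP address
# from the Default Endpoint of your Kafka instance.
print(kafka_client_config("192.0.2.10"))
# prints {'bootstrap.servers': '192.0.2.10:9092'}
```

Omitting the user name corresponds to the Non-encrypted option, which applies to instances that do not have ACL enabled.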

  4. After you complete the configuration, click Test Connectivity and Proceed at the bottom of the page.

    Note
    • Ensure that the IP address segment of the DTS service is automatically or manually added to the security settings of the source and destination databases to allow access from DTS servers. For more information, see Add DTS server IP addresses to a whitelist.

    • If the source or destination database is a self-managed database (the Access Method is not Alibaba Cloud Instance), you must also click Test Connectivity in the CIDR Blocks of DTS Servers dialog box that appears.

  5. Configure the task objects.

    1. On the Configure Objects page, configure the objects that you want to migrate.

      Parameter

      Description

      Migration Types

      Select the migration types based on your requirements and the types supported by each engine.

      • If you only need to perform a full migration, select both Schema Migration and Full Data Migration.

      • To perform a migration with no downtime, select Schema Migration, Full Data Migration, and Incremental Data Migration.

      Note
      • If you do not select Schema Migration, you must ensure that a database and tables to receive the data exist in the destination database. You can also use the object name mapping feature in the Selected Objects box as needed.

      • If you do not select Incremental Data Migration, do not write new data to the source instance during data migration to ensure data consistency.

      Processing Mode for Existing Destination Tables

      • Precheck and Report Errors: Checks whether tables with the same names exist in the destination database. If no tables with the same names exist, the precheck is passed. If tables with the same names exist, an error is reported during the precheck, and the data migration task does not start.

        Note

        If a table in the destination database has the same name as a source table and cannot be easily deleted or renamed, you can use object name mapping to give the migrated table a different name in the destination database. For more information, see Object name mapping.

      • Ignore Errors and Proceed: Skips the check for tables with the same names.

        Warning

        Selecting Ignore Errors and Proceed may cause data inconsistency and business risks. For example:

        • If the table schemas are consistent and a record in the destination database has the same primary key value as a record in the source database:

          • During full migration, DTS keeps the record in the destination database. The record from the source database is not migrated.

          • During incremental migration, DTS does not keep the record in the destination database. The record from the source database overwrites the record in the destination database.

        • If the table schemas are inconsistent, only some columns of data may be migrated, or the migration may fail. Proceed with caution.

      Data Format in Kafka

      Select the data format for storing data in the destination Kafka instance.

      Note

      PolarDB-X 1.0 does not support Canal Json. You must select DTS Avro.

      Kafka Data Compression Format

      Select a compression format for Kafka messages based on your requirements.

      • LZ4 (default): low compression ratio, high compression speed.

      • GZIP: high compression ratio, low compression speed.

        Note

        High CPU consumption.

      • Snappy: medium compression ratio, medium compression speed.

      Policy for Shipping Data to Kafka Partitions

      This feature is not currently supported.

      Message acknowledgement mechanism

      Select a message acknowledgment mechanism based on your business requirements.

      Capitalization of Object Names in Destination Instance

      You can configure the capitalization policy for the names of migrated objects, such as databases, tables, and columns, in the destination instance. By default, the DTS default policy is selected. You can also keep the capitalization consistent with the default policy of the source or destination database. For more information, see Case sensitivity of object names in the destination database.

      Source Objects

      In the Source Objects box, click the objects to migrate, and then click Right arrow to move them to the Selected Objects box.

      Note

      Select tables as the migration objects. If you select an entire database, changes such as adding or deleting tables in that database are not migrated to the destination database.

      Selected Objects

      No additional configuration is required for this example. You can use the mapping feature to set the topic name, number of partitions, and partition key for the source table in the destination Kafka instance. For more information, see Map object names.

      Note
      • If you use the object name mapping feature, other objects that depend on the mapped object may fail to be migrated.

      • To select the SQL operations for incremental migration, right-click an object in the Selected Objects box and select the desired SQL operations in the dialog box that appears.

    2. Click Next: Advanced Settings to configure advanced parameters.

      Parameter

      Description

      Dedicated Cluster for Task Scheduling

      You do not need to select a dedicated cluster for this example. For more information about dedicated clusters, see What is a DTS dedicated cluster?.

      Retry Time for Failed Connections

      After the migration task starts, if the connection to the source or destination database fails, DTS reports an error and immediately begins to retry the connection. The default retry duration is 720 minutes. You can customize the retry time to a value from 10 to 1440 minutes. We recommend that you set the duration to more than 30 minutes. If DTS reconnects to the source and destination databases within the specified duration, the migration task automatically resumes. Otherwise, the task fails.

      Note
      • For multiple DTS instances that share the same source or destination, the network retry time is determined by the setting of the last created task.

      • Because you are charged for the task during the connection retry period, we recommend that you customize the retry time based on your business needs, or release the DTS instance as soon as possible after the source and destination database instances are released.

      Retry Time for Other Issues

      After the migration task starts, if a non-connectivity issue, such as a DDL or DML execution exception, occurs in the source or destination database, DTS reports an error and immediately begins to retry the operation. The default retry duration is 10 minutes. You can customize the retry time to a value from 1 to 1440 minutes. We recommend that you set the duration to more than 10 minutes. If the related operations succeed within the specified retry duration, the migration task automatically resumes. Otherwise, the task fails.

      Important

      The value of Retry Time for Other Issues must be less than the value of Retry Time for Failed Connections.

      Enable Throttling for Full Data Migration

      During full migration, DTS consumes read and write resources on the source and destination databases, which may increase the database load. If required, you can enable throttling for the full migration task. You can set Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s) to reduce the load on the destination database.

      Note
      • This configuration item is available only if you select Full Data Migration for Migration Types.

      • You can also adjust the full migration speed after the migration instance is running.

      Enable Throttling for Incremental Data Migration

      If required, you can also choose to set speed limits for the incremental migration task. You can set RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s) to reduce the load on the destination database.

      Note
      • This configuration item is available only if you select Incremental Data Migration for Migration Types.

      • You can also adjust the incremental migration speed after the migration instance is running.

      Environment Tag

      You can select an environment tag to identify the instance based on your requirements. This is not required for this example.

      Configure ETL

      Choose whether to enable the extract, transform, and load (ETL) feature. For more information, see What is ETL?

      Monitoring and Alerting

      Select whether to set alerts and receive alert notifications based on your business needs.

      • No: Does not set an alert.

      • Yes: Configure alerts by setting an alert threshold and an alert contact. If a migration fails or the latency exceeds the threshold, the system sends an alert notification.
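
The constraints on the two retry settings described above can be expressed as a small validation routine. This is a sketch for illustration only. The function name `validate_retry_times` is hypothetical, and the ranges and defaults are the ones stated in the parameter descriptions.

```python
def validate_retry_times(connection_minutes=720, other_minutes=10):
    """Check the two retry settings against the documented ranges.

    Returns a list of error messages; an empty list means both values are valid.
    """
    errors = []
    if not 10 <= connection_minutes <= 1440:
        errors.append("Retry Time for Failed Connections must be 10 to 1440 minutes")
    if not 1 <= other_minutes <= 1440:
        errors.append("Retry Time for Other Issues must be 1 to 1440 minutes")
    if other_minutes >= connection_minutes:
        errors.append(
            "Retry Time for Other Issues must be less than "
            "Retry Time for Failed Connections"
        )
    return errors


print(validate_retry_times())        # defaults are valid: prints []
print(validate_retry_times(30, 30))  # equal values violate the ordering rule
```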

  6. Save the task and run a precheck.

    • To view the parameters for configuring this instance when you call the API operation, move the pointer over the Next: Save Task Settings and Precheck button and click Preview OpenAPI parameters in the bubble that appears.

    • If you do not need to view or have finished viewing the API parameters, click Next: Save Task Settings and Precheck at the bottom of the page.

    Note
    • Before the migration task starts, DTS performs a precheck. The task starts only after it passes the precheck.

    • If the precheck fails, click View Details next to the failed check item, fix the issue based on the prompt, and then run the precheck again.

    • If a warning is reported during the precheck:

      • For check items that cannot be ignored, click View Details next to the failed item, fix the issue based on the prompt, and then run the precheck again.

      • For check items that can be ignored, click Confirm Alert Details, Ignore, OK, and Precheck Again in sequence to skip the alert item and run the precheck again. Ignoring a warning may cause issues such as data inconsistency and pose risks to your business.

  7. Purchase the instance.

    1. When the Success Rate is 100%, click Next: Purchase Instance.

    2. On the Purchase page, select the instance class for the data migration instance. For more information, see the following table.

      Category

      Parameter

      Description

      New Instance Class

      Resource Group Settings

      Select the resource group to which the instance belongs. The default value is default resource group. For more information, see What is Resource Management?

      Instance Class

      DTS provides instance classes with different performance levels. The instance class affects the migration speed. Select an instance class based on your business scenario. For more information, see Data migration link specifications.

    3. After the configuration is complete, read and select Data Transmission Service (Pay-as-you-go) Service Terms.

    4. Click Buy and Start. In the OK dialog box that appears, click OK.

      You can view the progress of the migration task on the Data Migration Tasks list page.

      Note
      • If the migration task does not include incremental migration, it stops automatically after the full migration is complete. After the task stops, its Status changes to Completed.

      • If the migration task includes incremental migration, it does not stop automatically. The incremental migration task continues to run. While the incremental migration task is running, the Status of the task is Running.

Mapping information

  1. In the Selected Objects area, hover over the destination topic name at the table level.

  2. Click Edit next to the destination topic name.

  3. In the Edit Table dialog box, configure the mapping information.

    Note
    • The dialog box is named Edit Schema at the database level and Edit Table at the table level. The Edit Schema dialog box has fewer parameters.

    • If you do not migrate the entire database, you cannot modify the Name of target Topic and Number of Partitions parameters in the Edit Schema dialog box.

    Parameter

    Description

    Name of target Topic

    The name of the topic that receives data from the source table. This defaults to the Topic you selected in the Destination Database section during the Configurations for Source and Destination Databases step.

    Important
    • If the destination is a Message Queue for Apache Kafka instance, the specified topic must already exist in the destination instance. Otherwise, the data migration task fails. If the destination is a self-managed Kafka cluster and the data migration task includes schema migration, DTS attempts to create the specified topic in the destination cluster.

    • If you change the Name of target Topic, DTS writes data to the specified topic.

    Filter Conditions

    For more information, see Configure filter conditions.

    Number of Partitions

    The number of partitions in the destination topic.

    Partition Key

    This parameter is available when you set Policy for Shipping Data to Kafka Partitions to Ship Data to Separate Partitions Based on Hash Values of Primary Keys. You can specify one or more columns as the partition key. DTS then calculates a hash value for the specified columns and uses it to distribute rows across partitions in the destination topic.

    Note

    You can select Partition Key only in the Edit Table dialog box.

  4. Click OK.
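
To illustrate how a partition key spreads rows across partitions, the following sketch hashes the selected columns and takes the result modulo the number of partitions. It does not reproduce the hash function that DTS uses, which is not described in this topic. SHA-256 is used here only because it gives the same result across processes, and the function name `choose_partition` is hypothetical.

```python
import hashlib


def choose_partition(row, key_columns, num_partitions):
    """Map a row to a partition from a hash of its partition-key columns."""
    # Join the key column values with a separator that is unlikely to
    # appear in the data, so ("ab", "c") and ("a", "bc") hash differently.
    key = "\x1f".join(str(row[column]) for column in key_columns)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


first = {"id": 42, "name": "alice"}
second = {"id": 42, "name": "bob"}
# Rows that share the same partition key always land in the same partition.
print(choose_partition(first, ["id"], 12) == choose_partition(second, ["id"], 12))
# prints True
```

Because rows with the same key value always map to the same partition, changes to a given row keep their relative order within that partition.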