Single-table real-time sync


DataWorks real-time sync keeps a destination database consistent with the source through continuous single-table or full-database synchronization.

Core capabilities

Real-time sync capabilities:

Capability

Description

Multi-source data sync

Combine different source and destination data sources into sync pipelines. For details, see Supported data sources and sync solutions.

Complex network environments

Supports Alibaba Cloud databases, on-premises databases, self-managed databases on ECS, and databases from other cloud providers. Ensure network connectivity between your resource group and both the source and the destination. For details, see Network connectivity solutions.

Sync scenarios

Two primary scenarios: single source table to single destination table, and incremental data from sharded databases and tables to a single destination table.

  • Data Integration & Data Studio (New): Wizard-based interface for single-table ETL sync. Includes data processing, data sampling, simulation runs, and advanced parameter settings.

  • Data Studio (Legacy): Drag-and-drop interface for single-table ETL sync. Supports data filtering, string replacement, and data masking.

Task configuration

Configure code-free, real-time ETL pipelines for single tables. For details, see Configure a real-time sync task for a single table.

Single-table real-time sync:

  • Configuration method: Low-code development through a drag-and-drop GUI or wizard. No coding required.

  • Field mapping: Same-name mapping, same-order mapping, and custom field relationships. For unmatched source fields, specify a policy to add a column, ignore the field, or report an error. Assign values to destination fields dynamically using a constant, variable, or function.

  • Data processing: Use Data Filtering, Replace String, Data Masking, and JSON Parsing to process source data and write it to the destination database.

  • Code debugging: Sample source data, view intermediate results at each step, and simulate final output with a Dry Run. A Dry Run does not write to the destination table, protecting production data during debugging.
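
The field mapping behavior described above can be pictured with a small sketch. This is not DataWorks code: `UnmatchedPolicy`, `map_record`, and all parameter names are hypothetical, and the sketch only assumes that unmapped fields fall back to same-name mapping and that unmatched fields follow one of the three policies (add a column, ignore the field, or report an error).

```python
from enum import Enum


class UnmatchedPolicy(Enum):
    """Hypothetical names for the three unmatched-field policies."""
    ADD_COLUMN = "add_column"
    IGNORE = "ignore"
    ERROR = "error"


def map_record(record, mapping, dest_columns, policy, assignments=None):
    """Map one source record onto destination columns.

    mapping: source field -> destination column (custom relationships);
        fields missing from it fall back to same-name mapping.
    dest_columns: set of destination column names; extended in place
        when the policy is ADD_COLUMN.
    assignments: destination column -> constant or zero-argument callable.
    """
    out = {}
    for field, value in record.items():
        target = mapping.get(field, field)
        if target not in dest_columns:
            if policy is UnmatchedPolicy.ERROR:
                raise KeyError(f"no destination column for source field {field!r}")
            if policy is UnmatchedPolicy.IGNORE:
                continue
            dest_columns.add(target)  # ADD_COLUMN: extend the destination schema
        out[target] = value
    # Dynamic assignment: constants are copied, callables are evaluated per record.
    for column, source in (assignments or {}).items():
        out[column] = source() if callable(source) else source
    return out


columns = {"id", "user_name"}
row = map_record(
    {"id": 1, "name": "a", "extra": "x"},
    {"name": "user_name"},
    columns,
    UnmatchedPolicy.IGNORE,
    assignments={"src": "kafka"},
)
```

With the ignore policy, the unmatched `extra` field is dropped and the constant `src` value is attached to every record. Switching to `UnmatchedPolicy.ADD_COLUMN` would add `extra` to `columns` instead.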

Task O&M

Monitor and alert on sync task status.

  • Breakpoint resumption: If a task is interrupted, specify a time point to resume from and ensure data integrity.

  • Configure alerts for business latency, failover, DDL policies, and heartbeat detection. For details, see O&M for real-time sync tasks.

  • Alert notifications are sent via email, SMS, phone, or DingTalk to help you quickly identify task exceptions.

  • To prevent alert fatigue, set rules to throttle notifications within a specified interval.

  • The heartbeat alert feature is automatically enabled when a task starts and disabled when it stops. Manual settings for this feature are retained.
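
As an illustration of the throttling rule mentioned above, the following sketch suppresses repeat notifications for the same alert rule inside a fixed interval. It is a conceptual model only: `AlertThrottle`, `should_send`, and the rule keys are hypothetical names, and the actual throttling logic in DataWorks may differ.

```python
class AlertThrottle:
    """Allow at most one notification per rule within each interval."""

    def __init__(self, interval_seconds):
        self.interval_seconds = interval_seconds
        self._last_sent = {}  # rule key -> timestamp of the last notification sent

    def should_send(self, rule_key, now):
        """Return True and record the send time if the rule is outside its interval."""
        last = self._last_sent.get(rule_key)
        if last is not None and now - last < self.interval_seconds:
            return False  # still inside the quiet interval: suppress
        self._last_sent[rule_key] = now
        return True


throttle = AlertThrottle(interval_seconds=300)
first = throttle.should_send("business_latency", now=0)
repeat = throttle.should_send("business_latency", now=100)
other_rule = throttle.should_send("heartbeat", now=100)
after_interval = throttle.should_send("business_latency", now=300)
```

Each rule is tracked separately, so a burst of latency alerts does not hide a heartbeat alert raised in the same window.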

Note
  • Real-time sync tasks cannot run from Data Studio. Save and submit the node, then run it in Operation Center in the production environment.

  • Real-time sync does not support synchronizing views.

Supported data sources

Source: Kafka, Hologres, Oracle, LogHub, and DataHub.

Destination: ApsaraDB for OceanBase, Data Lake Formation (DLF), Doris, Hologres, Kafka, MaxCompute, OSS, OSS-HDFS, StarRocks, Tablestore, and Lindorm.

Data processing: data filtering, string replacement, data masking, JSON parsing, and field editing and assignment.
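
To make the processing steps concrete, here is a minimal sketch of what a filter, replace, mask, and JSON-parse chain does to a single record. In DataWorks these steps are configured in the GUI, not written as code; the `process` function, the field names (`amount`, `city`, `phone`, `payload`), and the specific rules are assumptions chosen for illustration.

```python
import json


def process(record):
    """Apply filter, replace, mask, and JSON-parse steps to one record.

    Returns the transformed record, or None if the filter drops it.
    """
    # Data filtering: keep only records with a positive amount.
    if record.get("amount", 0) <= 0:
        return None
    out = dict(record)  # leave the source record untouched
    # String replacement: normalize underscores to spaces.
    out["city"] = out["city"].replace("_", " ")
    # Data masking: expose only the last four characters of the phone number.
    phone = out["phone"]
    out["phone"] = "*" * (len(phone) - 4) + phone[-4:]
    # JSON parsing: lift one key out of a JSON string field.
    payload = json.loads(out.pop("payload"))
    out["device"] = payload.get("device")
    return out


sample = {
    "amount": 5,
    "city": "new_york",
    "phone": "13812345678",
    "payload": '{"device": "ios"}',
}
result = process(sample)
```

Because `process` returns a new dictionary and writes nowhere, running it over sampled records resembles a Dry Run: the output can be inspected without touching a destination table.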

Get started

Create your first task: Configure a real-time sync task for a single table.

FAQ

For common questions, see FAQ about real-time sync.