Fields used for real-time synchronization

更新时间:
复制 MD 格式

When Data Integration synchronizes data in real time from sources such as MySQL, Loghub, and PolarDB to DataHub or Kafka, it adds five metadata fields to every record in the destination. These fields support metadata management, sorting, and deduplication in downstream processing logic.

Record format

Each synchronized record contains the five metadata fields plus the actual data fields from the source table. The following example shows the record format for MySQL real-time synchronization.

_sequence_id_ _operation_type_ _execute_time_ _before_image_ _after_image_ Field 1 Field 2 Field 3
The record ID of an incremental event. The value is unique and auto-incrementing. The operation type (I, D, or U). The timestamp of the data. Indicates whether the record contains the data before the change (Y/N). Indicates whether the record contains the data after the change (Y/N). Actual data field 1. Actual data field 2. Actual data field 3.

Metadata field reference

Field Type Description
_sequence_id_ STRING A unique, auto-incrementing ID assigned to each incremental event. Use this field to merge full data and incremental data and to order records.
_operation_type_ STRING The type of database operation. Valid values: I (INSERT), D (DELETE), U (UPDATE).
_execute_time_ LONG The binlog timestamp when the source database generated the change event.
_before_image_ STRING Whether the record holds the pre-change version of the row. Valid values: Y and N. Use this field together with _after_image_ to identify which version of a row a record represents.
_after_image_ STRING Whether the record holds the post-change version of the row. Valid values: Y and N. Use this field together with _before_image_ to identify which version of a row a record represents.

Field values by operation type

The _before_image_ and _after_image_ field values differ by operation type. For UPDATE operations, Data Integration splits one source change into two records — one for the original row and one for the updated row. Both records share the same _sequence_id_, _operation_type_, and _execute_time_ values.

Operation Records generated _before_image_ _after_image_ Row content
INSERT 1 N Y The inserted row.
UPDATE (original row) 2 (paired) Y N The row data before the update.
UPDATE (updated row) 2 (paired) N Y The row data after the update.
DELETE 1 Y N The deleted row.