When Data Integration synchronizes data in real time from sources such as MySQL, Loghub, and PolarDB to DataHub or Kafka, it adds five metadata fields to every record in the destination. These fields support metadata management, sorting, and deduplication in downstream processing logic.
Record format
Each synchronized record contains the five metadata fields plus the actual data fields from the source table. The following example shows the record format for MySQL real-time synchronization.
_sequence_id_ |
_operation_type_ |
_execute_time_ |
_before_image_ |
_after_image_ |
Field 1 | Field 2 | Field 3 |
|---|---|---|---|---|---|---|---|
| The record ID of an incremental event. The value is unique and auto-incrementing. | The operation type (I, D, or U). | The timestamp of the data. | Indicates whether the record contains the data before the change (Y/N). | Indicates whether the record contains the data after the change (Y/N). | Actual data field 1. | Actual data field 2. | Actual data field 3. |
Metadata field reference
| Field | Type | Description |
|---|---|---|
_sequence_id_ |
STRING | A unique, auto-incrementing ID assigned to each incremental event. Use this field to merge full data and incremental data and to order records. |
_operation_type_ |
STRING | The type of database operation. Valid values: I (INSERT), D (DELETE), U (UPDATE). |
_execute_time_ |
LONG | The binlog timestamp when the source database generated the change event. |
_before_image_ |
STRING | Whether the record holds the pre-change version of the row. Valid values: Y and N. Use this field together with _after_image_ to identify which version of a row a record represents. |
_after_image_ |
STRING | Whether the record holds the post-change version of the row. Valid values: Y and N. Use this field together with _before_image_ to identify which version of a row a record represents. |
Field values by operation type
The _before_image_ and _after_image_ field values differ by operation type. For UPDATE operations, Data Integration splits one source change into two records — one for the original row and one for the updated row. Both records share the same _sequence_id_, _operation_type_, and _execute_time_ values.
| Operation | Records generated | _before_image_ |
_after_image_ |
Row content |
|---|---|---|---|---|
| INSERT | 1 | N |
Y |
The inserted row. |
| UPDATE (original row) | 2 (paired) | Y |
N |
The row data before the update. |
| UPDATE (updated row) | 2 (paired) | N |
Y |
The row data after the update. |
| DELETE | 1 | Y |
N |
The deleted row. |