This topic describes how to use the FirstRow Merge Engine.
FirstRow Merge Engine description
Merge policy
For each primary key, only the first record is retained. Subsequent records with the same primary key are discarded.
Configuration
You can enable this engine by setting the table property 'table.merge-engine' = 'first_row' when you create a table.
Changelog attributes
Generates an insert-only changelog.
Allows downstream Flink jobs to treat the primary key table as an append-only log table.
Scenarios
Suitable for downstream operations that cannot consume retractions or update changelogs, such as window aggregations and interval joins.
In stream processing, you can use this engine to remove duplicates from logs. This reduces processing complexity and improves efficiency.
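The log deduplication scenario can be sketched as follows. This is a minimal illustration, not a reference implementation: the table names raw_logs and dedup_logs and their columns are hypothetical, and raw_logs is assumed to be an existing append-only source that may contain repeated log_id values.

```sql
-- Hypothetical target table: keeps only the first record seen for each log_id.
CREATE TABLE dedup_logs (
  log_id STRING,
  event_time TIMESTAMP(3),
  message STRING,
  PRIMARY KEY (log_id) NOT ENFORCED
) WITH (
  'table.merge-engine' = 'first_row'
);

-- Assumed source table raw_logs: later records with an already-seen log_id are discarded on write.
INSERT INTO dedup_logs
SELECT log_id, event_time, message
FROM raw_logs;
```

Because the resulting changelog is insert-only, a downstream job can read dedup_logs as if it were an append-only log table.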
Limits
Does not support UPDATE or DELETE operations.
Does not support partial updates.
UPDATE_BEFORE and DELETE events in the changelog are automatically ignored.
Example
-- Create table T, set the primary key to k, and enable the FirstRow Merge Engine.
CREATE TABLE T (
  k INT,
  v1 DOUBLE,
  v2 STRING,
  PRIMARY KEY (k) NOT ENFORCED
) WITH (
  'table.merge-engine' = 'first_row'
);
-- Insert two records with the same primary key.
INSERT INTO T VALUES (1, 2.0, 't1');
INSERT INTO T VALUES (1, 3.0, 't2');
-- Query the record where the primary key is 1. Only the first record is returned.
SELECT * FROM T WHERE k = 1;
-- Output
-- +---+-----+------+
-- | k | v1  | v2   |
-- +---+-----+------+
-- | 1 | 2.0 | t1   |
-- +---+-----+------+
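As a further sketch that reuses table T from the example, the statements below illustrate that the merge policy applies per primary key. The inserted values are made up for illustration, and the comments describe what the stated merge policy implies rather than captured output.

```sql
-- A record with a new primary key is the first record for that key, so it is retained.
INSERT INTO T VALUES (2, 4.0, 't3');

-- A later record with the same primary key is discarded.
INSERT INTO T VALUES (2, 5.0, 't4');

-- By the merge policy, this query should return the first record for k = 2: (2, 4.0, 't3').
SELECT * FROM T WHERE k = 2;
```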