Primary key

更新时间: 2026-03-26 22:18:32

Set a primary key (PK) on a table to enforce record uniqueness, enable high-performance UPSERT operations, and support millisecond-level point queries. This topic describes how to define a primary key when creating a table in Hologres.

In Hologres, the system physically maintains a dedicated primary key index file in the underlying storage layer—understanding this mechanism helps you make better table design decisions.

How it works

Hologres automatically maintains a primary key index file in the underlying storage layer. This file uses a row store structure to provide a high-speed key-value (KV) service. The key in the index file is the table's primary key, and the value consists of the Row Identifier (RID)—formerly unique_id—and the Clustering Key.

The RID is automatically generated and monotonically increases with each UPSERT operation. The primary key index file serves two purposes:

  • Conflict detection: The index enables efficient primary key conflict detection, enforcing uniqueness without a full table scan.

  • Data file lookup: Given a primary key value, Hologres uses the index to retrieve the RID and Clustering Key, then locates the corresponding data file directly.

Use cases

A primary key enables two performance-critical capabilities in real-time data warehouse scenarios:

  • High-performance UPSERT and DELETE: Hologres updates or deletes records by looking up the primary key in the index—no full table scan required. This enables high-throughput write updates while keeping data unique.

  • High-QPS point queries: For row-oriented tables, the primary key defaults to both the Clustering Key and Distribution Key. Use the primary key to locate data files directly, achieving point queries with millisecond-level latency—suitable for online use cases such as real-time risk control and real-time recommendations.

For a broader comparison of how primary keys interact with each storage format, see Table storage formats: Column-oriented, row-oriented, and hybrid row-column. For more information about the principles, see Hologres Online Service Capabilities Revealed.

Table type requirements

Primary key requirements vary by table storage format:

Table type PK required? Default clustering key / distribution key UPSERT supported?
Column-oriented No Yes, with PK
Row-oriented Yes Defaults to PK Yes
Hybrid row-column Yes Yes
Partitioned Depends on definition Depends on definition Yes, with PK

Usage notes

Choose fields with business meaning as the primary key. Do not use SERIAL type fields as a primary key: SERIAL acquires a table lock during writes, which degrades write performance, and the sequence can overflow as data grows.

Limitations

Constraint Detail
Uniqueness and nullability A primary key column (or every column in a composite primary key) must be unique and NOT NULL.
Composite primary key size Up to 32 columns.
Single-statement requirement Set multiple columns as a composite primary key in a single DDL statement only.
Supported data types Cannot use FLOAT, DOUBLE, NUMERIC, ARRAY, JSON, JSONB, DATE, or other complex data types as a primary key. Hologres V1.3.22 and later support using DATE type fields as a primary key—check your instance version and upgrade if needed. See Instance details and Upgrade an instance.
Immutability A primary key cannot be modified after table creation. To change a primary key, recreate the table.

Set a primary key

The examples below apply to Hologres V2.1 and later, which use the WITH (property = 'value') syntax. For V2.0 and earlier, replace WITH (property = 'value') with the CALL set_table_property statement. For more information, see CREATE TABLE.

Column-oriented table with a single primary key

V2.1 and later:

CREATE TABLE tbl_1 (
    id bigint NOT NULL,
    name text NOT NULL,
    age bigint NOT NULL,
    class text,
    reg_timestamp timestamptz NOT NULL,
    PRIMARY KEY (id)
)
WITH (
    orientation = 'column',
    distribution_key = 'id',
    clustering_key = 'age',
    event_time_column = 'reg_timestamp',
    bitmap_columns = 'name,class',
    dictionary_encoding_columns = 'class:auto'
);

All versions:

BEGIN;
CREATE TABLE tbl_1 (
    id bigint NOT NULL,
    name text NOT NULL,
    age bigint,
    class text,
    reg_timestamp timestamptz,
    PRIMARY KEY (id)
);
CALL set_table_property('tbl_1', 'orientation', 'column');
CALL set_table_property('tbl_1', 'distribution_key', 'id');
CALL set_table_property('tbl_1', 'clustering_key', 'age');
CALL set_table_property('tbl_1', 'event_time_column', 'reg_timestamp');
CALL set_table_property('tbl_1', 'bitmap_columns', 'name,class');
CALL set_table_property('tbl_1', 'dictionary_encoding_columns', 'class:auto');
COMMIT;

Column-oriented table with a composite primary key

V2.1 and later:

CREATE TABLE tbl_1 (
    id bigint NOT NULL,
    name text NOT NULL,
    age bigint NOT NULL,
    class text NOT NULL,
    reg_timestamp timestamptz NOT NULL,
    PRIMARY KEY (id, age)
)
WITH (
    orientation = 'column',
    distribution_key = 'id',
    clustering_key = 'age',
    event_time_column = 'reg_timestamp',
    bitmap_columns = 'name,class',
    dictionary_encoding_columns = 'class:auto'
);

All versions:

BEGIN;
CREATE TABLE tbl_2 (
    id bigint NOT NULL,
    name text NOT NULL,
    age bigint NOT NULL,
    class text NOT NULL,
    reg_timestamp timestamptz NOT NULL,
    PRIMARY KEY (id, age)
);
CALL set_table_property('tbl_2', 'orientation', 'column');
CALL set_table_property('tbl_2', 'distribution_key', 'id');
CALL set_table_property('tbl_2', 'clustering_key', 'age');
CALL set_table_property('tbl_2', 'event_time_column', 'reg_timestamp');
CALL set_table_property('tbl_2', 'bitmap_columns', 'name,class');
CALL set_table_property('tbl_2', 'dictionary_encoding_columns', 'class:auto');
COMMIT;

Row-oriented table with a primary key

V2.1 and later:

CREATE TABLE public.tbl_row (
    id text NOT NULL,
    name text NOT NULL,
    class text,
    PRIMARY KEY (id)
)
WITH (
    orientation = 'row',
    distribution_key = 'id',
    clustering_key = 'id'
);

All versions:

BEGIN;
CREATE TABLE public.tbl_row (
    id text NOT NULL,
    name text NOT NULL,
    class text,
    PRIMARY KEY (id)
);
CALL set_table_property('public.tbl_row', 'orientation', 'row');
CALL set_table_property('public.tbl_row', 'clustering_key', 'id');
CALL set_table_property('public.tbl_row', 'distribution_key', 'id');
COMMIT;

Partitioned table with a primary key

V2.1 and later:

BEGIN;
CREATE TABLE public.tbl_parent (
    a text,
    b int,
    c timestamp,
    d text,
    ds text,
    PRIMARY KEY (ds, b)
)
PARTITION BY LIST(ds)
WITH (orientation = 'column');
CREATE TABLE public.tbl_child_1 PARTITION OF public.tbl_parent FOR VALUES IN ('20221207');
CREATE TABLE public.tbl_child_2 PARTITION OF public.tbl_parent FOR VALUES IN ('20221208');
COMMIT;

All versions:

BEGIN;
CREATE TABLE public.tbl_parent (
    a text,
    b int,
    c timestamp,
    d text,
    ds text,
    PRIMARY KEY (ds, b)
)
PARTITION BY LIST(ds);
CALL set_table_property('public.tbl_parent', 'orientation', 'column');
CREATE TABLE public.tbl_child_1 PARTITION OF public.tbl_parent FOR VALUES IN ('20221207');
CREATE TABLE public.tbl_child_2 PARTITION OF public.tbl_parent FOR VALUES IN ('20221208');
COMMIT;

What's next

上一篇: Best practices for table group setup 下一篇: Distribution key
阿里云首页 实时数仓 Hologres 相关技术圈