Primary key
Set a primary key (PK) on a table to enforce record uniqueness, enable high-performance UPSERT operations, and support millisecond-level point queries. This topic describes how to define a primary key when creating a table in Hologres.
In Hologres, the system physically maintains a dedicated primary key index file in the underlying storage layer—understanding this mechanism helps you make better table design decisions.
How it works
Hologres automatically maintains a primary key index file in the underlying storage layer. This file uses a row store structure to provide a high-speed key-value (KV) service. The key in the index file is the table's primary key, and the value consists of the Row Identifier (RID)—formerly unique_id—and the Clustering Key.
The RID is automatically generated and monotonically increases with each UPSERT operation. The primary key index file serves two purposes:
-
Conflict detection: The index enables efficient primary key conflict detection, enforcing uniqueness without a full table scan.
-
Data file lookup: Given a primary key value, Hologres uses the index to retrieve the RID and Clustering Key, then locates the corresponding data file directly.
Use cases
A primary key enables two performance-critical capabilities in real-time data warehouse scenarios:
-
High-performance UPSERT and DELETE: Hologres updates or deletes records by looking up the primary key in the index—no full table scan required. This enables high-throughput write updates while keeping data unique.
-
High-QPS point queries: For row-oriented tables, the primary key defaults to both the Clustering Key and Distribution Key. Use the primary key to locate data files directly, achieving point queries with millisecond-level latency—suitable for online use cases such as real-time risk control and real-time recommendations.
For a broader comparison of how primary keys interact with each storage format, see Table storage formats: Column-oriented, row-oriented, and hybrid row-column. For more information about the principles, see Hologres Online Service Capabilities Revealed.
Table type requirements
Primary key requirements vary by table storage format:
| Table type | PK required? | Default clustering key / distribution key | UPSERT supported? |
|---|---|---|---|
| Column-oriented | No | — | Yes, with PK |
| Row-oriented | Yes | Defaults to PK | Yes |
| Hybrid row-column | Yes | — | Yes |
| Partitioned | Depends on definition | Depends on definition | Yes, with PK |
Usage notes
Choose fields with business meaning as the primary key. Do not use SERIAL type fields as a primary key: SERIAL acquires a table lock during writes, which degrades write performance, and the sequence can overflow as data grows.
Limitations
| Constraint | Detail |
|---|---|
| Uniqueness and nullability | A primary key column (or every column in a composite primary key) must be unique and NOT NULL. |
| Composite primary key size | Up to 32 columns. |
| Single-statement requirement | Set multiple columns as a composite primary key in a single DDL statement only. |
| Supported data types | Cannot use FLOAT, DOUBLE, NUMERIC, ARRAY, JSON, JSONB, DATE, or other complex data types as a primary key. Hologres V1.3.22 and later support using DATE type fields as a primary key—check your instance version and upgrade if needed. See Instance details and Upgrade an instance. |
| Immutability | A primary key cannot be modified after table creation. To change a primary key, recreate the table. |
Set a primary key
The examples below apply to Hologres V2.1 and later, which use the WITH (property = 'value') syntax. For V2.0 and earlier, replace WITH (property = 'value') with the CALL set_table_property statement. For more information, see CREATE TABLE.
Column-oriented table with a single primary key
V2.1 and later:
CREATE TABLE tbl_1 (
id bigint NOT NULL,
name text NOT NULL,
age bigint NOT NULL,
class text,
reg_timestamp timestamptz NOT NULL,
PRIMARY KEY (id)
)
WITH (
orientation = 'column',
distribution_key = 'id',
clustering_key = 'age',
event_time_column = 'reg_timestamp',
bitmap_columns = 'name,class',
dictionary_encoding_columns = 'class:auto'
);
All versions:
BEGIN;
CREATE TABLE tbl_1 (
id bigint NOT NULL,
name text NOT NULL,
age bigint,
class text,
reg_timestamp timestamptz,
PRIMARY KEY (id)
);
CALL set_table_property('tbl_1', 'orientation', 'column');
CALL set_table_property('tbl_1', 'distribution_key', 'id');
CALL set_table_property('tbl_1', 'clustering_key', 'age');
CALL set_table_property('tbl_1', 'event_time_column', 'reg_timestamp');
CALL set_table_property('tbl_1', 'bitmap_columns', 'name,class');
CALL set_table_property('tbl_1', 'dictionary_encoding_columns', 'class:auto');
COMMIT;
Column-oriented table with a composite primary key
V2.1 and later:
CREATE TABLE tbl_1 (
id bigint NOT NULL,
name text NOT NULL,
age bigint NOT NULL,
class text NOT NULL,
reg_timestamp timestamptz NOT NULL,
PRIMARY KEY (id, age)
)
WITH (
orientation = 'column',
distribution_key = 'id',
clustering_key = 'age',
event_time_column = 'reg_timestamp',
bitmap_columns = 'name,class',
dictionary_encoding_columns = 'class:auto'
);
All versions:
BEGIN;
CREATE TABLE tbl_2 (
id bigint NOT NULL,
name text NOT NULL,
age bigint NOT NULL,
class text NOT NULL,
reg_timestamp timestamptz NOT NULL,
PRIMARY KEY (id, age)
);
CALL set_table_property('tbl_2', 'orientation', 'column');
CALL set_table_property('tbl_2', 'distribution_key', 'id');
CALL set_table_property('tbl_2', 'clustering_key', 'age');
CALL set_table_property('tbl_2', 'event_time_column', 'reg_timestamp');
CALL set_table_property('tbl_2', 'bitmap_columns', 'name,class');
CALL set_table_property('tbl_2', 'dictionary_encoding_columns', 'class:auto');
COMMIT;
Row-oriented table with a primary key
V2.1 and later:
CREATE TABLE public.tbl_row (
id text NOT NULL,
name text NOT NULL,
class text,
PRIMARY KEY (id)
)
WITH (
orientation = 'row',
distribution_key = 'id',
clustering_key = 'id'
);
All versions:
BEGIN;
CREATE TABLE public.tbl_row (
id text NOT NULL,
name text NOT NULL,
class text,
PRIMARY KEY (id)
);
CALL set_table_property('public.tbl_row', 'orientation', 'row');
CALL set_table_property('public.tbl_row', 'clustering_key', 'id');
CALL set_table_property('public.tbl_row', 'distribution_key', 'id');
COMMIT;
Partitioned table with a primary key
V2.1 and later:
BEGIN;
CREATE TABLE public.tbl_parent (
a text,
b int,
c timestamp,
d text,
ds text,
PRIMARY KEY (ds, b)
)
PARTITION BY LIST(ds)
WITH (orientation = 'column');
CREATE TABLE public.tbl_child_1 PARTITION OF public.tbl_parent FOR VALUES IN ('20221207');
CREATE TABLE public.tbl_child_2 PARTITION OF public.tbl_parent FOR VALUES IN ('20221208');
COMMIT;
All versions:
BEGIN;
CREATE TABLE public.tbl_parent (
a text,
b int,
c timestamp,
d text,
ds text,
PRIMARY KEY (ds, b)
)
PARTITION BY LIST(ds);
CALL set_table_property('public.tbl_parent', 'orientation', 'column');
CREATE TABLE public.tbl_child_1 PARTITION OF public.tbl_parent FOR VALUES IN ('20221207');
CREATE TABLE public.tbl_child_2 PARTITION OF public.tbl_parent FOR VALUES IN ('20221208');
COMMIT;
What's next
-
For guidance on setting table properties based on your query scenario, see Scenario-based guide for table creation and optimization.
-
For best practices in key-value (KV) query scenarios, see Best practices for Key/Value query scenarios.
-
For DDL statements for Hologres internal tables, see CREATE TABLE. Related statements include CREATE TABLE AS, CREATE TABLE LIKE, ALTER TABLE, and DROP TABLE.