A Paimon catalog is an external catalog supported by StarRocks since version 3.1. Use it to query Apache Paimon data directly, create Paimon databases and tables, and write data back to Paimon—all from within StarRocks.
Creating Paimon databases and tables, and inserting data into Paimon tables, require StarRocks 3.2.9 or later.
A Paimon catalog in StarRocks maps one-to-one with a catalog in the native Paimon API. The configuration parameters and their meanings are identical.
Prerequisites
Before you begin, make sure your StarRocks cluster can reach both the storage system and the metadata service of your Paimon cluster. Supported combinations:
Metadata service | Storage backends |
Hive Metastore (HMS) | Hadoop Distributed File System (HDFS), Object Storage Service (OSS), OSS-HDFS |
File system | HDFS, OSS, OSS-HDFS |
Data Lake Formation (DLF) 1.0 (Legacy) | HDFS, OSS, OSS-HDFS |
Data Lake Formation (DLF) | HDFS, OSS, OSS-HDFS |
If you use DLF, configure a Resource Access Management (RAM) user and use it for all operations in StarRocks Manager. For details, see Use aDLFcatalog.
Create a Paimon catalog
Syntax
CREATE EXTERNAL CATALOG <catalog_name>
[COMMENT <comment>]
PROPERTIES
(
"type" = "paimon",
CatalogParams,
StorageCredentialParams
);Parameters
Common parameters
Parameter | Required | Description |
| Yes | The name of the Paimon catalog. Must start with a letter and contain only letters (a–z, A–Z), digits (0–9), or underscores (\_). Maximum 64 characters. |
| No | A description of the Paimon catalog. |
| Yes | The data source type. Set to |
CatalogParams
CatalogParams specifies how StarRocks connects to the Paimon metadata service. Configuration varies by metadata type.
Use HMS
Property | Required | Description |
| Yes | Set to |
| Yes | The HMS URI. Format: |
Use filesystem
Property | Required | Description |
| Yes | Set to |
| Yes | The storage path of the warehouse where Paimon data is stored. Supported formats: |
Use DLF 1.0 (Legacy)
Property | Required | Description |
| Yes | Set to |
| Yes | The storage path of the warehouse. Format: |
| No | The ID of an existing data catalog in DLF. If omitted, the system uses the default DLF catalog. |
Use DLF
If you use DLF, use a configured RAM user to perform operations in StarRocks Manager. For details, see Use a DLF catalog.
Property | Required | Description |
| Yes | Set to |
| Yes | Set to |
| Yes | The REST API address of DLF. Format: |
| Yes | The name of the Paimon catalog. Find it on the Catalogs page in the Data Lake Formation console. |
| Yes | The REST service provider. Set to |
StorageCredentialParams
StorageCredentialParams specifies how StarRocks accesses the Paimon file storage. This parameter group is only required when using OSS or OSS-HDFS as storage. Skip it if you use HDFS.
"aliyun.oss.endpoint" = "<YourAliyunOSSEndpoint>"Property | Description |
| The endpoint of your OSS or OSS-HDFS storage. Find it on the Overview page of your bucket, under the Port section. See OSS regions and endpoints for region-specific values. Examples: |
After setting aliyun.oss.endpoint, go to the Parameter Configuration page in the EMR Serverless StarRocks console and update fs.oss.endpoint in core-site.xml and jindosdk.cfg to the same value.
Examples
Use HMS
CREATE EXTERNAL CATALOG paimon_catalog
PROPERTIES
(
"type" = "paimon",
"paimon.catalog.type" = "hive",
"hive.metastore.uris" = "thrift://xx.xx.xx.xx:9083"
);Use filesystem
CREATE EXTERNAL CATALOG paimon_catalog
PROPERTIES
(
"type" = "paimon",
"paimon.catalog.type" = "filesystem",
"paimon.catalog.warehouse" = "oss://<yourBucketName>/<yourPath>",
"aliyun.oss.endpoint" = "cn-hangzhou.oss-dls.aliyuncs.com"
);Use DLF 1.0 (Legacy)
CREATE EXTERNAL CATALOG paimon_catalog
PROPERTIES
(
"type" = "paimon",
"paimon.catalog.type" = "dlf",
"paimon.catalog.warehouse" = "oss://<yourBucketName>/<yourPath>",
"dlf.catalog.id" = "paimon_dlf_test"
);Use DLF
CREATE EXTERNAL CATALOG dlf_catalog
PROPERTIES
(
"type" = "paimon",
"uri" = "http://cn-hangzhou-vpc.dlf.aliyuncs.com",
"paimon.catalog.type" = "rest",
"paimon.catalog.warehouse" = "test_paimon",
"token.provider" = "dlf"
);View Paimon catalogs
List all catalogs in the current StarRocks cluster:
SHOW CATALOGS;View the creation statement of a specific catalog:
SHOW CREATE CATALOG paimon_catalog;Delete a Paimon catalog
DROP CATALOG paimon_catalog;Create a Paimon database
CREATE DATABASE IF NOT EXISTS <catalog_name>.<database_name>;Create a Paimon table
Syntax
CREATE TABLE IF NOT EXISTS <catalog_name>.<database_name>.<table_name>
(
id STRING,
name STRING,
day INT
)
PRIMARY KEY (id, day) -- Omit for append-only tables
PARTITION BY (`day`) -- Omit for non-partitioned tables
PROPERTIES ( -- Omit if no properties are needed
key = value
);PROPERTIES parameters
Key | Description |
| The data storage format. Default: |
| The number of buckets for data distribution and query optimization. Default: |
| The column used to distribute records into buckets. |
Limitations
For partitioned tables, all partition key columns must appear at the end of the column definition.
For partitioned tables, all partition key columns must be included in the primary key definition.
bucket_keycannot be part of a partition key or primary key.
Example
CREATE TABLE dlf_catalog.sr_dlf_db.ads_age_pvalue_analytics(
final_gender_code STRING COMMENT 'gender',
age_level STRING COMMENT 'age_level',
pvalue_level STRING COMMENT 'consumption_level',
clicks INT COMMENT 'clicks',
total_behaviors INT COMMENT 'total_behaviors'
);View a Paimon table schema
View the table schema:
DESC[RIBE] <catalog_name>.<database_name>.<table_name>;View the schema and file storage location from the CREATE TABLE statement:
SHOW CREATE TABLE <catalog_name>.<database_name>.<table_name>;Query data in a Paimon table
Switch to the target catalog and database, then run a SELECT statement:
-- Step 1: List databases in the catalog
SHOW DATABASES FROM <catalog_name>;
-- Step 2: Switch to the catalog and database
SET CATALOG <catalog_name>;
USE <db_name>;
-- Or switch directly to a specific database within the catalog
USE <catalog_name>.<db_name>;
-- Step 3: Query the table
SELECT count(*) FROM <table_name> LIMIT 10;Alternatively, specify the full path in the SELECT statement without switching context:
SELECT * FROM <catalog_name>.<database_name>.<table_name>;Write data to Paimon
Usage notes
StarRocks writes to Paimon tables in batches (Batch Write). Due to Paimon's limitations, you cannot write to tables whose bucket mode is HASH_DYNAMIC or CROSS_PARTITION. This includes primary key tables where the bucket property is not set—the default value is -1 (Dynamic Bucket Mode), which prevents write operations.
The Paimon software development kit (SDK) requires that partition key or primary key columns in the data being written are not null. StarRocks does not precheck data before writing, so an exception is thrown if a null value is encountered.
Write to a Paimon table
INSERT INTO <catalog_name>.<database_name>.<table_name> (column1, column2, ...) VALUES (value1, value2, ...);Example—insert rows into ads_age_pvalue_analytics:
INSERT INTO dlf_catalog.sr_dlf_db.ads_age_pvalue_analytics (final_gender_code, age_level, pvalue_level, clicks, total_behaviors)
VALUES
('M', '18-24', 'Low', 1500, 2500),
('F', '25-34', 'Medium', 2200, 3300),
('M', '35-44', 'High', 2800, 4000);Import Paimon data into an internal table
Transform and load data from a Paimon table into a StarRocks OLAP table:
INSERT INTO default_catalog.olap_db.olap_tbl SELECT * FROM <paimon_catalog>.<db_name>.<table_name>;What's next
For more information about Paimon, see Paimon overview.
For more information about using DLF 2.0, see Use a DLF catalog.